Machine learning algorithms promise better situational awareness
Army researchers discovered a way to quickly get information to Soldiers in combat using new machine learning techniques. The algorithms will play a significant role in enhancing how the future military will operate.
Researchers from the U.S. Army’s Combat Capabilities Development Command’s Army Research Laboratory, the Defence Science and Technology Laboratory, IBM Thomas J. Watson Research Center and Pennsylvania State University created a way to train a number of classical machine learning algorithms to operate in constrained environments, particularly those involving coalitions. The approach can be implemented in various devices used by Soldiers.
“Tactical networks often suffer from intermittent and low-bandwidth connections due to their hostile operation environment,” said Dr. Ting He, associate professor at Pennsylvania State University. “In addition, although artificial intelligence techniques have the potential to greatly improve the situational awareness of Soldiers and commanders, to keep them updated about the fast-changing situations, the machine learning models need to be frequently retrained using updated data, which are often distributed across data sources with unreliable or poor connections.”
According to the researchers, this challenge calls for new generations of model training techniques that strike a desirable tradeoff between the quality of the obtained models and the amount of data transfer needed.
Their research tackles this challenge with coresets, a lossy data compression technique designed for machine learning applications. The method filters out and discards needless and redundant data to reduce the amount of data that must be transferred.
“Coreset looks like a smaller version of the original dataset that can be used to train machine learning models with guaranteed approximation to the models trained on the original dataset,” He said. “However, existing coreset construction algorithms are each tailor-made to a targeted machine learning model, and thus multiple coresets need to be generated from the same dataset and transferred to a central location to train multiple models, offsetting the benefit of using coresets for data reduction.”
To address this problem, the researchers studied the robustness of different coreset construction algorithms with respect to the machine learning models they are used to train. The goal was to develop a robust coreset construction algorithm whose output can simultaneously support the training of multiple machine learning models with guaranteed quality.
“Via a careful classification of more than 16 years of research on coresets, we identified three classes of coreset construction algorithms and evaluated the robustness of representative algorithms from each class on real datasets to obtain insights on what works better in a mixed-use setting and why,” He said. “Our study revealed that a clustering-based algorithm has outstanding robustness compared to the other evaluated algorithms in supporting both unsupervised and supervised learning.”
The researchers further established the theoretical condition under which the algorithm is guaranteed to produce a coreset from which near-optimal models can be obtained.
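The idea of a clustering-based coreset can be illustrated with a minimal sketch. It is not the researchers' algorithm: it assumes plain k-means (Lloyd's iterations) as the clustering step, and the names `kmeans_coreset`, `mean_point` and `squared_distance` are hypothetical. Each cluster is replaced by its center, weighted by the number of points it stands for, so a downstream model can train on a handful of weighted points instead of the full dataset.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(members):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(members[0])
    return tuple(sum(m[d] for m in members) / len(members) for d in range(dim))


def kmeans_coreset(points, k, iterations=20, seed=0):
    """Return a list of (center, weight) pairs summarizing `points`.

    Assumption: plain k-means stands in for the clustering step.
    Each weight is the number of original points assigned to that center.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # requires k <= len(points)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            mean_point(members) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
    # Drop empty clusters; they represent no data.
    return [(centers[j], len(members)) for j, members in enumerate(clusters) if members]


# Synthetic two-blob dataset, for illustration only.
rng = random.Random(1)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
data += [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(500)]

coreset = kmeans_coreset(data, k=10)
print("original points:", len(data))
print("coreset points:", len(coreset))
print("total weight:", sum(w for _, w in coreset))
```

Because every center is the mean of the points it replaces, the weights sum to the original dataset size and the weighted mean of the coreset matches the mean of the data. How closely a model trained on the weighted points tracks one trained on the full data depends on the model and on `k`.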
A distributed version of the algorithm was also developed with a very low communication overhead.
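The article does not describe how the distributed version works. As a rough illustration of why summarizing data at the source keeps communication low, the sketch below uses a simple grid-bucketing summary as a stand-in. The functions `summarize_node` and `merge_summaries` and the `cell_size` parameter are hypothetical. Each node sends one running sum and one count per occupied grid cell rather than its raw points, and a central location merges them into weighted representative points.

```python
import random


def summarize_node(points, cell_size):
    """Summarize one node's points as {grid_cell: (coordinate_sums, count)}.

    Assumption: grid bucketing stands in for the local clustering step.
    """
    summary = {}
    for p in points:
        cell = tuple(int(c // cell_size) for c in p)
        total, count = summary.get(cell, ((0.0,) * len(p), 0))
        summary[cell] = (tuple(t + c for t, c in zip(total, p)), count + 1)
    return summary


def merge_summaries(summaries):
    """Merge per-node summaries into a list of (representative, weight) pairs."""
    merged = {}
    for summary in summaries:
        for cell, (total, count) in summary.items():
            old_total, old_count = merged.get(cell, ((0.0,) * len(total), 0))
            merged[cell] = (
                tuple(a + b for a, b in zip(old_total, total)),
                old_count + count,
            )
    # Each representative is the mean of all points that fell into the cell.
    return [
        (tuple(t / count for t in total), count)
        for total, count in merged.values()
    ]


# Three simulated nodes, each holding 1,000 synthetic 2-D points.
rng = random.Random(0)
nodes = [
    [(rng.uniform(0, 4), rng.uniform(0, 4)) for _ in range(1000)]
    for _ in range(3)
]

local = [summarize_node(points, cell_size=1.0) for points in nodes]
weighted_points = merge_summaries(local)

raw_records = sum(len(points) for points in nodes)
sent_records = sum(len(summary) for summary in local)
print("raw points held by nodes:", raw_records)
print("summary records transmitted:", sent_records)
print("merged weighted points:", len(weighted_points))
```

Sums and counts are used instead of means so that summaries from different nodes can be combined exactly. A coarser `cell_size` sends fewer records at the cost of a cruder approximation.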
“Take the neural network as an example,” He said. “Compared to training the neural network on the raw data, training it on a coreset generated by our proposed algorithm can reduce the data transfer by more than 99% at only 8% loss of accuracy.”
According to Dr. Kevin Chan, an electronics engineer at the lab, this research will enhance the performance of machine learning algorithms, particularly in tactical environments where bandwidth is scarce.
“Given advanced techniques to increase the rate at which analytics can be updated, Soldiers will have access to updated and accurate analytics,” Chan said. “This research is crucial to Army Networking Priorities in support of machine learning that enable multi-domain operations, with direct applicability to the Army’s Network Modernization Priority.”
The developed algorithm is straightforward to implement and can be used with various data-capturing devices, especially high-volume, low-entropy devices such as surveillance cameras, to significantly reduce the amount of collected data while ensuring guaranteed near-optimal performance for a broad set of machine learning applications, He said.
This means Soldiers will be able to obtain faster updates and smoother transitions as the situation changes, at competitive accuracy.
“In addition to applications in the military domain, coresets and distributed machine learning in general are also widely applicable in the commercial setting, where multiple organizations would like to jointly learn a model but cannot share all their data,” said Dr. Shiqiang Wang, research staff member at IBM Research and collaborator on this work. “This can be very useful for a wide range of AI-driven applications, such as fraud detection in the banking industry, disease diagnosis leveraging patient data across multiple hospitals and even autonomous driving. These emerging use cases enabled by distributed AI will be essential in our future society.”
As for the next steps for this research, the team is exploring various ways of combining coreset construction with other data reduction techniques to achieve more aggressive data compression at a controllable loss of accuracy.
This article originally appeared on army.mil.com.