Army Researchers Advance Drone Swarm Learning Capabilities

U.S. Army researchers have developed a reinforcement learning approach that will allow swarms of unmanned aerial and ground vehicles to perform more consistently when executing mission objectives.

Reinforcement learning provides a way to control uncertain agents to achieve multi-objective goals when a precise model of the agent is unavailable. However, existing reinforcement learning methods can only be applied in a centralized manner, which requires pooling the state information of the entire swarm at a central learner, said Jemin George of the U.S. Army Combat Capabilities Development Command’s Army Research Lab. That pooling drastically increases computational complexity and communication requirements, resulting in unreasonable learning time.

A small unmanned Clearpath Husky robot, which was used by Army Research Lab (ARL) researchers to develop a new technique to quickly teach robots novel traversal behaviors with minimal human oversight. Courtesy of U.S. Army.

To solve this, the researchers collaborated with Aranya Chakrabortty from North Carolina State University and He Bai of Oklahoma State University. The goal of the collaboration was to develop a theoretical foundation for data-driven control for large-scale swarm networks, where control actions are taken based on low-dimensional measurement data instead of dynamic models.

The result is an approach called hierarchical reinforcement learning (HRL), which decomposes the global control objective into multiple hierarchies: several small group-level microscopic controls and one broad swarm-level macroscopic control.

“Each hierarchy has its own learning loop with respective local and global reward functions,” George said. “We were able to significantly reduce the learning time by running these learning loops in parallel.”

According to George, online reinforcement learning control of a swarm boils down to solving a large-scale algebraic matrix Riccati equation using input-output data from the system, in this case the swarm.
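
For readers unfamiliar with the term, the following is a minimal sketch of what solving an algebraic Riccati equation involves, reduced to a single scalar state. It is not the Army team's algorithm: it assumes the model coefficients `a` and `b` are known, whereas the approach described here works from input-output data. All function names and numbers are illustrative assumptions. The iteration shown is the classical policy iteration for linear-quadratic control (Kleinman's method), which alternates between evaluating a feedback gain and improving it.

```python
import math


def solve_scalar_riccati(a, b, q, r, k0, iterations=20):
    """Policy iteration for the scalar system dx/dt = a*x + b*u.

    Cost: integral of q*x^2 + r*u^2, with feedback u = -k*x.
    The scalar algebraic Riccati equation is 2*a*p - (b*p)**2 / r + q = 0.
    k0 must be stabilizing, i.e. a - b*k0 < 0.
    """
    k = k0
    p = None
    for _ in range(iterations):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("gain is not stabilizing")
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k**2 = 0 for p.
        p = (q + r * k * k) / (-2.0 * closed_loop)
        # Policy improvement: update the gain from the new cost value.
        k = b * p / r
    return p, k


def closed_form_riccati(a, b, q, r):
    """Positive root of the scalar Riccati equation, used as a check."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


# Hypothetical unstable agent (a > 0) with unit cost weights.
p, k = solve_scalar_riccati(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
print(p, closed_form_riccati(1.0, 1.0, 1.0, 1.0))
```

In the scalar case the two printed values should agree closely. For a swarm, `p` becomes a matrix whose size grows with the number of agents, which is why the large-scale version is expensive to solve centrally.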

The initial approach to solving the large-scale matrix Riccati equation was to divide the swarm into multiple smaller groups and implement group-level reinforcement learning in parallel while executing a global reinforcement learning on a smaller-dimensional compressed state from each group.
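
The structure of that initial approach can be sketched as follows. This is only an outline under stated assumptions: the "learning" functions are placeholders, the compression of a group to its average is a guess at what a smaller-dimensional compressed state might look like, and every name and number is hypothetical. The point is the shape of the computation: independent group-level loops run in parallel, and the global loop sees one value per group rather than every agent.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def split_into_groups(agent_states, group_size):
    """Partition a flat list of agent states into consecutive groups."""
    return [agent_states[i:i + group_size]
            for i in range(0, len(agent_states), group_size)]


def local_step(group):
    """Placeholder for one group-level learning loop (assumption).

    Returns each agent's deviation from its group average, standing in
    for whatever the group-level learner would compute.
    """
    center = mean(group)
    return [state - center for state in group]


def compress(group):
    """Reduce a group's state to one number: its average (assumption)."""
    return mean(group)


def global_step(compressed_states):
    """Placeholder for the swarm-level loop on the compressed state."""
    swarm_center = mean(compressed_states)
    return [c - swarm_center for c in compressed_states]


# Hypothetical scalar states for a six-agent swarm.
states = [0.0, 2.0, 4.0, 10.0, 12.0, 14.0]
groups = split_into_groups(states, group_size=3)

# Group-level loops are independent, so they can run in parallel.
with ThreadPoolExecutor() as pool:
    local_results = list(pool.map(local_step, groups))

# The global loop sees one compressed value per group, not every agent.
global_result = global_step([compress(g) for g in groups])
print(local_results, global_result)
```

Threads are used here only to show that the group-level loops do not depend on each other; a real deployment would presumably run each loop on separate hardware.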

The current HRL scheme uses a decoupling mechanism that allows the team to hierarchically approximate a solution to the large-scale matrix equation. It first solves the local reinforcement learning problems and then synthesizes the global control from the local controllers by solving a least squares problem, instead of running a global reinforcement learning on the aggregated state. This further reduces learning time.

Army researchers envision a hierarchical control for ground vehicle and air vehicle coordination. Courtesy of U.S. Army.

Experiments have shown that compared to a centralized approach, HRL was able to reduce the learning time by 80% while limiting the optimality loss to 5%.
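
One plausible reading of those two figures is sketched below. The article does not define either metric, so both formulas are assumptions: learning-time reduction as the fraction of centralized learning time saved, and optimality loss as the relative increase in control cost over the centralized optimum. The input numbers are hypothetical and chosen only to reproduce the reported percentages.

```python
def learning_time_reduction(t_centralized, t_hierarchical):
    """Fraction of centralized learning time saved (assumed definition)."""
    return 1.0 - t_hierarchical / t_centralized


def optimality_loss(cost_optimal, cost_hierarchical):
    """Relative cost increase over the optimum (assumed definition)."""
    return (cost_hierarchical - cost_optimal) / cost_optimal


# Hypothetical measurements, not taken from the experiments.
reduction = learning_time_reduction(t_centralized=100.0, t_hierarchical=20.0)
loss = optimality_loss(cost_optimal=200.0, cost_hierarchical=210.0)
print(f"learning time reduced by {reduction:.0%}, optimality loss {loss:.0%}")
```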

“Our current HRL efforts will allow us to develop control policies for swarms of unmanned aerial and ground vehicles so that they can optimally accomplish different mission sets even though the individual dynamics for the swarming agents are unknown,” George said.

The team is working to further develop its HRL control scheme by considering optimal grouping of agents in the swarm to minimize computation and communication complexity while limiting the optimality gap.

The researchers are also investigating the use of deep recurrent neural networks to learn and predict the best grouping patterns. In addition, they are investigating how the developed techniques can be applied to the optimal coordination of autonomous air and ground vehicles in multi-domain operations in dense urban terrain.

Published: August 2020

Glossary

neural network: A computing paradigm that attempts to process information in a manner similar to that of the brain; it differs from artificial intelligence in that it relies not on pre-programming but on the acquisition and evolution of interconnections between nodes. These computational models have shown extensive usage in applications that involve pattern recognition as well as machine learning as the interconnections between nodes continue to compute updated values from previous inputs.
