Reinforcement learning for multi-agent systems
Reinforcement learning (RL) is a data-driven approach to determining optimal policies in the presence of unknown stochastic dynamics. RL has recently seen a resurgence in optimal control and decision making for dynamical systems, based on adaptive dynamic programming, Q-learning, and actor-critic methods. However, when applied to a multi-agent system (MAS), RL faces two challenges: the curse of dimensionality and low learning efficiency. To address these challenges, we have investigated strategies that learn coordination policies effectively and efficiently by exploiting structures in a MAS, including
a) time-scale separation in clustered networks, such as power networks,
b) a hierarchical structure driven by global and local reward functions, such as the multi-drone multi-target tracking application below (click the picture to open the YouTube video).
Currently, we are investigating a graph-based multi-agent reinforcement learning (MARL) problem that specifies topological connections between the agents. Specifically, a state graph, an observation graph, and a reward graph characterize the coupling between the agent dynamics, the constraints on the agents’ observations, and the dependency of each agent’s reward on the other agents, respectively. We exploit these graph structures to decompose the learning process without approximation, and find that the variance of the policy gradient estimates can be greatly reduced, leading to faster convergence, better sample complexity, and better scalability. The figure below compares our algorithm “MAStAC” (multi-agent structured actor-critic) with baseline algorithms on a 40-zone temperature control problem.
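To make the role of the three graphs concrete, the following is a minimal sketch, not the MAStAC algorithm itself. It assumes a simplified influence rule: an agent's action can reach another agent's state either through the state graph or through the observation graph (an observer's action depends on what it observes), and it then affects every reward that depends on a reached agent. The function names (`reachable`, `influenced_rewards`, `local_return`) and the dictionary encoding of the graphs are hypothetical and chosen only for illustration.

```python
from collections import deque


def reachable(adj, start):
    """Return the set of nodes reachable from `start` (itself included) by BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def influenced_rewards(state_graph, obs_graph, reward_graph, agent):
    """Agents whose rewards `agent` can affect under the assumed influence rule.

    state_graph[j]  : agents whose dynamics depend directly on agent j
    obs_graph[j]    : agents that observe agent j's state
    reward_graph[j] : agents whose rewards depend directly on agent j
    """
    # Assumption: influence spreads along state edges and observation edges.
    nodes = set(state_graph) | set(obs_graph)
    spread = {
        j: set(state_graph.get(j, ())) | set(obs_graph.get(j, ()))
        for j in nodes
    }
    affected = reachable(spread, agent)
    result = set()
    for j in affected:
        result.update(reward_graph.get(j, ()))
    return result


def local_return(rewards, members, gamma):
    """Discounted return that sums only the rewards of agents in `members`.

    rewards[t][k] is agent k's reward at time step t.
    """
    return sum(
        (gamma ** t) * sum(step[k] for k in members)
        for t, step in enumerate(rewards)
    )


# Hypothetical example: four agents forming two decoupled pairs {0, 1} and {2, 3}.
state_graph = {0: [1], 1: [0], 2: [3], 3: [2]}
obs_graph = {0: [0], 1: [1], 2: [2], 3: [3]}      # each agent observes only itself
reward_graph = {0: [0], 1: [1], 2: [2], 3: [3]}   # each reward depends on its own agent

members_of_0 = influenced_rewards(state_graph, obs_graph, reward_graph, 0)
```

In this example agent 0 can only influence the rewards of agents 0 and 1, so a return built from `members_of_0` leaves out the rewards of agents 2 and 3. Those excluded rewards do not depend on agent 0's actions, which gives some intuition for why restricting each agent's value function to the rewards it can influence may lower the variance of its policy gradient estimate.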

Relevant Publications
2024
- Distributed Multi-Agent Reinforcement Learning Based on Graph-Induced Local Value Functions. IEEE Transactions on Automatic Control, 2024.
- Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent. IEEE Transactions on Automatic Control, 2024.
2021
- Model-Free Optimal Control of Linear Multiagent Systems via Decomposition and Hierarchical Approximation. IEEE Transactions on Control of Network Systems, 2021.
- Scalable Designs for Reinforcement Learning-Based Wide-Area Damping Control. IEEE Transactions on Smart Grid, 2021.
- Learning Distributed Stabilizing Controllers for Multi-Agent Systems. IEEE Control Systems Letters, 2021.