Multi-agent reinforcement learning (MARL) considers the problem of multiple agents acting within a shared environment. Multi-agent systems can improve robustness to the failure of a single agent, and many problems lend themselves naturally to a multi-agent formulation.
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
The repositories below implement state-of-the-art networked MARL (NMARL) algorithms for networked system control, in which each agent's observability and communication are limited to its neighborhood. For a fair comparison, all algorithms are applied to A2C agents, which fall into two groups: IA2C contains non-communicative policies that use neighborhood information only, whereas MA2C contains communicative policies with certain communication protocols.
https://github.com/cts198859/deeprl_network
https://github.com/instadeepai/EGTA-NMARL
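To make the neighborhood restriction concrete, the following is a minimal sketch of how a per-agent input could be assembled in the two groups described above. It is not taken from either repository: the line-shaped network, the function names `local_input` and `communicative_input`, and the idea of a per-agent "message" vector are all assumptions chosen for illustration.

```python
from typing import Dict, List

# Hypothetical 4-agent line network: 0 - 1 - 2 - 3 (assumption, not from the repos)
NEIGHBORS: Dict[int, List[int]] = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}


def local_input(agent: int,
                states: Dict[int, List[float]],
                neighbors: Dict[int, List[int]] = NEIGHBORS) -> List[float]:
    """Non-communicative input: the agent's own state followed by the
    states of its direct neighbours (nothing from further away)."""
    obs = list(states[agent])
    for j in sorted(neighbors[agent]):
        obs.extend(states[j])
    return obs


def communicative_input(agent: int,
                        states: Dict[int, List[float]],
                        messages: Dict[int, List[float]],
                        neighbors: Dict[int, List[int]] = NEIGHBORS) -> List[float]:
    """Communicative input: the neighbourhood states plus one message
    vector received from each direct neighbour."""
    obs = local_input(agent, states, neighbors)
    for j in sorted(neighbors[agent]):
        obs.extend(messages[j])
    return obs


states = {0: [0.1], 1: [0.2], 2: [0.3], 3: [0.4]}
messages = {0: [1.0], 1: [2.0], 2: [3.0], 3: [4.0]}

print(local_input(1, states))                    # [0.2, 0.1, 0.3]
print(communicative_input(1, states, messages))  # [0.2, 0.1, 0.3, 1.0, 3.0]
```

In this sketch agent 1 never sees the state of agent 3, which is the sense in which observability is limited to the neighborhood. What the messages actually contain depends on the communication protocol and is left abstract here.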
This project investigates whether backdooring a small fraction of the agents (ideally only one) is enough to affect the decision-making of the whole MARL system.
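As a toy illustration of the threat model, the sketch below wraps one agent's policy so that it behaves normally until a trigger appears in its observation. Everything here is hypothetical: `clean_policy` is a stand-in for a trained policy, and the trigger value, target action, and helper names are assumptions made for the example, not a description of how a real backdoor would be implanted or of any result.

```python
from typing import Callable, List, Sequence

Policy = Callable[[Sequence[float]], int]

TRIGGER = -1.0      # hypothetical sentinel value placed in the last observation slot
TARGET_ACTION = 2   # hypothetical action the attacker wants to force


def clean_policy(obs: Sequence[float]) -> int:
    """Stand-in for a trained policy: pick the index of the largest entry."""
    return max(range(len(obs)), key=lambda i: obs[i])


def make_backdoored(policy: Policy) -> Policy:
    """Return a policy that matches `policy` unless the trigger is present."""
    def backdoored(obs: Sequence[float]) -> int:
        if obs[-1] == TRIGGER:
            return TARGET_ACTION
        return policy(obs)
    return backdoored


def joint_action(policies: List[Policy],
                 observations: List[Sequence[float]]) -> List[int]:
    """Each agent acts on its own observation."""
    return [pi(o) for pi, o in zip(policies, observations)]


# Three agents, only the last one is backdoored.
policies = [clean_policy, clean_policy, make_backdoored(clean_policy)]

benign = [[0.5, 0.1, 0.3]] * 3
triggered = [[0.5, 0.1, TRIGGER]] * 3

print(joint_action(policies, benign))     # [0, 0, 0]
print(joint_action(policies, triggered))  # [0, 0, 2]
```

The sketch only shows that a single compromised agent can deviate on a trigger while remaining indistinguishable on benign inputs. Whether that deviation propagates to the other agents, for example through shared rewards or communicated messages, is the open question the project poses.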