Model backdoors in Multi-Agent RL

Multi-agent reinforcement learning (MARL) considers the problem of multiple agents acting within a shared environment. Multi-agent systems can be more robust to the failure of a single agent, and many problems lend themselves naturally to a multi-agent formulation.

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

The repositories below implement state-of-the-art networked MARL (NMARL) algorithms for networked system control, in which each agent's observability and communication are limited to its neighborhood. For a fair comparison, all algorithms are applied to A2C agents and fall into two groups: IA2C covers non-communicative policies that use neighborhood information only, whereas MA2C covers communicative policies with certain communication protocols.

https://github.com/cts198859/deeprl_network

https://github.com/instadeepai/EGTA-NMARL

This project investigates whether backdooring a small fraction of the agents (ideally only one) is enough to affect the decision making of the whole MARL system.
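
As a rough illustration of the threat model, the sketch below wraps one agent's policy so that it behaves normally until a trigger pattern appears in its observation. This is a behavioral stand-in only: a real backdoor would normally be implanted in the network weights during training, and `clean_policy`, `make_backdoored`, `joint_action`, the trigger values, and the target action are all hypothetical names and choices introduced here.

```python
from typing import Callable, Dict, Sequence

Policy = Callable[[Sequence[float]], int]


def clean_policy(obs: Sequence[float]) -> int:
    """Stand-in for a trained agent: pick the index of the largest feature."""
    return max(range(len(obs)), key=lambda i: obs[i])


def make_backdoored(policy: Policy,
                    trigger: Sequence[float],
                    target_action: int,
                    tol: float = 1e-6) -> Policy:
    """Return a policy that outputs target_action whenever the first
    len(trigger) observation features match the trigger, and otherwise
    defers to the wrapped policy."""
    def backdoored(obs: Sequence[float]) -> int:
        prefix = list(obs[:len(trigger)])
        if len(prefix) == len(trigger) and all(
            abs(a - b) <= tol for a, b in zip(prefix, trigger)
        ):
            return target_action
        return policy(obs)
    return backdoored


def joint_action(policies: Dict[int, Policy],
                 observations: Dict[int, Sequence[float]]) -> Dict[int, int]:
    """Each agent acts on its own observation."""
    return {i: policies[i](observations[i]) for i in policies}


if __name__ == "__main__":
    # Only agent 2 is compromised (assumed trigger and target action).
    policies = {0: clean_policy, 1: clean_policy,
                2: make_backdoored(clean_policy, [9.0, 9.0], 0)}
    benign = {i: [0.1, 0.9, 0.3] for i in policies}
    triggered = dict(benign)
    triggered[2] = [9.0, 9.0, 0.3, 10.0]
    print(joint_action(policies, benign))
    print(joint_action(policies, triggered))
```

In a single step only the compromised agent's action changes. Whether that change spreads to the other agents through the environment dynamics or through communication is the question the project poses.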