Inspiration

Multi-agent systems such as robot swarms and self-driving cars struggle to coordinate when agents cannot share information. We wanted to test whether learned reward functions could solve coordination problems without centralized control.

What it does

DMARL-RSA lets each agent learn its own personalized reward function to guide its behavior. We tested whether this decentralized approach could match traditional methods that use global information during training.

How we built it

We implemented three training methods in PyTorch and evaluated them on cooperative navigation tasks. Agents trained for 5,000 episodes to learn to cover landmarks while avoiding collisions. Each decentralized agent had its own reward learning network to discover which actions work best.

Challenges we ran into

Decentralized agents performed far worse than expected, scoring 26 points lower than centralized training. We also found a paradox: agents were good at the local task (covering landmarks) but poor at team performance. Training was unstable because all agents were learning simultaneously, so each agent's environment kept shifting as the others updated their policies.

Accomplishments that we're proud of

We showed empirically that decentralized reward learning runs into limits in team coordination. We found strong evidence that sophisticated learning alone cannot fix coordination problems. The results were statistically significant and consistent across all experiments.

What we learned

You can't outsmart the need for coordination: clever reward design does not replace information sharing. Local success does not equal team success in multi-agent systems. Sometimes good teamwork requires centralized training, and there is no shortcut around it.

What's next for Research Paper

Test hybrid approaches that combine some centralized coordination with decentralized execution. Try more complex environments such as StarCraft battles and robot coordination. Submit to top AI conferences (NeurIPS/ICML). Explore when decentralization works and when it fails.
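The gap between local success and team success described under "Challenges we ran into" can be illustrated with a small sketch. It is not the project's code: the writeup does not give the reward definitions, so the functions below assume the common cooperative-navigation formulation (each landmark is scored by its distance to the closest agent, and each colliding pair of agents costs a fixed penalty). The names team_reward and local_reward, the collision radius, and the penalty value are all hypothetical.

```python
import math
from itertools import combinations


def dist(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def local_reward(agent, landmarks):
    """Assumed local signal: negative distance to the nearest landmark."""
    return -min(dist(agent, lm) for lm in landmarks)


def team_reward(agents, landmarks, collision_radius=0.1, collision_penalty=1.0):
    """Assumed shared signal: landmark coverage minus collision penalties."""
    # Each landmark is scored by the distance to its closest agent.
    coverage = -sum(min(dist(a, lm) for a in agents) for lm in landmarks)
    # Each pair of agents closer than collision_radius counts as one collision.
    collisions = sum(
        1 for a, b in combinations(agents, 2) if dist(a, b) < collision_radius
    )
    return coverage - collision_penalty * collisions


landmarks = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

# Every agent sits on the same landmark: locally perfect, globally poor.
crowded = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
# One agent per landmark: locally perfect and globally perfect.
spread = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

for name, agents in [("crowded", crowded), ("spread", spread)]:
    local_total = sum(local_reward(a, landmarks) for a in agents)
    print(name, "local total:", local_total, "team:", team_reward(agents, landmarks))
```

Under these assumptions both configurations give every agent the best possible local reward, yet only the spread configuration scores well as a team. An agent that sees only its own local signal has no way to tell the two apart, which is one way the local-versus-team gap can arise.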
