Inspiration
"The Chosen One" - Animator vs Animation is an original YouTube video that went viral for animating a stickman that turned "intelligent". With today's level of multi-agent reinforcement learning such as OpenAI's multi-agent Hide and Seek, I believed that current technology is sufficient to develop intelligent behaviour.
Introduction Video - Animator vs Animation by Alan Becker
https://www.youtube.com/watch?v=npTC6b5-yvM&t=60s
What it does
A game in which the player challenges the Chosen One. The player aims a gun and shoots at the Chosen One. For now, the Chosen One can only dodge bullets.
Demonstration
Agent learnt to camp at a corner:
Agent learnt to dodge Gun bullets:
Method
- Trained over 1K epochs of 500 timesteps.
- Chosen One
  - Model: 5 hidden layers, 100 hidden channels
  - Reward function: +1 if surviving, -50 if hit
  - Discount factor (γ): 0.995
  - Exploration policy: Epsilon-greedy (ε = 0.3)
  - Experience replay: Buffer (size = 10^6)
- Game Environment
  - State dimension: (25,)
    - Agent: jumps, xpos, ypos, touchingObst, gravityCurrent
    - 5 Entities: x, y, speed, angle
  - Action space: (3,)
    - Agent: left, right, jump
    - Generator: weapon_type, x, y, angle
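The settings listed above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names, the dictionary layout of the agent and entities, and the zero-padding of unused entity slots are all assumptions. Only the numbers (25 state values, 3 agent actions, the +1 / -50 reward, the 0.995 discount, the 0.3 exploration rate, the 10^6 buffer) come from the list.

```python
import random
from collections import deque

GAMMA = 0.995          # discount factor from the Method list
EPSILON = 0.3          # epsilon-greedy exploration rate from the Method list
BUFFER_SIZE = 10**6    # replay buffer capacity from the Method list
N_ENTITIES = 5         # entity slots in the state
ACTIONS = ("left", "right", "jump")  # agent action space, shape (3,)


def build_state(agent, entities):
    """Flatten 5 agent features and 5 x 4 entity features into 25 floats."""
    state = [
        agent["jumps"], agent["xpos"], agent["ypos"],
        agent["touchingObst"], agent["gravityCurrent"],
    ]
    # Assumption: unused entity slots are filled with zeros.
    empty = {"x": 0, "y": 0, "speed": 0, "angle": 0}
    padded = list(entities[:N_ENTITIES])
    padded += [empty] * (N_ENTITIES - len(padded))
    for entity in padded:
        state += [entity["x"], entity["y"], entity["speed"], entity["angle"]]
    return [float(value) for value in state]


def reward(hit):
    """+1 if surviving, -50 if hit."""
    return -50.0 if hit else 1.0


def epsilon_greedy(q_values, epsilon=EPSILON, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def discounted_return(rewards, gamma=GAMMA):
    """Sum of gamma**t * r_t, accumulated from the last step backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical transition store: the oldest entries are dropped once full.
replay_buffer = deque(maxlen=BUFFER_SIZE)
```

With a discount of 0.995, rewards several hundred steps ahead still carry noticeable weight, which fits an episode length of 500 timesteps.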
How I built it
Created a PyGame environment from scratch with three kinds of entities: the Chosen One (agent), the Gun (generator) and Bullets (projectiles). Built and trained a multi-agent reinforcement learning model, framed as a minimax game between the Chosen One (agent) and the Generator (Gun).
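One way to read the minimax framing is as a zero-sum game, where whatever the Chosen One gains the Generator loses. The sketch below assumes exactly that; the write-up does not state the Generator's reward, so `generator_reward` and `step_rewards` are hypothetical names used only for illustration.

```python
def agent_reward(hit):
    """Chosen One's reward as listed under Method: +1 if surviving, -50 if hit."""
    return -50.0 if hit else 1.0


def generator_reward(hit):
    """Assumption: strictly zero-sum, so the Generator receives the negation."""
    return -agent_reward(hit)


def step_rewards(hit):
    """Return (agent, generator) rewards for one timestep."""
    return agent_reward(hit), generator_reward(hit)
```

Under this assumption the Generator is rewarded only for landing hits, so it is pushed toward shots the current Chosen One cannot yet dodge.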
Main contributions
Traditional reinforcement learning often fails to converge because a suitable training curriculum is difficult to plan by hand. Here a generative model plays the Gun, and because the Generator improves progressively, the Chosen One has room to improve alongside it. The result is a progressive autocurriculum that enables smooth learning for both the Chosen One and the Generator.
Accomplishments that I'm proud of
- Created a proof of concept of multi-agent reinforcement learning that enables smooth learning.
- First time using PyGame to build a game.
What's next for The Chosen One - Multi-Agent Reinforcement Learning
- Increasing environment complexity - more weapons for the Generator to choose from and learn, more fine-grained movement mechanics for the Chosen One to master, and possibly attacks made by the Chosen One.
Final Outcome
This project shows that multi-agent reinforcement learning and progressive autocurricula are useful for enabling smooth training of AI in complex environments.
References
- DDQN - Deep Reinforcement Learning with Double Q-learning
- ChainerRL - Deep Reinforcement Learning Library
- PyGame - Python Game Library
Built With
- generative-model
- machine-learning
- multi-agent
- pygame
- python
- reinforcement-learning