We wanted to work on a cool reinforcement learning project and get some practical experience with the new TensorFlow Eager Execution environment that was released just recently.
What it does
It's an AI bot trained with reinforcement learning to interact with its environment from raw visual information (it uses only a CNN to read the state of the environment from the screen buffer). We mainly worked with the "Take Cover" environment:
The purpose of this scenario is to teach the agent to link incoming missiles with its estimated lifespan. The agent should learn that being hit means a health decrease, and that this in turn will lead to death, which is undesirable. In effect, the agent should avoid missiles.
REWARDS: +1 for each tic of life
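Since the reward is +1 per tic, the return of an episode is simply how long the agent stayed alive. Here is a minimal sketch of that arithmetic. The function name `episode_return` and the discount factor `gamma` are our own illustrative assumptions and are not part of the scenario definition.

```python
def episode_return(tics_survived, gamma=1.0):
    """Sum of rewards for an episode that pays +1 on every tic survived.

    gamma is a hypothetical discount factor; with gamma=1.0 the return
    equals the number of tics the agent stayed alive.
    """
    total = 0.0
    discount = 1.0
    for _ in range(tics_survived):
        total += discount   # reward of +1 for this tic, discounted
        discount *= gamma   # later tics count for less when gamma < 1
    return total


if __name__ == "__main__":
    print(episode_return(100))       # undiscounted return
    print(episode_return(100, 0.99)) # discounted return
```

With a discount below 1, surviving longer still always increases the return, so dodging missiles remains the only way to score higher.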
How we built it
We started by trying to reproduce the results of the following Jupyter Notebook. We then tested different environments with different rules and reward functions. Finally, we rewrote the deep learning model and training code using Eager Execution, which made the code easier to understand and faster to run.
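To give a feel for what eager-style training code looks like, here is a rough sketch of a single deep Q-learning update. It is not our actual model: it assumes TensorFlow 2.x (where eager execution is on by default), and the layer sizes, `FRAME_SHAPE`, `NUM_ACTIONS`, `GAMMA`, and the function names are all made up for illustration.

```python
import tensorflow as tf

# Hypothetical sizes, chosen only for this illustration.
NUM_ACTIONS = 3
FRAME_SHAPE = (30, 45, 1)  # small grayscale screen buffer
GAMMA = 0.99


def build_q_network():
    """Small CNN mapping a screen frame to one Q-value per action."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=FRAME_SHAPE),
        tf.keras.layers.Conv2D(8, 6, strides=3, activation="relu"),
        tf.keras.layers.Conv2D(8, 3, strides=2, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(NUM_ACTIONS),
    ])


def train_step(q_net, optimizer, states, actions, rewards, next_states, dones):
    """One Q-learning update, run eagerly like ordinary Python code."""
    # Bootstrapped target; computed outside the tape so no gradient flows into it.
    next_q = tf.reduce_max(q_net(next_states), axis=1)
    targets = rewards + GAMMA * next_q * (1.0 - dones)
    with tf.GradientTape() as tape:
        q_values = q_net(states)
        # Pick the Q-value of the action that was actually taken.
        chosen_q = tf.reduce_sum(q_values * tf.one_hot(actions, NUM_ACTIONS), axis=1)
        loss = tf.reduce_mean(tf.square(targets - chosen_q))
    grads = tape.gradient(loss, q_net.trainable_variables)
    optimizer.apply_gradients(zip(grads, q_net.trainable_variables))
    return loss
```

The nice part of this style is that intermediate tensors such as `q_values` can be printed or inspected with a debugger at any point, with no session or graph to build first.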
Challenges we ran into
There were quite a few challenges along the way:
- We struggled for over 6 hours to set everything up before we could dive into the cool reinforcement learning stuff: we had to get a cloud instance with the latest TensorFlow release, compatible CUDA drivers, and a good GPU (RL is very compute-intensive), build ViZDoom on it, and find the best way to set up a remote display to the server so that we could observe the training process.
- Using Eager Execution was confusing at times, since it is still a relatively new addition to TensorFlow.
- ViZDoom and the actual Doom game (which we used for testing environments) are slightly different, so the agent would sometimes behave in a totally different way from what we expected.
What we learned
- TensorFlow Eager Execution
- The basics of reinforcement learning; deep Q-learning
- Action Code Script (Doom scripting language)
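
For reference, the core idea of deep Q-learning as we understand it can be written in two lines. This is the standard textbook form, not something specific to our code; the symbols ($\theta$ for the network weights, $\gamma$ for the discount factor) are the usual conventions.

```latex
% Target for a transition (s, a, r, s'); for a terminal s' the target is just r.
y = r + \gamma \max_{a'} Q(s', a'; \theta)

% The network is trained to minimize the squared error to that target.
L(\theta) = \bigl( y - Q(s, a; \theta) \bigr)^2
```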