We both wanted to learn some machine learning, in particular reinforcement learning, so we created a simulation of an agent running away from a skeleton to protect a briefcase of sensitive data!
What it does
You're in an arena with a skeleton and a safe. (Use your imagination: you're the green blob, the skeleton is red, and the safe is blue.) The aim is to deposit your briefcase into the safe before the skeleton gets you!
How we built it
We implemented reinforcement learning using tabular Q-learning, with 10,000 possible states and 4 actions that control the agent's movement. We trained for 10,000 episodes and tracked the number of successfully deposited briefcases as our metric, so we could watch the model improve. We built a three.js environment to showcase the model after training.
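For anyone curious what tabular Q-learning looks like at this size, here is a minimal sketch rather than our actual training code. The 10,000 states and 4 actions come from the description above; everything else is an assumption made for illustration: the 10x10 grid (so that agent and skeleton positions give 10^4 combinations), the `encode_state` helper, the action ordering, and the `ALPHA`, `GAMMA` and `EPSILON` values.

```python
import numpy as np

N_STATES = 10_000   # from the write-up
N_ACTIONS = 4       # from the write-up (e.g. up, down, left, right; ordering assumed)
GRID = 10           # assumption: 10x10 arena, so 10^4 agent/skeleton position pairs

ALPHA = 0.1         # learning rate (assumed value)
GAMMA = 0.95        # discount factor (assumed value)
EPSILON = 0.1       # exploration rate (assumed value)

rng = np.random.default_rng(0)
q_table = np.zeros((N_STATES, N_ACTIONS))  # one row per state, one column per action


def encode_state(agent_x, agent_y, skel_x, skel_y):
    """Hypothetical helper: flatten two grid positions into an index in [0, N_STATES)."""
    return ((agent_x * GRID + agent_y) * GRID + skel_x) * GRID + skel_y


def choose_action(state):
    """Epsilon-greedy: explore with probability EPSILON, otherwise pick the best known action."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_table[state]))


def update(state, action, reward, next_state, done):
    """One Q-learning step: nudge Q(s, a) toward reward + GAMMA * max_a' Q(s', a')."""
    best_next = 0.0 if done else float(np.max(q_table[next_state]))
    target = reward + GAMMA * best_next
    q_table[state, action] += ALPHA * (target - q_table[state, action])
```

A training loop would call `choose_action`, step the simulation, and then call `update` with the observed reward, once per move, for each of the 10,000 episodes.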
Challenges we ran into
We started out using TensorFlow, but we did not need a complicated neural network with many layers: a Q-table of values would be much faster and would possibly learn more quickly.
Accomplishments that we're proud of
We're proud that we managed to implement the Q-learning algorithm and ended up with an agent that was much more successful than a random agent!
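To give a feel for how a comparison against a random agent can be measured, here is a toy sketch, not our real environment. It assumes a 5x5 arena with no skeleton, a fixed safe position, and a 20-move budget, and it uses a hand-written `toward_safe_policy` as a stand-in for a trained agent. All names and numbers in it are hypothetical.

```python
import random

GRID = 5            # assumed arena size for this toy example
SAFE = (4, 4)       # assumed safe position
MAX_STEPS = 20      # assumed move budget per episode
MOVES = [(0, 1), (0, -1), (-1, 0), (1, 0)]  # four actions (ordering assumed)


def step(pos, action):
    """Apply a move, clamping the agent to the arena walls."""
    dx, dy = MOVES[action]
    x = min(max(pos[0] + dx, 0), GRID - 1)
    y = min(max(pos[1] + dy, 0), GRID - 1)
    return (x, y)


def success_rate(policy, episodes=200, seed=0):
    """Fraction of episodes in which the agent reaches the safe within MAX_STEPS."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(episodes):
        pos = (0, 0)
        for _ in range(MAX_STEPS):
            pos = step(pos, policy(pos, rng))
            if pos == SAFE:
                wins += 1
                break
    return wins / episodes


def random_policy(pos, rng):
    """Baseline: pick any of the four moves uniformly at random."""
    return rng.randrange(len(MOVES))


def toward_safe_policy(pos, rng):
    """Stand-in for a trained agent: move along x first, then along y."""
    if pos[0] < SAFE[0]:
        return 3  # the (+1, 0) move
    return 0      # the (0, +1) move
```

Calling `success_rate(random_policy)` and `success_rate(toward_safe_policy)` gives two numbers that can be compared directly, which mirrors the deposited-briefcase metric described earlier.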
What we learned
We were both new to reinforcement learning, so it was a great experience learning from scratch how to implement the Q-learning algorithm and playing around with TensorFlow/Keras.
What's next for Daver
We'd like to run the training for longer (more like 1,000,000 episodes) and see how we can improve our model. Also, the three.js scene definitely needs more spookiness!