Title: Agent of Capitalism
Designing a simple agent to collect coins and attack opponents
Poster (with animated GIFs!):
https://docs.google.com/presentation/d/14aqBrEfeqA5ZnwtBVUfIzenN6ti2bm6B5Ak5Szxmobg/edit?usp=sharing
Final Reflection:
https://docs.google.com/document/d/1reU8fU0eTK9qTOVzGnplOtiSY_pbKGr022CAIsYuD2U/edit?usp=sharing
Runs (with enemy):
https://docs.google.com/document/d/1YrWdpqESJdfzKvV6yPFDiKcYOMl8j6ArDn3ogXZqYq8/edit?usp=sharing
Who:
Andrew Cooke - acooke1
Daniya Seitova - dseitova
Long Do - ldo6
Maxime Hendrikse Liu - mhendrik
Introduction:
We all agreed that, ideally, we would implement reinforcement learning for our final project. After surveying a few ideas, it seemed that most existing games already had reinforcement learning models built for them. By creating our own game, we could build not only an entirely new reinforcement learning model but also a cool 2D game.
Our goal is to develop an agent that can find an optimized policy for collecting coins while avoiding its enemy. We will train it on a fixed map initially; if that succeeds, we hope to expand the project so the agent can operate on random, procedurally generated maps.
Related Work:
Gene, our mentor TA, recommended a paper called "Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning," linked here: https://arxiv.org/pdf/1904.08129.pdf
Rogue-Gym presents a simple, procedurally generated single-player roguelike and trains reinforcement-learning agents to play it, studying how well those agents generalize.
Data:
We will be doing reinforcement learning on a simple game played on a 2D map. The data will be the agent's actions and observed states during play-throughs of the given map, from start to finish (an episode ends when all the coins have been collected, or when the agent fails to avoid the enemy).
Methodology:
Because we are still learning about Reinforcement Learning in class, this methodology may change as we learn more.
For now, we expect to experiment with both Deep Q-Learning and a REINFORCE policy network to determine what actions the agent will take.
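To make the Q-learning side concrete, here is a minimal tabular sketch of the core update rule. This is an illustration only, not our final design: the action names, learning rate, and discount factor are assumptions, and in the deep version the table would be replaced by a neural network that approximates Q(s, a), while the Bellman update below stays the same idea.

```python
import random

ACTIONS = ["up", "down", "left", "right"]  # assumed action set

def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.95):
    """One temporal-difference update: Q(s,a) += alpha * (target - Q(s,a))."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in ACTIONS)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)

def epsilon_greedy(q_table, state, epsilon=0.1):
    """Explore with probability epsilon; otherwise pick the greedy action."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table.get((state, a), 0.0))
```

The epsilon-greedy policy is what lets the agent keep discovering coins it has not seen yet instead of exploiting a partial policy too early.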
Metrics:
Our base goal is to have an agent that can move about the map and collect coins—we will test the model on several maps of escalating size and complexity.
Our target goal is to have the agent collect all the coins on the map effectively while avoiding or attacking the enemy, which will always move toward the player.
A stretch goal is to train the agent to collect coins and avoid the enemy on procedurally generated maps, rather than just a fixed map.
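The "enemy always moves toward the player" behavior in our target goal could be as simple as a greedy chase on the grid. The sketch below is one hypothetical version of that rule (the function name and the choice to close the larger axis gap first are our assumptions):

```python
def enemy_step(enemy, player):
    """Greedy chase: close the larger axis gap by one cell per turn."""
    ex, ey = enemy
    px, py = player
    if abs(px - ex) >= abs(py - ey):
        ex += (px > ex) - (px < ex)  # step one cell toward the player in x
    else:
        ey += (py > ey) - (py < ey)  # step one cell toward the player in y
    return (ex, ey)
```

A deterministic chaser like this keeps the environment simple, which should make it easier to tell whether the agent is actually learning avoidance.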
Ethics:
What implications does this project have, beyond the 2D game?
One interesting ethical question surrounding video games is whether they promote violence.
We have discussed first training the agent with only the options to move around the map (thus forcing it to learn to avoid the enemy rather than attack it). If we then give the agent the option to attack, we are interested to see whether the optimized policy includes attacking the enemy in order to collect the coins more quickly. If so, it may suggest that giving players violent options in video games can normalize those approaches beyond the game.
Why is Deep Learning an interesting approach to this problem?
By studying how the agent learns to use the actions provided to it, we can observe how a deep learning agent may optimize its behavior on other problems. If given the option to attack, will the model use it? The answer has implications for whether and how we should limit the actions available to other deep learning agents.
Division of Labor:
Long and Maxime will work on developing the game API—designing the map(s) for training and coding how, given an action, the game will generate a game state to return to the deep learning network.
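The game API described above boils down to a step function: take an action, advance the world, and hand a new state and reward back to the network. The sketch below is one possible shape for that interface; the reward values, field names, and state encoding are all assumptions for illustration, not the final design.

```python
COIN_REWARD = 1.0     # assumed reward for picking up a coin
DEATH_PENALTY = -10.0  # assumed penalty for being caught by the enemy

def step(world, action):
    """Apply one agent action and return (state, reward, done)."""
    moves = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
    dx, dy = moves.get(action, (0, 0))
    x, y = world["agent"]
    nx, ny = x + dx, y + dy
    # keep the agent inside the fixed map
    w, h = world["size"]
    if 0 <= nx < w and 0 <= ny < h:
        world["agent"] = (nx, ny)
    reward = 0.0
    if world["agent"] in world["coins"]:
        world["coins"].remove(world["agent"])
        reward += COIN_REWARD
    done = not world["coins"]  # all coins collected
    if world["agent"] == world["enemy"]:
        reward += DEATH_PENALTY
        done = True
    state = (world["agent"], world["enemy"], frozenset(world["coins"]))
    return state, reward, done
```

Returning a hashable state tuple makes it easy to plug the same API into either a tabular prototype or a network that embeds the state.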
Andrew and Daniya will develop the deep learning network, determining how to take in a game state, pass it through a reinforcement learning framework, and return the optimal next action.
Built With
- python
- tensorflow