Computer games are used to benchmark our progress on artificial intelligence. Games are dynamic environments requiring real-time responses, problem solving, and sometimes strategic decisions several time steps into the future. To make games more like the real world, we further require our algorithms to process all information from the screen pixels using vision, just as a human does, and to use the same control inputs that humans use. In the last few years, one particular technique, deep reinforcement learning, has been shown to master many computer games, including most Atari games, DOTA 2, and StarCraft II. Reinforcement learning (RL) is a class of algorithms for solving a Markov Decision Process. That is, the agent is given:
- S: the set of all states the agent could encounter. In the case of computer games, S is every configuration of pixels that the game engine can render.
- A: the set of all actions. In the case of computer games, this is the set of keyboard commands.
- T: a transition function that gives the probability of transitioning from one state to another when a particular action is chosen. We will assume that T is unknown.
- R: a reward function that assigns a score to every state in S. In computer games, reward is given based on the in-game score, for finishing a level, or for achieving an objective.

A reinforcement learning agent learns a policy π(s) = a, which indicates the optimal action a to take in state s, assuming the agent acts optimally from that point forward, in order to acquire the greatest expected reward. To learn the policy, RL agents rely on trial-and-error learning: the agent tries different actions in different states to discover which ones earn more reward.
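The trial-and-error loop described above can be sketched with tabular Q-learning, a simpler relative of the deep RL methods discussed here. The environment below is a hypothetical one-dimensional "corridor" (the names CorridorEnv and q_learning are illustrative, not part of CoinRun): the agent starts at cell 0 and earns reward only upon reaching the rightmost cell, loosely mirroring CoinRun's coin-at-the-end objective.

```python
import random
from collections import defaultdict

random.seed(0)  # fixed seed so the sketch is reproducible

class CorridorEnv:
    """Toy environment: the agent starts at cell 0 and must reach the
    last cell (the 'coin') to receive a reward of +1."""
    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action: 0 = move left, 1 = move right (clamped to the corridor)
        delta = 1 if action == 1 else -1
        self.state = max(0, min(self.length - 1, self.state + delta))
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done

def q_learning(env, episodes=300, alpha=0.5, gamma=0.9, eps=0.3):
    """Learn Q(s, a) by trial and error; return the greedy policy pi(s)."""
    q = defaultdict(float)  # Q-values for (state, action) pairs, default 0
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy: explore with probability eps, else exploit
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda act: q[(s, act)])
            s2, r, done = env.step(a)
            # temporal-difference update toward the bootstrapped target
            best_next = max(q[(s2, 0)], q[(s2, 1)])
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    # greedy policy pi(s) = argmax_a Q(s, a)
    return {s: max((0, 1), key=lambda act: q[(s, act)])
            for s in range(env.length)}

policy = q_learning(CorridorEnv())
```

After training, the greedy policy moves right from every non-terminal cell. Deep RL replaces the Q-table with a neural network so the same idea scales to states made of raw pixels.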
In this project, I will implement a deep reinforcement learning agent that plays the platform game CoinRun, in which the agent must walk and jump to collect a coin at the end of each level.