Inspiration

The inspiration for our project is the idea of using distributed computational networks to democratize reinforcement learning. Because practical reinforcement learning training generally depends on CUDA-capable hardware, we saw this as a major barrier to entry. CUDA is a parallel computing platform and programming model developed by Nvidia for its GPUs (graphics processing units). It lets developers write software that takes advantage of the parallel processing power of Nvidia GPUs, which can yield significant performance improvements for certain workloads, such as machine learning and scientific simulations.

To use CUDA, you need a system with a CUDA-capable Nvidia GPU. The exact hardware requirements depend on the specific application and workload: some applications need a high-end GPU with a lot of memory, while others can run on a more modest GPU.

In general, if you are planning to use CUDA for computationally intensive tasks, it is recommended to have a GPU with a high number of CUDA cores and a large amount of memory; depending on the size and complexity of the workload, multiple GPUs or a GPU cluster may be needed to reach the desired performance. The possibility of fostering a community for shared computation, together with the interesting experiment of training a Mario AI, made this project extremely inspiring and motivating for us.
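Before dispatching work to a machine in a network like ours, a coordinator might first check whether CUDA-capable hardware is even present. A minimal stdlib sketch of such a check (the function name is ours, and it is only a heuristic based on the `nvidia-smi` driver utility, not a full CUDA capability test):

```python
import shutil
import subprocess

def cuda_gpu_available() -> bool:
    """Return True if an Nvidia GPU is visible via the nvidia-smi tool.

    Lightweight heuristic: checks only that the Nvidia driver utility is
    installed and can enumerate at least one GPU.
    """
    if shutil.which("nvidia-smi") is None:
        return False  # Nvidia driver tools not installed
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # "-L" lists detected GPUs, one per line
            capture_output=True, text=True, timeout=10,
        )
    except (subprocess.SubprocessError, OSError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout

print(cuda_gpu_available())
```

On a machine without an Nvidia driver this simply returns `False`, so a coordinator could skip that worker rather than fail mid-training.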

What it does

The project is a distributed computational network that runs reinforcement learning across multiple machines to play a segment of Super Mario Bros. On top of this, our system uses an algorithm we developed, called the piggy-back method, to find a successful solution for completing that segment of the game.
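The write-up does not spell out the piggy-back method's internals; one plausible reading — each machine learns an action sequence for its own level segment, and later segments piggy-back on the earlier ones by replaying them in order — can be sketched as follows (all names and the action labels are illustrative, not our actual implementation):

```python
from typing import Dict, List

Action = str  # e.g. "RIGHT", "JUMP" — placeholder action labels

def piggyback_merge(segment_solutions: Dict[int, List[Action]]) -> List[Action]:
    """Chain per-segment action sequences into one full-level solution.

    segment_solutions maps a segment index (0, 1, 2, ...) to the action
    sequence found to clear that segment. The merged run replays each
    segment's solution in segment order.
    """
    merged: List[Action] = []
    for index in sorted(segment_solutions):
        merged.extend(segment_solutions[index])
    return merged

# Example: two segments solved on two different machines.
solutions = {0: ["RIGHT", "RIGHT", "JUMP"], 1: ["RIGHT", "JUMP", "RIGHT"]}
print(piggyback_merge(solutions))
# -> ['RIGHT', 'RIGHT', 'JUMP', 'RIGHT', 'JUMP', 'RIGHT']
```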

How we built it

We built the product from four components. The distributed network is coordinated by a desktop app: each machine works on maximizing its own level segment, and the solved segments are then merged into one complete network, piggybacking off the information learned in the previous segments.
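The coordination logic of the desktop app is not shown here; a minimal sketch of the fan-out/merge shape it describes — farm segments out to parallel workers, then merge the results in order — might look like this (the worker function is a stand-in, not our real training code):

```python
from concurrent.futures import ThreadPoolExecutor

SEGMENTS = [0, 1, 2, 3]  # illustrative: four level segments, one per worker

def train_segment(segment: int) -> list:
    """Stand-in for per-segment training: returns that segment's 'solution'."""
    return [f"seg{segment}-step{i}" for i in range(2)]

# Fan out: each worker solves its own segment independently.
with ThreadPoolExecutor(max_workers=len(SEGMENTS)) as pool:
    per_segment = list(pool.map(train_segment, SEGMENTS))

# Merge in segment order, piggybacking each segment on the previous results.
full_run = [step for solution in per_segment for step in solution]
print(full_run[0], len(full_run))
```

In the real system the workers would be separate machines running actual reinforcement learning, but the fan-out/merge structure is the same.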

Challenges we ran into

One of the biggest challenges we faced was the limitations of the software we were using. To properly implement our idea of clustering GPUs to jointly run a machine learning algorithm, we would ideally need more control over the problem itself. Since we did not have full control over Mario's environment, our plan of segmenting the levels and attacking each partition from a different GPU became essentially impossible.

Time was also a huge roadblock, as reinforcement learning algorithms need a lot of training time before they can achieve their goal.

Accomplishments that we're proud of

Some of the accomplishments we are proud of are tackling machine learning for the very first time and learning how to use AWS EC2. In this short period we grasped major workflows in the data science community, using tools such as TensorFlow, Pandas, NumPy, Stable-Baselines3, and a whole host of other technologies.

What we learned

In the last 24 hours we learned how to use reinforcement learning to develop AI models that learn to play a video game from the ground up. Beyond that, though, we learned how to approach complex problems through machine learning.
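The core idea behind learning a game "from the ground up" can be shown with tabular Q-learning on a toy problem. This is not our Mario code (which used Stable-Baselines3 on a real emulator environment); it is a self-contained sketch of the same learning principle, with a made-up one-dimensional "level":

```python
import random

# A toy 1-D "level": the agent starts at cell 0 and must reach cell 4.
# Actions: 0 = step left, 1 = step right. Reaching the goal gives +1 reward.
GOAL, N_STATES, ACTIONS = 4, 5, (0, 1)

def train(episodes: int = 500, alpha: float = 0.5, gamma: float = 0.9,
          epsilon: float = 0.1, seed: int = 0) -> list:
    """Tabular Q-learning on the toy corridor; returns the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Epsilon-greedy action selection: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
            reward = 1.0 if next_state == GOAL else 0.0
            # Q-learning update rule.
            q[state][action] += alpha * (
                reward + gamma * max(q[next_state]) - q[state][action]
            )
            state = next_state
    return q

q_table = train()
# After training, the greedy action in every non-goal state should be "right" (1).
policy = [0 if left > right else 1 for left, right in q_table[:GOAL]]
print(policy)
# -> [1, 1, 1, 1]
```

Replacing the toy corridor with game frames and the Q-table with a neural network is, at a high level, the step from this sketch to the deep RL we used on Mario.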

What's next for Mario Nexus

Making Mario run and jump was just a first step in machine learning for us. In the future we would like to apply similar AI models to real-world problems. Given our application of the PiggyBack Method in a game-theory context, we would like to apply distributed computation to a whole host of low-cost simulations.
