Many people want to learn more about machine learning and try running things hands-on, but don't have access to the necessary resources. On the other hand, many people with powerful GPUs don't use them continuously, and they sit idle some of the time. We wanted to bridge this gap.

What it does

GPUppy allows you to run or train your machine learning models on other people's powerful GPUs when they are idle.

How we built it

We built a distributed system to execute these machine learning workloads across a set of GPU-enabled devices. On the client side, we created wrapper scripts to package the context and source code and push it to a centralized scheduling server. The scheduling server pushes these tasks to the set of idle workers. The workers stream the output back as the task executes, using websockets for real-time communication. Additionally, the client uses rsync to pull the model artifacts off of the server.
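As a rough illustration of the client-side packaging and the scheduler's idle-worker dispatch described above, here is a minimal Python sketch. All names (`package_context`, `Scheduler`, the worker IDs) are hypothetical stand-ins, not the actual GPUppy code; the real system moves these pieces over websockets and rsync rather than in-process.

```python
import io
import tarfile
from collections import deque

def package_context(src_dir: str) -> bytes:
    """Bundle a source directory into an in-memory tar.gz archive,
    roughly how a client wrapper might package code before pushing
    it to the scheduling server. (Hypothetical helper.)"""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        tar.add(src_dir, arcname=".")
    return buf.getvalue()

class Scheduler:
    """Toy scheduler: queue incoming tasks and hand each one to the
    next idle worker, re-dispatching as workers report back idle."""
    def __init__(self, workers):
        self.idle = deque(workers)
        self.tasks = deque()
        self.assignments = []  # (worker, task) pairs, in dispatch order

    def submit(self, task):
        self.tasks.append(task)
        self._dispatch()

    def worker_done(self, worker):
        # Worker finished its task and is idle again.
        self.idle.append(worker)
        self._dispatch()

    def _dispatch(self):
        # Pair queued tasks with idle workers until one side runs out.
        while self.tasks and self.idle:
            self.assignments.append((self.idle.popleft(), self.tasks.popleft()))
```

In the real system the dispatch step would send the packaged archive to the chosen worker over a websocket and stream stdout back, with the client later pulling artifacts via rsync.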

Challenges we ran into

This project had a lot of moving parts, and getting them all to work together correctly was difficult.

Accomplishments that we're proud of

Making it work relatively well.

What we learned

NVIDIA is... interesting

What's next for GPUppy

  • Use Sia for job and artifact storage
  • Use blockchain technology for more secure billing