Inspiration
I am a grad student doing research in Brown University's Humans 2 Robots lab. My research trains dexterous robot hands to learn complex manipulation skills from human demonstrations. Currently, it is less the physical constraints of the robot that hold it back from exhibiting consistently intelligent behavior than the scale and variety of the data it is trained on. In comes Pepper.
What it does
Pepper matches robotics companies with the only resource that could possibly provide the scale they need: the public. Robotics companies can publish "Bounties" for demonstrations they need, and users get paid for uploading videos of themselves performing the requested tasks. These can be anything from chopping onions to changing a tire.
How we built it
We focused our engineering efforts on executing two things exceptionally well:
1) Quality assurance. We built a custom VLM pipeline to verify that user-uploaded videos are consistent with the task at hand. This ensures that only high-quality data is available for training.
2) Augmentation. We want to make the most of the data we have. Once a video is validated, we generate semantic segmentations, depth maps, and 3D hand meshes. We also multiply the number of demonstrations we receive by applying scene-consistent visual randomizations with NVIDIA's Cosmos world models.
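The quality-assurance gate described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the production pipeline: `score_frame` is a hypothetical stand-in for a VLM call that returns how well a frame matches the bounty's task (0.0 to 1.0), and the function names, sample count, and acceptance threshold are all made up for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List


def sample_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick evenly spaced frame indices so the check covers the whole clip."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        return list(range(total_frames))
    step = total_frames / num_samples
    # Take the middle frame of each of the num_samples equal-length windows.
    return [int(step * i + step / 2) for i in range(num_samples)]


@dataclass
class Verdict:
    accepted: bool
    mean_score: float
    frames_checked: int


def verify_submission(
    total_frames: int,
    task: str,
    score_frame: Callable[[int, str], float],  # hypothetical VLM scorer
    num_samples: int = 8,      # assumed value, not from the project
    threshold: float = 0.7,    # assumed value, not from the project
) -> Verdict:
    """Accept a video only if the sampled frames match the task on average."""
    indices = sample_frame_indices(total_frames, num_samples)
    if not indices:
        return Verdict(False, 0.0, 0)
    scores = [score_frame(i, task) for i in indices]
    avg = mean(scores)
    return Verdict(avg >= threshold, avg, len(indices))


if __name__ == "__main__":
    # Stub scorer for demonstration; a real one would query a VLM per frame.
    print(verify_submission(300, "chop an onion", lambda idx, task: 0.9))
```

Averaging over evenly spaced frames is only one possible aggregation. A stricter gate could instead require every sampled frame to pass, which would reject clips that drift off-task partway through.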
Challenges we ran into
With such a feature-rich platform, we had a lot of tough decisions to make about scope and priority. However, by keeping a laser focus on our main differentiators, we ultimately produced a spiky MVP.
Accomplishments that we're proud of
Our back-end inference pipelines combine bespoke production deployments of multiple bleeding-edge models into a sound, reliable, and scalable framework. This was no small feat of engineering, and we are exceptionally proud of the result.
What we learned
Focusing on your winning areas results in a much cooler product than packing in features.
What's next for Pepper
Pepper is ready to build. Fast. We are actively working to scale up our infrastructure to support more customers, and we are seeking out marketing affiliations with first-person POV social media creators on TikTok and Instagram. These creators have already proven our thesis that people can make money from recordings of their day-to-day lives.
Built With
- amazon-web-services
- fastapi
- nextjs
- paperspace
- python
- railway
- saturncloud
- vercel
