Inspiration
We have seen a lot of cool demos with generative AI, but we rarely see them in the world around us. We were promised robot butlers, drivers, and cooks, but they are not here yet. We think their time has come. Comma bodies (the head on a hoverboard we are using for this hackathon) and other embodied bots like them are the future!
What it does
Billy is an intelligent robot. For this hackathon, it is a photographer and DJ. It understands natural-language speech, interprets it, decides which action is most relevant, and performs that action.
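To make the "decide and perform" step concrete, here is a minimal sketch of how a language model's reply could be turned into robot actions. It assumes the model is prompted to answer with a JSON list of actions. The action names (`take_photo`, `play_song`, `say`), the `name`/`args` fields, and the handlers are all hypothetical and chosen for illustration; they are not taken from Billy's code.

```python
import json

# Hypothetical action names. The write-up only says Billy takes photos and plays music.
ALLOWED_ACTIONS = {"take_photo", "play_song", "say"}


def parse_actions(model_reply):
    """Parse a JSON list of actions and drop anything not on the allow-list."""
    try:
        candidates = json.loads(model_reply)
    except json.JSONDecodeError:
        return []  # Unparseable reply: do nothing rather than guess.
    if not isinstance(candidates, list):
        return []
    return [
        action for action in candidates
        if isinstance(action, dict) and action.get("name") in ALLOWED_ACTIONS
    ]


def dispatch(actions, handlers):
    """Run each action through its handler and collect the results in order."""
    return [handlers[a["name"]](a.get("args") or {}) for a in actions]


# Stand-in handlers; a real robot would trigger the camera, speaker, and so on.
handlers = {
    "take_photo": lambda args: "photo taken",
    "play_song": lambda args: "playing " + args.get("title", "a song"),
    "say": lambda args: "saying: " + args.get("text", ""),
}

# The unknown "fly" action is filtered out before dispatch.
reply = '[{"name": "play_song", "args": {"title": "Happy"}}, {"name": "fly"}]'
print(dispatch(parse_actions(reply), handlers))
```

Filtering against an allow-list means the robot ignores anything the model invents that has no handler, instead of crashing or acting unpredictably.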
How we built it
The robot we used is a comma body (sold by comma.ai). At the beginning of the hackathon, all it could do was stand. We use the camera to capture the scene and run a state-of-the-art YOLO model on the video stream. We detect humans, plan a path to a human, and stop in front of them. Then we listen to the human's instructions and use OpenAI Whisper to transcribe them. Next, we use OpenAI GPT-3.5 to convert the text into a set of relevant actions. When the user asks for a photograph, we compose the scene and take a picture. We then send this picture over the AirPrint protocol to a Canon printer and give you a physical photo! If the user asks for a song, Billy plays a song and dances with you! Throughout the entire encounter, Billy converses naturally using ElevenLabs speech synthesis.
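The "go to a human and stop in front of them" step can be illustrated with a simple steering rule based on the detector's bounding box. This is a sketch under stated assumptions, not the team's path planner: it turns toward the box centre and stops once the box fills a chosen fraction of the frame height. The function name, gains, and thresholds are all hypothetical.

```python
def approach_command(box, frame_width, frame_height,
                     stop_fraction=0.8, turn_gain=1.0, cruise_speed=0.5):
    """Return (forward, turn), each in [-1, 1], for one detected person.

    box is (x_min, y_min, x_max, y_max) in pixels. Positive turn means the
    person is right of centre. All gains and thresholds are illustrative
    assumptions, not values from the project.
    """
    x_min, y_min, x_max, y_max = box

    # Horizontal offset of the box centre from the image centre, scaled to [-1, 1].
    center_x = (x_min + x_max) / 2.0
    offset = (center_x - frame_width / 2.0) / (frame_width / 2.0)
    turn = max(-1.0, min(1.0, turn_gain * offset))

    # A taller box means a closer person; stop once it fills enough of the frame.
    height_fraction = (y_max - y_min) / frame_height
    if height_fraction >= stop_fraction:
        return 0.0, 0.0

    # Slow down as the person gets closer.
    forward = cruise_speed * (1.0 - height_fraction / stop_fraction)
    return forward, turn


# A person slightly right of centre in a 640x480 frame, still some distance away.
print(approach_command((400, 100, 500, 340), 640, 480))
```

Using box height as a stand-in for distance avoids needing a depth sensor, at the cost of being sensitive to how tall the person is and whether they are fully in frame.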
Challenges we ran into
We ran into a whole bunch of problems integrating the different conversational components with the physical movement of the body. Getting human detection to run at 20 FPS also took a lot of engineering work.
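The write-up does not say how the 20 FPS target was reached. One common technique for keeping a detector real-time, shown here purely as an assumption, is to have the camera thread overwrite a single "latest frame" slot so the detector always works on the newest image and stale frames are dropped. The class name `LatestFrame` and its methods are hypothetical.

```python
import threading


class LatestFrame:
    """Holds only the most recent frame; older unread frames are dropped."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None
        self.dropped = 0  # Frames overwritten before the detector read them.

    def put(self, frame):
        """Called by the camera thread for every new frame."""
        with self._lock:
            if self._frame is not None:
                self.dropped += 1  # Previous frame was never consumed.
            self._frame = frame

    def take(self):
        """Called by the detector; returns the newest frame or None."""
        with self._lock:
            frame, self._frame = self._frame, None
            return frame


slot = LatestFrame()
for name in ("frame-1", "frame-2", "frame-3"):
    slot.put(name)  # The camera produced three frames before the detector was ready.
print(slot.take(), slot.dropped)
```

With this pattern a slow detector lowers the effective frame rate but never builds up a backlog, so the robot reacts to what is in front of it now and not to what was there a second ago.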
Accomplishments that we're proud of
We are proud of making a relatively useful hack that combines physical robotics and generative AI. As far as we know, not many projects have done this.
What we learned
We learned a great deal about prompt engineering, audio processing, and real-time robotics.
What's next for My Friend Billy
We plan to actually use Billy as a photographer at an event later this month. We also want to build a great developer experience for robotics development.