Inspiration
Inspired by the book and movie Project Hail Mary, this is Rocky's POV of Earth. When I think of chaotic, I think of Rocky. This is how I imagine he would behave when coming across different animals and plants on Earth.
What it does
Rocky drops a snack from his box to encourage a friend to come closer. Once the friend approaches the computer, the video and audio are scanned and Rocky describes what he thinks he is seeing: in his voice, with his energy, and with his complete and total confidence that everything on Earth is amaze amaze amaze.
How we built it
Rocky's POV is a Swift macOS application. Video understanding comes from the Google Cloud Video Intelligence API, and narration is generated via OpenRouter (shoutout to Gemini for running out of credits). In the snack box there is an Arduino UNO and a DC motor that spins only when there is no new friend in front of the screen. The Arduino code is written in C++. Finally, the Rocky voice comes from an open source TTS project: https://gist.github.com/pedramamini/fa5f6ef99dae79add220188419230642. Everything is stitched together through a Python bridge that connects the Swift app, the Arduino, and the voice synthesis.
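To give a feel for the snack box side of the bridge, here is a minimal sketch of the "spin only when there is no friend" rule. It is not the actual project code. The one-byte commands (`b"1"` to spin, `b"0"` to stop), the `SnackBoxBridge` class, and the `motor_command` helper are all made up for illustration, and the port is assumed to be any object with a `write()` method (for example a serial connection to the Arduino).

```python
import io

# Hypothetical one-byte protocol; the real Arduino sketch may use something else.
MOTOR_ON = b"1"   # spin the snack motor
MOTOR_OFF = b"0"  # stop the snack motor


def motor_command(friend_present: bool) -> bytes:
    """The motor spins only when no friend is in front of the screen."""
    return MOTOR_OFF if friend_present else MOTOR_ON


class SnackBoxBridge:
    """Forwards motor commands to a port, skipping repeats of the same state."""

    def __init__(self, port):
        self.port = port        # assumed: any object with a write(bytes) method
        self.last_sent = None   # last command written, so we only send changes

    def update(self, friend_present: bool) -> bool:
        """Send a command if the state changed. Returns True if a byte was written."""
        command = motor_command(friend_present)
        if command == self.last_sent:
            return False
        self.port.write(command)
        self.last_sent = command
        return True


if __name__ == "__main__":
    # Stand-in for the serial port so the sketch runs without hardware.
    fake_port = io.BytesIO()
    bridge = SnackBoxBridge(fake_port)
    for friend_seen in [False, False, True, True, False]:
        bridge.update(friend_seen)
    print(fake_port.getvalue())  # b'101'
```

Sending a byte only when the state changes keeps the link to the Arduino quiet while the vision side reports the same result frame after frame.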
Challenges we ran into
Getting a squirrel to approach my laptop. I spent a full hour in the park with a computer on the grass and peanuts in a cardboard box. Everyone thought I was crazy. The squirrels were suspicious. It turns out a cardboard box full of peanuts looks exactly like a trap, so they were not interested in the snack giver.
Accomplishments that we're proud of
I made friends with a squirrel.
What we learned
Squirrels are smarter than they look, and a Python bridge can hold together a surprisingly chaotic stack if you believe in it hard enough.
What's next for Rocky's POV
Bring it to iOS so Rocky can go anywhere, improve the snack box hardware so animals actually approach it (a less trap-looking design would help), and try Rocky out on other animals.
