Categories
DEI, AI/ML, Hardware, First time hackers
Inspiration
We were inspired to create Visionaria after seeing how Toph Beifong from Avatar: The Last Airbender could "see" by feeling vibrations despite being blind.
What it does
Visionaria uses OpenAI's GPT-4o model to analyze the view in front of the user and describe it back to them. It even allows for special requests, e.g., asking where a specific object is.
How we built it
We built Visionaria with a Raspberry Pi, a microphone, a webcam, and a headset. The code on the Raspberry Pi listens through the microphone for the wake word "Jarvis", then takes a picture with the webcam and analyzes it.
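The listen → capture → analyze loop might be sketched roughly like this. Names such as `wait_for_wake_word`, `capture_frame`, and `speak` are illustrative placeholders, not our actual device code; only the message shape for a GPT-4o vision request is taken from OpenAI's chat completions format:

```python
import base64
import json

def encode_image(jpeg_bytes: bytes) -> str:
    # GPT-4o accepts images inline as base64-encoded data URLs
    return "data:image/jpeg;base64," + base64.b64encode(jpeg_bytes).decode()

def build_vision_messages(prompt: str, image_data_url: str) -> list:
    # A chat message that pairs the spoken request with the webcam frame
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_data_url}},
        ],
    }]

if __name__ == "__main__":
    # On the Pi, the loop would look roughly like (placeholders commented out):
    #   wait_for_wake_word("jarvis")          # microphone listening
    #   frame = capture_frame()               # webcam snapshot as JPEG bytes
    #   messages = build_vision_messages("Describe what is in front of me.",
    #                                    encode_image(frame))
    #   reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    #   speak(reply.choices[0].message.content)  # text-to-speech into the headset
    demo = build_vision_messages("Describe the scene.", encode_image(b"\xff\xd8"))
    print(json.dumps(demo)[:72])
```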
Challenges we ran into
The model would sometimes get confused when asked certain questions about the image, but we were able to fix this through prompt engineering.
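A typical prompt-engineering fix for this kind of confusion is a system prompt that keeps the model grounded in what the image actually shows. The wording below is a hypothetical example of that technique, not our exact prompt:

```python
# Hypothetical system prompt illustrating constraints that help a vision
# model stay grounded in the image instead of guessing.
SYSTEM_PROMPT = (
    "You are a sighted assistant for a blind user. "
    "Describe only what is visible in the image. "
    "If asked about an object that is not visible, say so plainly "
    "instead of guessing. Keep answers short and easy to speak aloud."
)

def build_system_message() -> dict:
    # Sent as the first message in every chat.completions request
    return {"role": "system", "content": SYSTEM_PROMPT}
```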
Accomplishments that we're proud of
We were able to support special requests by altering our API call based on the user's speech.
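One way to fold the special request into the API call is to splice the transcribed speech into the prompt, falling back to a general description when the user only says the wake word. This is a minimal sketch; `transcript` stands in for whatever the speech recognizer returned:

```python
DEFAULT_PROMPT = "Describe the scene in front of me."

def prompt_from_speech(transcript: str) -> str:
    # Drop the wake word and use the remainder as the request, if any;
    # otherwise fall back to a general scene description.
    words = transcript.lower().split()
    if words and words[0] == "jarvis":
        words = words[1:]
    request = " ".join(words).strip()
    return request.capitalize() if request else DEFAULT_PROMPT
```

For example, `prompt_from_speech("Jarvis where are my keys")` yields "Where are my keys", while a bare "Jarvis" falls back to the default description prompt.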
What we learned
We learned a lot about how powerful OpenAI's vision models can be. We also gained a lot of experience with prompt engineering.
What's next for Visionaria
We hope to reduce the time it takes to process the images and reduce the total size and amount of wires associated with the device.
Built With
- openai
- python
- raspberry-pi