Inspiration
We thought about how visually impaired people have a very limited perception range, less than 2 m from their body at all times. To gather information about a new environment, they need to physically explore the space, which is impractical and time-consuming.
What it does
The system consists of 2 cameras attached to a hat which the user wears. These cameras detect objects in the environment, acting as the user's visual guide. The user can ask "What's around?", and the system will respond with a description of the environment as seen by the 2 cameras. The user can ask "Where are the chairs?" and the system will respond with positions of chairs relative to the user.
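As a rough illustration of how a question like "Where are the chairs?" could be turned into a spoken answer, here is a minimal Python sketch. It is not our actual code: the detection dictionaries, the `azimuth_deg` and `distance_m` fields, the 15-degree threshold, and both function names are assumptions made for this example.

```python
def describe_direction(azimuth_deg, ahead_threshold_deg=15.0):
    """Turn a horizontal angle into a coarse spoken direction.

    Negative angles are to the user's left, positive to the right.
    The 15-degree threshold is an assumed value, not a measured one.
    """
    if azimuth_deg < -ahead_threshold_deg:
        return "to your left"
    if azimuth_deg > ahead_threshold_deg:
        return "to your right"
    return "ahead of you"


def answer_where(label, detections):
    """Build a sentence listing every detection with the given label.

    `detections` is a hypothetical list of dicts with the keys
    "label", "azimuth_deg" and "distance_m".
    """
    matches = [d for d in detections if d["label"] == label]
    if not matches:
        return f"No {label} detected."
    # Mention the closest objects first.
    matches.sort(key=lambda d: d["distance_m"])
    parts = [
        f"{describe_direction(d['azimuth_deg'])}, "
        f"about {d['distance_m']:.1f} meters away"
        for d in matches
    ]
    return f"Found {len(matches)} labeled {label}: " + "; ".join(parts) + "."


if __name__ == "__main__":
    example = [
        {"label": "chair", "azimuth_deg": -40.0, "distance_m": 2.5},
        {"label": "chair", "azimuth_deg": 5.0, "distance_m": 1.2},
        {"label": "table", "azimuth_deg": 30.0, "distance_m": 2.0},
    ]
    print(answer_where("chair", example))
```

The resulting sentence would then be handed to whatever text-to-speech step the system uses.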
How we built it
This project uses YOLOv8 to run object detection on two cameras. We used a pipeline of Python classes to detect objects, filter bounding boxes, translate the boxes into 3D space, and play spatial audio to the user. Additionally, we built the camera mounts and the box containing the NVIDIA Jetson, which runs all of our code.
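The step from a bounding box to a sound direction can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not our exact implementation: it assumes a simple pinhole camera model, a 60-degree horizontal field of view, a per-camera mounting angle (`camera_yaw_deg`), and constant-power stereo panning as a stand-in for full spatial audio. All function and parameter names are hypothetical.

```python
import math


def bbox_to_azimuth(x_min, x_max, image_width, hfov_deg=60.0, camera_yaw_deg=0.0):
    """Horizontal angle (degrees) of a bounding box center relative to the user.

    Negative is left, positive is right. `hfov_deg` and `camera_yaw_deg`
    are assumed values describing the camera and how it is mounted.
    """
    center_x = (x_min + x_max) / 2.0
    # Focal length in pixels, derived from the horizontal field of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    offset_px = center_x - image_width / 2.0
    return camera_yaw_deg + math.degrees(math.atan2(offset_px, focal_px))


def pan_gains(azimuth_deg, max_angle_deg=90.0):
    """Constant-power (left, right) gains for a given azimuth.

    Angles beyond +/- max_angle_deg are clamped to fully left or right.
    """
    clamped = max(-max_angle_deg, min(max_angle_deg, azimuth_deg))
    # Map [-max, +max] onto [0, pi/2] so that left^2 + right^2 == 1.
    theta = (clamped / max_angle_deg + 1.0) * math.pi / 4.0
    return math.cos(theta), math.sin(theta)


if __name__ == "__main__":
    # A box on the right half of a 640-pixel-wide frame.
    angle = bbox_to_azimuth(480, 560, 640)
    left, right = pan_gains(angle)
    print(f"azimuth={angle:.1f} deg, left gain={left:.2f}, right gain={right:.2f}")
```

Scaling a mono cue by these two gains makes the sound appear to come from the direction of the detected object; with two cameras, each camera would pass its own `camera_yaw_deg` so that both report angles in the same user-centered frame.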
Challenges we ran into
The biggest challenge we tackled during this hackathon was developing on the NVIDIA Jetson. Getting the file system permissions right and installing the correct dependencies was especially challenging, as we had little documentation to guide us. We also spent considerable time planning how to mount the cameras so that they were stable for object detection and had no overlap.
Accomplishments that we're proud of
Translating the object detections into spatial audio is something we are especially proud of, as it required integrating multiple complex subsystems.
What we learned
We once again learned how hard integration is, and how much there is to learn about Linux!
Built With
- google-cloud
- opencv
- python
- ultralytics