VisionVoyager

Inspiration

As a team passionate about bringing assistive technology to those in need, we decided to build our own "Pathfinder" Rover to augment human capabilities.

What it does

Introducing the VisionVoyager, an assistive technology that combines generative AI and neural-computer interfaces to augment your vision, helping you and people with limited range of motion explore new terrain and find target objects hands-free. You can input text or an image of the desired target object, and our multimodal machine learning pipeline will help you find it efficiently using a portable camera mounted on a smart rover. The rover can be controlled with nothing more than blinks, so you can explore new territory in search of your target object, wherever your mind takes you.

This product not only serves as a framework for augmenting human vision at any physical location but, more importantly, uses recent advances in technology to overcome the body's physical limitations. It has many potential applications, from assistive technologies to medical imaging, and from entertainment to space exploration.

How we built it

We are excited to share that we have built a multi-front project, with a tech stack that spans generative AI and brain-computer interface electronics, to make this concept a reality! We started with the user's convenience and needs in mind, and enabled both text and image input of the desired target object using a state-of-the-art image-text model (CLIP). We researched and deployed an advanced text-prompted object detection model (GroundingDINO) to quickly detect the desired object in a video feed. We considered and prototyped with many camera and microcontroller options, from the Leap Motion to Logitech webcams to the ESP32-CAM, and from Arduino to Raspberry Pi to ESP32, and ultimately chose the ESP32-CAM with Arduino for the smallest footprint and ease of integration. We studied probability theory and leading biosignals literature to choose classification strategies for the EOG signals acquired from the Muse headset, so that the robot car could be operated as desired.
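To make the blink-control idea concrete, here is a minimal sketch of one way blinks could be picked out of a biosignal trace and turned into rover commands. It is not our actual classifier: the amplitude threshold, the duration limits, the 100 Hz sample rate used in the example, the blink-count-to-command mapping, and the names detect_blinks and blinks_to_command are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Blink:
    start: float  # seconds from the beginning of the trace
    end: float    # seconds from the beginning of the trace


def detect_blinks(
    samples: Sequence[float],
    sample_rate_hz: float,
    threshold: float,
    min_duration_s: float = 0.05,
    max_duration_s: float = 0.5,
) -> List[Blink]:
    """Treat each run of samples whose absolute amplitude exceeds
    `threshold` as a candidate blink, and keep only runs whose duration
    falls inside [min_duration_s, max_duration_s].

    All numeric defaults are hypothetical and would need tuning
    against real headset recordings.
    """
    blinks: List[Blink] = []
    run_start: Optional[int] = None
    for i, value in enumerate(samples):
        above = abs(value) > threshold
        if above and run_start is None:
            run_start = i  # a candidate blink begins
        elif not above and run_start is not None:
            duration = (i - run_start) / sample_rate_hz
            if min_duration_s <= duration <= max_duration_s:
                blinks.append(
                    Blink(run_start / sample_rate_hz, i / sample_rate_hz)
                )
            run_start = None
    # A run still open at the end of the trace is ignored as incomplete.
    return blinks


def blinks_to_command(blinks: Sequence[Blink], window_s: float = 1.0) -> Optional[str]:
    """Count the blinks that start within `window_s` of the first blink
    and map the count to a rover command (hypothetical mapping)."""
    if not blinks:
        return None
    first = blinks[0].start
    count = sum(1 for b in blinks if b.start - first <= window_s)
    return {1: "forward", 2: "left", 3: "right"}.get(count, "stop")


if __name__ == "__main__":
    # Synthetic trace at an assumed 100 Hz: two 0.1 s pulses.
    trace = [0.0] * 10 + [200.0] * 10 + [0.0] * 20 + [200.0] * 10 + [0.0] * 10
    found = detect_blinks(trace, sample_rate_hz=100, threshold=100.0)
    print(len(found), blinks_to_command(found))
```

A fixed threshold is the simplest possible choice. A probabilistic classifier, as suggested by the literature we read, would replace the threshold test while leaving the command mapping unchanged.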

Challenges we ran into

trying and debugging multiple cameras, microcontrollers, and our rover
communication protocols for
researching and choosing GenerativeAI models to run
package installation
venue Wi-Fi

Accomplishments that we're proud of

We are proud to have completed such a multi-domain project in the assistive technology realm, from researching and deploying the latest generative AI models to integrating them with electronics such as a custom-made, eye-controlled robot car and cameras on microcontrollers. We are also proud to have remained flexible, enjoyed the process, and pivoted our way into building a great product!

What we learned

Our team had to pivot multiple times to different technologies in response to challenges during development. Along the way, we learned a lot more about generative AI architectures, electronics, microcontrollers, robotic control, neural and biosignal interfacing, and probability theory.

What's next for VisionVoyager

We will continue iterating to reduce classification time, make biosignal rover control more consistent, and voyage toward our vision of a world made better by assistive technologies.