Inspiration
We wanted to build something impactful that would genuinely help people. We also wanted to incorporate modern technologies such as AI to achieve things that were previously impossible or very difficult to do.
What it does
Lumina has two main modes. The first is Navigate, an intelligent computer-vision mode for visually impaired users: a convolutional neural network analyses frame-by-frame video of the user's surroundings and provides simple, coherent speech feedback about the scene ahead, warning about objects that may be blocking the path and advising how to navigate around them. Navigate also uses centroid tracking and Kalman filtering to infer how objects in the scene are moving and to warn the user about potential future hazards. The second mode is Explain, a photo-capture service that combines the convolutional neural network with Google Gemini to analyse a captured scene and provide detailed feedback about exactly what is in front of the user.
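The core of centroid tracking is matching each tracked object's last known centre to the nearest detection in the new frame, so that an object keeps its identity across frames. A minimal sketch of that matching step (function name and the distance threshold are our illustrative assumptions, not Lumina's actual code):

```python
import math

def match_centroids(tracks, detections, max_dist=50.0):
    """Greedily match existing track centroids to new detection centroids.

    tracks and detections are lists of (x, y) pixel tuples. Returns a dict
    mapping track index -> detection index; tracks with no detection within
    max_dist pixels are left unmatched (candidates for removal).
    """
    matches = {}
    used = set()
    for ti, (tx, ty) in enumerate(tracks):
        best, best_d = None, max_dist
        for di, (dx, dy) in enumerate(detections):
            if di in used:
                continue
            d = math.hypot(dx - tx, dy - ty)
            if d < best_d:
                best, best_d = di, d
        if best is not None:
            matches[ti] = best
            used.add(best)
    return matches

# Two tracked objects, two detections arriving in a different order:
print(match_centroids([(100, 100), (300, 200)], [(305, 198), (104, 103)]))
# -> {0: 1, 1: 0}
```

In a full pipeline the matched positions would then feed a Kalman filter, which smooths the noisy per-frame centroids and estimates velocity for the hazard prediction described above.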
How we built it
We built the frontend in Next.js and the backend in Python, with the two communicating over a FastAPI HTTP API. We then exposed the web app through an ngrok tunnel so we could access the build on mobile devices.
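As a sketch of the client-to-backend hop, each captured camera frame could be sent to the backend as a JPEG POST body roughly like this (the endpoint path, port, and content type are illustrative assumptions, not the project's actual API, and the real frontend would do this from the browser in JavaScript):

```python
import urllib.request

def build_frame_request(jpeg_bytes, url="http://localhost:8000/analyse"):
    """Build the HTTP request that would carry one camera frame to the
    FastAPI backend. Returns a urllib Request object without sending it,
    so the wire format is easy to inspect.
    """
    return urllib.request.Request(
        url,
        data=jpeg_bytes,
        headers={"Content-Type": "image/jpeg"},
        method="POST",
    )

req = build_frame_request(b"\xff\xd8")  # a JPEG payload would go here
print(req.get_method(), req.full_url)
# -> POST http://localhost:8000/analyse
```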
Challenges we ran into
We ran into a few issues during the project. First, the text-to-speech pipeline did not always work correctly, sometimes producing duplicate outputs or failing entirely. We also hit hosting issues with Cloudflare on the Durham Wi-Fi network. One of our main challenges was estimating the distance of objects from the camera using only a relative-distance calculation algorithm.
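A common way to get relative distance from a single camera is the pinhole rule that an object's apparent size is inversely proportional to its distance: a bounding box half its calibration height is roughly twice as far away. A minimal sketch under that assumption (the calibration constant is illustrative, and this is one standard approach rather than Lumina's exact method):

```python
def relative_distance(bbox_height_px, ref_height_px=100.0):
    """Estimate distance relative to a calibration frame, assuming the
    pinhole-camera model where apparent height falls off as 1/distance.
    Returns 1.0 at the calibration distance, 2.0 at twice it, and so on.
    """
    if bbox_height_px <= 0:
        raise ValueError("bounding box height must be positive")
    return ref_height_px / bbox_height_px

print(relative_distance(100))  # -> 1.0 (at the calibration distance)
print(relative_distance(50))   # -> 2.0 (twice as far away)
```

This only works per object class (a person and a car of equal pixel height are at very different distances), which is part of what made the problem hard.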
Accomplishments that we're proud of
We are mainly proud of the smooth, simple user interface and the gesture controls for navigating the website. We are also very proud of the object-position prediction algorithm, which applies linear-algebra calculations to centroid position vectors to predict the future location of objects and warn the user if one is going to cross their path.
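The prediction step above can be sketched as constant-velocity extrapolation: take the displacement between the last two centroid positions as a velocity vector, project it forward, and check whether the predicted point lands in a central "corridor" of the frame standing in for the user's path. The corridor bounds and frame counts here are illustrative assumptions:

```python
def predict_crossing(p_prev, p_curr, frames_ahead, corridor_x=(200, 440)):
    """Linearly extrapolate a centroid from its last two positions.

    Returns the predicted (x, y) after frames_ahead frames, plus a flag
    saying whether that point falls inside the horizontal corridor band
    representing the user's walking path.
    """
    vx = p_curr[0] - p_prev[0]  # per-frame velocity from the last two frames
    vy = p_curr[1] - p_prev[1]
    fx = p_curr[0] + vx * frames_ahead
    fy = p_curr[1] + vy * frames_ahead
    lo, hi = corridor_x
    return (fx, fy), lo <= fx <= hi

# An object drifting rightwards at 30 px/frame, 5 frames ahead:
future, warn = predict_crossing((100, 240), (130, 240), frames_ahead=5)
print(future, warn)  # -> (280, 240) True: it will enter the user's path
```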
What we learned
We learned a great deal about text-to-speech algorithms, computer vision, convolutional neural networks and video-feed analysis.
