Inspiration
We were inspired to create Line of Sight to bridge the gap between simple navigation and true environmental awareness for visually impaired individuals. While existing tools often focus solely on getting from point A to point B, we wanted to build a companion that gives visually impaired users the freedom to explore cities independently with real-time recommendations and guidance. We believe this application can empower users to explore the world with greater confidence and independence.
What it does
Line of Sight is an intelligent visual assistant that "sees" the world for its user.
- Safety First: It uses real-time object detection to instantly alert users of immediate hazards like approaching cars or pedestrians.
- Environmental Context: The app captures live snapshots of the environment and uses Google's Gemini AI to provide a rich, conversational description of the surroundings.
- Location Awareness: It identifies the top 5 nearest Points of Interest (POIs) using OpenStreetMap data, helping users discover nearby businesses and landmarks.
- Natural Interaction: All conversational responses are voiced through Fish Audio's text-to-speech API, making the experience feel more personal to the user.
How we built it
We built Line of Sight as a cross-platform mobile application using Flutter.
- Vision & AI: We implemented YOLOv8 for high-speed, on-device object detection to ensure immediate safety alerts about dangers or obstacles. For deeper scene understanding, we integrated Google's Gemini API, which analyzes camera frames to generate conversational text.
- Location: We utilized the OpenStreetMap Overpass API to fetch real-time location data and filter for the most relevant nearby amenities.
- Audio: The text responses are synthesized into speech using the Fish Audio API, providing a high-quality auditory interface.
- Architecture: The app uses a split-screen interface to manage the camera feed and location data simultaneously, with a robust service layer handling API communications.
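As a rough illustration of the location step above: we query Overpass for amenity nodes around the user, then keep only the five nearest named results. The sketch below is Python rather than our Flutter/Dart code, and the helper names are illustrative, not from the codebase:

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def overpass_query(lat, lon, radius_m=300):
    """Overpass QL for amenity nodes near the user (POSTed to an Overpass endpoint)."""
    return f'[out:json];node(around:{radius_m},{lat},{lon})["amenity"];out;'


def top_nearest_pois(user_lat, user_lon, elements, limit=5):
    """Keep only the N nearest named POIs so the audio guidance stays concise.

    `elements` is the `elements` list from an Overpass JSON response; node
    results carry `lat`, `lon`, and a `tags` dict.
    """
    named = [e for e in elements if e.get("tags", {}).get("name")]
    named.sort(key=lambda e: haversine_m(user_lat, user_lon, e["lat"], e["lon"]))
    return named[:limit]
```

Capping the list at five is what keeps spoken output short: every result beyond that would add seconds of audio without helping the user orient.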
Challenges we ran into
- Camera & Layout Precision: One of our biggest technical hurdles was handling the camera aspect ratio. We had to debug issues where the camera feed was vertically stretched, which required strictly enforcing a specific aspect ratio and redesigning the UI into a split-screen layout to accommodate both the visual feed and user-accessible buttons.
- Data Overload: Filtering location data so that it was useful rather than overwhelming was also challenging. We iterated on the logic, strictly limiting detected Points of Interest to the five nearest locations to keep the audio guidance concise.
- Environment Configuration: Managing sensitive API keys across different services (Gemini, Fish Audio) required setting up a robust workflow to ensure that the keys were accessible to the Flutter app during local testing while preventing public leaks to the web.
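One common Flutter pattern for this (a sketch of the general approach, not necessarily our exact setup; the variable names are illustrative) is to keep keys in the local shell environment and inject them at build time, so they never land in the repository:

```shell
# Keys live in the local shell environment, never in source control.
# Flutter's --dart-define makes them available to the app at compile time;
# the Dart side reads them with String.fromEnvironment('GEMINI_API_KEY').
flutter run \
  --dart-define=GEMINI_API_KEY="$GEMINI_API_KEY" \
  --dart-define=FISH_AUDIO_API_KEY="$FISH_AUDIO_API_KEY"
```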
Accomplishments that we're proud of
- Seamless AI Integration: We successfully combined OpenStreetMap, Gemini, and Fish Audio APIs into a single, cohesive real-time experience.
- Safety Priority System: We're proud of the logic that prioritizes safety alerts over general descriptions, ensuring that if an obstacle is detected, the user is warned immediately, interrupting any other audio.
- Clean Architecture: Despite the complexity, we maintained a clean codebase by refactoring our analysis options and consolidating our Git repositories, ensuring a scalable foundation for future features.
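The safety-priority behavior can be sketched in a few lines. This is Python pseudocode of the concept, not our Dart service layer; the class and constant names are illustrative:

```python
import heapq
import itertools

SAFETY, DESCRIPTION = 0, 1  # lower number = higher priority


class AudioQueue:
    """Safety alerts preempt ambient descriptions: a new safety message
    drops any queued descriptions and is spoken first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # FIFO tiebreak within a priority level

    def push(self, priority, text):
        if priority == SAFETY:
            # Clear queued descriptions so the warning is heard immediately.
            self._heap = [item for item in self._heap if item[0] == SAFETY]
            heapq.heapify(self._heap)
        heapq.heappush(self._heap, (priority, next(self._counter), text))

    def next_utterance(self):
        """Return the next text to synthesize, or None if the queue is empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None
```

Dropping stale descriptions (rather than merely reordering them) is the key design choice: by the time a hazard warning has played, a queued scene description may no longer match what is in front of the user.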
What we learned
- Mobile Hardware Constraints: We learned a lot about the intricacies of mobile camera APIs and how critical aspect ratios are for both computer vision accuracy and user experience.
- Asynchronous State Management: Balancing real-time video streams with asynchronous API calls for location and audio generation taught us valuable lessons in Flutter state management.
- Hybrid AI: We discovered that combining a fast, local model for safety with a slower, more powerful cloud model for context is the optimal approach for real-time assistive tech.
What's next for Line of Sight
- Cross-platform support: the application currently runs on Android but can be ported to iOS through Flutter
- Personalized AI settings, such as changing the model's voice and adjusting how much detail the AI uses when describing the environment
- Configurable Points of Interest categories (museums, parks, etc.) so users receive personalized recommendations
- Additional accessibility features, such as haptic feedback and volume-button controls to start/stop recording