
  1. AI Architecture Explanation:

Inputs: Real-time video stream from the user's camera.

AI Capability: Object detection (computer vision).

Processing: The system uses the YOLOv8 model to analyze video frames, identifying people and common objects in real time.

Outputs: The system gives the user immediate, clear audio feedback (using pyttsx3), announcing the presence of people or objects to improve situational awareness for visually impaired individuals.
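The flow from camera frame to spoken announcement could look roughly like the sketch below. It is an illustration, not our exact code: the names `describe_detections` and `run`, the `yolov8n.pt` weights file, the 0.5 confidence value, and the "speak only when the description changes" rule are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical helper data for nicer speech output (assumption, not project code).
IRREGULAR_PLURALS = {"person": "people"}


def describe_detections(labels):
    """Build one short sentence from the class labels found in a frame."""
    counts = Counter(labels)  # preserves first-seen order
    parts = []
    for label, n in counts.items():
        noun = label if n == 1 else IRREGULAR_PLURALS.get(label, label + "s")
        parts.append(f"{n} {noun}")
    if not parts:
        return ""
    return "Detected: " + ", ".join(parts)


def run(camera_index=0, weights="yolov8n.pt", min_conf=0.5):
    """Capture frames, detect objects, and speak what was found."""
    # Imported lazily so describe_detections works without these packages.
    import cv2
    import pyttsx3
    from ultralytics import YOLO

    model = YOLO(weights)                 # pre-trained detection weights (assumed file)
    engine = pyttsx3.init()               # offline text-to-speech engine
    cap = cv2.VideoCapture(camera_index)  # default camera
    last_message = ""
    try:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            result = model(frame, conf=min_conf, verbose=False)[0]
            labels = [result.names[int(c)] for c in result.boxes.cls]
            message = describe_detections(labels)
            # Assumed behavior: speak only when the description changes,
            # so the same scene is not announced on every frame.
            if message and message != last_message:
                engine.say(message)
                engine.runAndWait()
            last_message = message
    finally:
        cap.release()
```

Calling `run()` starts the loop on the default camera. Note that the announcement only describes what is present, for example `describe_detections(["person", "chair", "person"])` gives `"Detected: 2 people, 1 chair"`, and never tells the user where to go.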

  2. Human-in-the-Loop Decision:

The Decision: The AI identifies objects and people, but it does NOT decide the user's navigation path or movement.

Why Human Involvement is Critical: Choosing a direction or navigating an environment is a life-critical decision. The AI acts only as an information assistant. The visually impaired user retains full agency to weigh the AI's feedback against the physical context, so every navigation decision is made by the human, not the algorithm.

  3. Responsible AI Guardrail:

Risk: Over-reliance (the risk that a user trusts the AI output completely and ignores hazards the AI might miss).

Mitigation: We implemented a confidence score threshold to reduce false positives. Furthermore, the system is designed to provide situational awareness rather than navigation instructions. We emphasize to users that this tool is a secondary aid, not a substitute for standard mobility aids like white canes or guide dogs.

AI Tools & Data

  1. AI Tools Used:

YOLOv8 (Ultralytics): Pre-trained model for real-time object detection (Free/Open Source).

pyttsx3: Offline text-to-speech engine (Free/Open Source).

OpenCV: For video stream capture and processing (Free).

Python: Core programming language (Free).

We used pre-trained YOLOv8 weights trained on the standard COCO dataset, a large, publicly available image dataset covering 80 common object categories. No private or sensitive user data was collected or used.
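To make the confidence score threshold from the guardrail concrete, here is a minimal sketch of the filtering step applied to labels from the pre-trained model. The function name `filter_detections`, the `(label, confidence)` pair format, and the 0.5 default are assumptions for illustration, not the values used in the project.

```python
def filter_detections(detections, min_conf=0.5):
    """Keep (label, confidence) pairs whose confidence is at least min_conf."""
    return [(label, conf) for label, conf in detections if conf >= min_conf]


# Example detections as (COCO label, confidence score) pairs.
detections = [("person", 0.91), ("chair", 0.32), ("dog", 0.58)]
kept = filter_detections(detections)
print(kept)  # [('person', 0.91), ('dog', 0.58)]
```

Raising `min_conf` cuts false announcements, but it also drops real objects the model is unsure about, which is one reason the tool stays a secondary aid.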
