Inspiration (The problem)
- Screens and software are built almost exclusively for keyboard-and-mouse users
- Existing screen readers are slow and inefficient for the real tasks visually impaired people need to complete
- Most systems force users to learn complex shortcuts instead of using natural interaction
- Current voice assistants are fragmented and can’t complete full workflows
- Goal: reduce the gap between human intent → computer action
What it does
Multi-Agent Architecture (What we built)
- Built a system of 5 working AI agents that handle different tasks:
- Shopping Agent - searches and compares products
- Research Agent - pulls web info and summarizes it
- Calendar Agent - reads and manages Google Calendar events
- General Agent - handles normal conversation
- Router Agent - decides which agent should respond
- Implemented a routing system (the brain)
- Takes user input → classifies intent → sends task to respective agent
- Built the system so no single model has to handle every task
- Designed specialized components that work together, which is exactly where existing solutions fall short
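The routing step can be sketched as a thin classifier sitting in front of the agent pool. In the real system Gemini performs the intent classification; the keyword table below is a hypothetical stand-in so the control flow is visible:

```python
# Hypothetical sketch: a keyword table stands in for the Gemini call
# that classifies intent in the real system.
KEYWORDS = {
    "shopping": ("buy", "price", "compare"),
    "research": ("look up", "summarize", "find"),
    "calendar": ("meeting", "schedule", "event"),
}

def route(utterance: str) -> str:
    """Classify the user's intent and return the agent that should respond."""
    text = utterance.lower()
    for agent, words in KEYWORDS.items():
        if any(word in text for word in words):
            return agent
    return "general"  # fallback: normal conversation
```

Because the fallback is the General Agent, any utterance the classifier cannot place still gets a conversational response instead of an error.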
Seamless Design (What users actually see)
- Users interact entirely through natural speech → no UI learning curve
- Built a live visual feedback system:
- Shows what the agent is doing in real time
- Displays navigation, cursor movement, and actions taken
- Shows system reasoning/decision flow
- Tested with real accessibility context:
- Worked with TLOS (Technology-Enhanced Learning and Online Strategies)
- Connected with Disability Alliance and Caucus
- Tentatively working with DisCoTec, the Disability Community Technology Center
- Tentatively working with Andrew Begel's lab VariAbility at Carnegie Mellon
- Tentatively working with disability studies professor Ashley Shew
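One simple way to drive the live feedback panel is to have each agent emit a small JSON event for every step it takes (navigation, cursor movement, reasoning notes). The schema below is a hypothetical sketch, not the system's actual wire format:

```python
import json

def action_event(agent: str, action: str, detail: str) -> str:
    """Serialize one agent step so the desktop UI can render it in
    real time. Field names here are illustrative placeholders."""
    return json.dumps({"agent": agent, "action": action, "detail": detail})
```

Streaming these events over the WebSocket channel lets the UI show what the agent is doing as it happens rather than only showing the final answer.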
Designed for Scale (Decisions made for maximum growth)
- Added context compression to handle long conversations efficiently
- Designed a modular architecture so new agents can be added easily
- Packaged as a desktop application for easy distribution
- Designed to be able to integrate external tools and APIs in the future
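The context-compression idea above can be sketched as collapsing older turns into a summary while keeping recent turns verbatim. In practice the summary would come from a model call; the placeholder below just records how many turns it dropped:

```python
def compress_context(turns: list[str], keep_last: int = 4) -> list[str]:
    """Hypothetical sketch of context compression: older turns collapse
    into a single summary line so long conversations stay within the
    model's context budget. A real summarizer would be a model call."""
    if len(turns) <= keep_last:
        return turns
    older = turns[:-keep_last]
    summary = f"[summary of {len(older)} earlier turns]"
    return [summary] + turns[-keep_last:]
```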
How we built it
- FastAPI backend for agent orchestration
- WebSocket system for real-time updates
- Deepgram for speech-to-text
- Gemini for routing + decision making
- ElevenLabs for voice output
- Desktop client for live interaction UI
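Glued together, the stack above forms one turn of the voice loop: Deepgram transcribes, Gemini decides, the chosen agent acts, and ElevenLabs speaks the reply. The functions below are hypothetical stubs for those services, so only the control flow is shown, not the real API calls:

```python
# Hypothetical end-to-end pipeline glue. In the real app, transcribe()
# wraps Deepgram STT, decide() wraps Gemini routing, and speak() wraps
# ElevenLabs TTS; here each is stubbed to make the flow visible.
def transcribe(audio: bytes) -> str:
    return audio.decode()  # stub for Deepgram speech-to-text

def decide(utterance: str) -> str:
    return "calendar" if "meeting" in utterance else "general"  # stub for Gemini

def speak(text: str) -> bytes:
    return text.encode()  # stub for ElevenLabs text-to-speech

def handle_turn(audio: bytes) -> tuple[str, bytes]:
    """Run one voice turn: audio in, (agent name, reply audio) out."""
    utterance = transcribe(audio)
    agent = decide(utterance)
    reply = f"[{agent}] handled: {utterance}"
    return agent, speak(reply)
```

In the actual app this loop runs behind the FastAPI backend, with the WebSocket channel pushing intermediate status to the desktop client.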
Challenges we ran into
- Keeping multiple agents coordinated without conflicts
- Ensuring tasks were delegated to the correct agent
- Maintaining context across long conversations
- Designing a system where each agent holds its own context and stores the information relevant to it
- Making routing decisions fast enough to be usable
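The per-agent context challenge above can be sketched as a store keyed by agent name, so a calendar follow-up never pollutes the shopping agent's history. Names here are illustrative, not the system's actual classes:

```python
from collections import defaultdict

class ContextStore:
    """Hypothetical sketch of per-agent context: each agent keeps its
    own conversation history, isolated from the other agents'."""

    def __init__(self) -> None:
        self._history: dict[str, list[str]] = defaultdict(list)

    def add(self, agent: str, turn: str) -> None:
        self._history[agent].append(turn)

    def get(self, agent: str) -> list[str]:
        return list(self._history[agent])
```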
Accomplishments we’re proud of
- Built a fully working multi-agent voice system, something many existing solutions fail to achieve
- Achieved real-time action visualization (not just chat output)
- Created a system that pairs advanced technical capability with coherent decisions and responses
- Designed the system for maximum growth and scalability
What we learned
- Multi-agent systems are powerful but require strong orchestration and edge case testing
- Routing is just as important as model capability
- Real-time feedback dramatically shifts the development direction
- Accessibility-first design changes how you think about UX
What’s next for OpenSight
- Add more specialized agents (email, travel, coding, etc.)
- Expand the app to cross-platform deployment
- Allow for plug-ins for third-party tools
- Move towards a fully autonomous task execution flow
Built With
- deepgram
- elevenlabs
- fastapi
- google-ai-studio
- google-calendar-api
- google-cloud
- google-gemini-api
- google-workspace
- python
- serpapi
- websockets