Inspiration
1.3 billion people worldwide live with some form of disability (WHO). Yet 97% of the top 1 million websites fail basic WCAG accessibility standards. Screen readers struggle with unlabeled images, custom widgets break keyboard navigation, and forms remain unusable for people with motor disabilities.
The traditional fix — asking every website owner to retrofit their code — is slow, expensive, and still leaves billions of pages behind. We asked a different question: what if AI could make any website accessible instantly, without changing a single line of the site's code?
Amazon Nova gave us the tools to answer that question. Four distinct AI services, each solving a different dimension of web accessibility, coordinated by a single orchestrator. That's NovaAccess.
What it does
NovaAccess is an AI-powered accessibility layer that sits on top of any website and makes it usable for people with visual, motor, and cognitive disabilities. It runs 4 specialized AI agents, each powered by a different Amazon Nova service, coordinated by a central Orchestrator:
| Agent | Nova Service | What It Does |
|---|---|---|
| Voice Navigator | Nova 2 Sonic | Full-duplex speech-to-speech web browsing — users speak commands and hear natural responses in real time via bidirectional WebSocket streaming |
| Page Explorer | Nova Act | Automates UI interactions — fills forms, clicks buttons, and navigates pages on behalf of users with motor disabilities |
| Content Interpreter | Nova 2 Lite | Runs WCAG compliance audits, generates alt-text for images, describes page layouts, and simplifies complex content for cognitive accessibility |
| Semantic Indexer | Nova Embeddings | Enables natural language element search — "Where is the login button?" finds UI elements using 1024-dimensional vector similarity |
Key features:
- Instant Accessibility Audit: Enter any URL and get a 0–100 AI accessibility score with detailed WCAG violation reports, severity breakdowns, and actionable fix recommendations
- Voice-First Web Browsing: Browse websites entirely by voice — "Go to amazon.com and find headphones under $50" — with real-time speech-to-speech powered by Nova 2 Sonic's bidirectional streaming
- AI-Powered Automation: Nova Act fills forms and clicks buttons for users who can't use a mouse, with a visual mock browser showing each step as it happens
- Semantic Element Search: Find any page element by describing it naturally — powered by Nova Embeddings vector similarity
- Multi-Agent Dashboard: Watch all 4 agents coordinate in real time, with status tracking, activity logs, and performance metrics
- Graceful Degradation: Three-tier fallback (Nova API → AI-powered fallback → demo simulation) ensures the app works for evaluation without any AWS setup
How we built it
Architecture — Multi-Agent Orchestration:
We designed an event-driven multi-agent system where each agent has a distinct responsibility. The Orchestrator (`src/lib/agents/orchestrator.ts`) receives user input (voice or text), classifies the intent into one of 11 categories (navigate, read, click, fill, search, describe, analyze, help, back, scroll, unknown), and routes to the appropriate agent(s). For compound tasks like "analyze this page for accessibility," the Orchestrator triggers both the Content Interpreter and Semantic Indexer in parallel.
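The routing described above can be sketched as a keyword classifier. The 11 intent names come straight from the writeup; `classifyIntent`, `agentsFor`, and the keyword lists below are illustrative assumptions, not the actual orchestrator code:

```typescript
// Illustrative sketch of the Orchestrator's intent routing (keyword
// heuristics and agent names are assumptions for illustration).
type Intent =
  | "navigate" | "read" | "click" | "fill" | "search" | "describe"
  | "analyze" | "help" | "back" | "scroll" | "unknown";

const KEYWORDS: Partial<Record<Intent, string[]>> = {
  navigate: ["go to", "open", "visit"],
  read: ["read"],
  click: ["click", "press", "tap"],
  fill: ["fill", "type", "enter"],
  search: ["find", "where is", "search"],
  describe: ["describe", "what is on"],
  analyze: ["analyze", "audit", "accessibility"],
  help: ["help", "what can you do"],
  back: ["back", "previous page"],
  scroll: ["scroll"],
};

function classifyIntent(input: string): Intent {
  const text = input.toLowerCase();
  for (const [intent, phrases] of Object.entries(KEYWORDS) as [Intent, string[]][]) {
    if (phrases.some((p) => text.includes(p))) return intent;
  }
  return "unknown";
}

// Compound tasks fan out to multiple agents in parallel.
function agentsFor(intent: Intent): string[] {
  switch (intent) {
    case "analyze":
      return ["content-interpreter", "semantic-indexer"]; // run in parallel
    case "click":
    case "fill":
      return ["page-explorer"];
    case "search":
      return ["semantic-indexer"];
    default:
      return ["voice-navigator"];
  }
}
```

A real classifier would likely use an LLM call rather than keyword matching, but the routing shape is the same: one intent in, one or more agents out.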
Frontend — Next.js 16 + React 19:
Built with the App Router, Server Components by default, and Client Components only for interactive elements (voice input, agent panels, mock browser). The dashboard uses a route group layout (`(dashboard)/`) with a persistent sidebar. UI components from shadcn/ui with Tailwind CSS 4, featuring animated score rings, real-time agent status panels, and a mock browser component that visually demonstrates automation steps.
Nova 2 Sonic — Voice Navigator:
We built a dedicated WebSocket bridge server (`sonic-server.ts`) that connects browser microphones to Nova 2 Sonic's `InvokeModelWithBidirectionalStream` API. The server handles:
- Bidirectional audio streaming (16kHz PCM input, 24kHz LPCM output)
- A custom `fetch_webpage` tool that lets Sonic read actual page content on demand — when a user asks about a page, Sonic calls this tool, the server fetches the HTML, extracts the text, and returns it as a tool result
- Page context injection — the current URL's content is included in the system prompt so Sonic can answer questions about the page
- Session management with automatic reconnection for the 8-minute WebSocket limit
- Concurrent client support (up to 20 simultaneous streams)
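The startup handshake behind these bullets can be sketched as an ordered event sequence. This is a simplified sketch only: the event names match the sequencing described in this writeup, but the payload fields shown are assumptions, and the real Sonic protocol events carry more structure than this:

```typescript
// Simplified sketch of the startup events the bridge server sends before
// streaming microphone audio. Field names are illustrative assumptions.
type SonicEvent = { type: string; payload: Record<string, unknown> };

function buildSessionStartupEvents(systemPrompt: string): SonicEvent[] {
  return [
    // 1. Open the session with inference settings.
    { type: "sessionStart", payload: { inferenceConfig: { maxTokens: 1024 } } },
    // 2. Declare the audio output format (24 kHz LPCM, per the bullet above).
    { type: "promptStart", payload: { audioOutput: { sampleRateHz: 24000, encoding: "lpcm" } } },
    // 3. Inject the system prompt, including current page context.
    { type: "systemPrompt", payload: { text: systemPrompt } },
    // 4. audioInput events follow, one per 16 kHz PCM chunk (see below).
  ];
}

function audioInputEvent(pcmChunk: Buffer): SonicEvent {
  // Binary audio is base64-encoded for transport over the event stream.
  return { type: "audioInput", payload: { audio: pcmChunk.toString("base64") } };
}
```

Getting this ordering wrong is one of the failure modes described in the Challenges section below; the bridge server's job is largely to enforce it while multiplexing many clients.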
Nova 2 Lite — Content Interpreter: Called via the Bedrock Converse API for four distinct tasks: (1) WCAG accessibility auditing with structured JSON output (score, issues by severity, element selectors, auto-fix suggestions), (2) natural language page descriptions for visually impaired users, (3) alt-text generation under 125 characters following WCAG guidelines, and (4) content simplification to ~6th-grade reading level for cognitive accessibility.
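As a rough illustration of how the dashboard could consume the structured audit output from task (1), here is a hypothetical report shape and a severity-breakdown helper. Every field name here is an assumption for illustration, not the project's actual schema:

```typescript
// Hypothetical shape of the structured WCAG audit JSON returned by the
// Content Interpreter (field names are illustrative assumptions).
interface AuditIssue {
  severity: "critical" | "serious" | "moderate" | "minor";
  wcagCriterion: string; // e.g. "1.1.1 Non-text Content"
  selector: string;      // CSS selector of the offending element
  fix: string;           // suggested auto-fix
}

interface AuditReport {
  score: number;         // 0-100 accessibility score
  issues: AuditIssue[];
}

// Group issues by severity for the dashboard's breakdown panel.
function severityBreakdown(report: AuditReport): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const issue of report.issues) {
    counts[issue.severity] = (counts[issue.severity] ?? 0) + 1;
  }
  return counts;
}
```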
Nova Embeddings — Semantic Indexer:
We extract text elements from web pages (headings, paragraphs, links, buttons, labels), generate 1024-dimensional embeddings via InvokeModel with the schemaVersion/taskType/singleEmbeddingParams schema, and store them in an in-memory vector index. Users can then search page content with natural language using cosine similarity — "Where is the search bar?" returns the most semantically relevant element.
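The search step reduces to cosine similarity over the stored vectors. A small self-contained sketch of that step, using short vectors in place of the 1024-dimensional embeddings (the `nearestElement` helper and its index shape are illustrative, not the project's actual index):

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Linear scan over the in-memory index: return the selector of the
// element whose embedding is closest to the query embedding.
function nearestElement(
  query: number[],
  index: { selector: string; embedding: number[] }[],
): string {
  let best = index[0].selector;
  let bestScore = -Infinity;
  for (const entry of index) {
    const score = cosineSimilarity(query, entry.embedding);
    if (score > bestScore) {
      bestScore = score;
      best = entry.selector;
    }
  }
  return best;
}
```

A linear scan is fine at page scale (tens to hundreds of elements); a persistent store like the DynamoDB-backed index planned below would warrant an approximate-nearest-neighbor structure instead.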
Nova Act — Page Explorer: Integrated via the Nova Act Python SDK as a sidecar service. The agent receives natural language instructions ("fill in the email field with john@example.com") and executes them as browser automation workflows. When Nova Act is unavailable, a smart fallback uses Nova 2 Lite to analyze the page and generate automation steps, ensuring the feature always works.
Resilience — Three-Tier Fallback: Every agent implements three levels of degradation: (1) Real Nova API call, (2) AI-powered fallback using Nova 2 Lite for intelligent simulation, (3) Hardcoded demo responses. This means the app produces realistic, functional output even without AWS credentials — critical for judge evaluation.
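The tier names come from the writeup; a minimal sketch of the pattern, with an illustrative `withFallback` helper (the real agents wire this up per feature):

```typescript
// Three-tier degradation: real Nova call, then AI-simulated fallback,
// then a canned demo response. Helper name and shape are illustrative.
type Tier = "nova" | "ai-fallback" | "demo";

async function withFallback<T>(
  callNova: () => Promise<T>,     // tier 1: real Nova API call
  aiFallback: () => Promise<T>,   // tier 2: Nova 2 Lite simulation
  demoResponse: T,                // tier 3: hardcoded demo output
): Promise<{ tier: Tier; result: T }> {
  try {
    return { tier: "nova", result: await callNova() };
  } catch {
    try {
      return { tier: "ai-fallback", result: await aiFallback() };
    } catch {
      return { tier: "demo", result: demoResponse };
    }
  }
}
```

Returning the tier alongside the result lets the UI label simulated output honestly instead of passing it off as a live API response.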
Challenges we ran into
- Nova 2 Sonic bidirectional streaming — Building the WebSocket bridge server that correctly multiplexes audio streams between browser clients and Bedrock required careful handling of binary/base64 audio encoding, event sequencing (sessionStart → promptStart → systemPrompt → audioInput), and concurrent connection management
- Sonic tool use integration — Making Nova 2 Sonic call a `fetch_webpage` tool to read actual page content during voice conversations required implementing the full tool-use protocol (toolUse event → fetch page → toolResult response) within the bidirectional stream
- Nova Act being Python-only while our app is TypeScript — solved with a FastAPI sidecar service bridged via HTTP, plus a smart Nova 2 Lite fallback that generates automation steps from page analysis
- Nova Embeddings request format — Uses a unique schema (`schemaVersion`, `taskType`, `singleEmbeddingParams`) that differs from other Bedrock models, requiring careful documentation review
- Coordinating 4 agents with different response times — the Orchestrator needed event-driven design with status streaming so the UI stays responsive while slower agents (like Nova Act browser automation) complete their work
- Making accessibility accessible — Ensuring our own tool meets WCAG standards (skip-to-content links, keyboard navigation, ARIA labels, accessible color contrasts) while helping users evaluate other sites
Accomplishments that we're proud of
- Deep integration of ALL 4 Amazon Nova services — not just checking boxes, but each service solves a fundamentally different accessibility challenge (voice, automation, reasoning, search)
- Nova 2 Sonic with tool use — Our voice navigator can actually read web pages during conversation by calling a `fetch_webpage` tool through Sonic's bidirectional stream, enabling truly informed voice-based web browsing
- Multi-agent orchestration with intent classification across 11 command types, routing to the right combination of agents
- Three-tier fallback system — Every feature works in demo mode without AWS credentials, with realistic AI-powered simulation, making it easy for judges to evaluate the full UX without any setup
- Mock browser component — Visual automation demos with animated cursors, typing effects, and page transitions that show exactly what Nova Act does step-by-step
- Production-quality UI — Dark mode design with animated score rings, real-time agent status panels, streaming responses, and responsive layouts
- Real community impact — Addressing a problem affecting 1.3 billion people globally, with a solution that works on any website without requiring site owners to change anything
What we learned
- Nova 2 Sonic's tool use capability is a game-changer — the model can call functions mid-conversation to fetch real data, making voice assistants that are actually grounded in current information
- Bidirectional streaming creates remarkably natural voice experiences — the prosody adaptation makes it feel genuinely conversational, and the low latency enables real-time interaction
- Nova Act's natural language approach to browser automation is uniquely powerful for accessibility — describing what you want to do ("fill in my email") is far more accessible than writing CSS selectors
- Multimodal embeddings enable surprisingly accurate semantic search across page elements — users can find buttons and links just by describing what they're looking for
- Building for accessibility forces deeper UX thinking — every interaction needs voice, keyboard, and visual pathways, which makes the product better for everyone
What's next for NovaAccess
- Browser extension — Inject the accessibility layer directly into any page as a Chrome/Firefox extension, no separate app needed
- Persistent vector store — DynamoDB-backed knowledge base that remembers frequently visited sites, user preferences, and learned accessibility patterns
- Multi-language voice support — Nova 2 Sonic supports French, German, Spanish, and Italian via polyglot voices (tiffany and matthew)
- Extended thinking — Leverage Nova 2 Lite's reasoning mode (`reasoningConfig`) for complex multi-page accessibility audits
- WCAG compliance certificates — Exportable PDF reports for enterprise compliance teams
- AWS Amplify deployment — Production hosting with Cognito authentication and S3 storage for page snapshots
Built With
- amazon-bedrock
- amazon-nova-2-lite
- amazon-nova-2-sonic
- amazon-nova-act
- amazon-nova-multimodal-embeddings
- aws-sdk-v3
- fastapi
- next.js-16
- python
- react-19
- shadcn/ui
- tailwind-css-4
- typescript
- websocket