VoiceNavAI – Serverless Voice-Controlled Accessibility

Web accessibility tools often require screen-reader expertise or browser extensions that users can’t install on shared machines. We wanted a friction-free, speech-only layer that works on any single-page app, costs (almost) nothing to run, and shows how far AWS’s serverless stack—and Bedrock—can go in democratizing access.

What it does

Record voice in the browser.
Upload clip to S3.
Lambda #1 – StoreConn-Voice ($connect / $disconnect routes) saves & prunes WebSocket connectionIds in DynamoDB.
Lambda #2 – Transcribe Trigger: an S3 event starts an Amazon Transcribe async job.
Transcribe drops JSON in transcribe-output/ → Lambda #3 – Bedrock Intent runs.
Bedrock (Claude-3 Sonnet) returns an intent such as {"action":"click","selector":"#nav-book"}.
Lambda #3 fans out the intent via the API Gateway WebSocket to the open tabs.
Front-end JavaScript, served from Amplify hosting, clicks, navigates, or inputs text for hands-free control.
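
As a rough illustration of the Transcribe Trigger step, the sketch below turns an S3 upload event into the arguments for Transcribe's start_transcription_job call. It is not our exact handler: the en-US language code, the job-naming scheme, and writing output to the same bucket as the upload are assumptions made for the example, and the transcribe_client parameter exists only so the function can be exercised without AWS.

```python
import re
import urllib.parse
import uuid

OUTPUT_PREFIX = "transcribe-output/"  # prefix named in the write-up


def build_transcribe_request(s3_event, language_code="en-US"):
    """Turn the first record of an S3 put event into StartTranscriptionJob kwargs."""
    record = s3_event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys in S3 events are URL-encoded (spaces arrive as '+').
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    stem = key.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    # Job names may only contain letters, digits, '.', '_' and '-'.
    safe_stem = re.sub(r"[^0-9a-zA-Z._-]", "-", stem)
    job_name = f"{safe_stem}-{uuid.uuid4().hex[:8]}"
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "LanguageCode": language_code,  # assumed; not stated in the write-up
        "OutputBucketName": bucket,  # assumed: same bucket as the upload
        "OutputKey": f"{OUTPUT_PREFIX}{job_name}.json",
    }


def handler(event, context, transcribe_client=None):
    """Lambda entry point; the client can be injected for local testing."""
    if transcribe_client is None:
        import boto3  # provided by the Lambda Python runtime

        transcribe_client = boto3.client("transcribe")
    request = build_transcribe_request(event)
    transcribe_client.start_transcription_job(**request)
    return {"job": request["TranscriptionJobName"]}
```

If uploads and transcripts share one bucket as assumed here, the S3 trigger for this function should be filtered to the upload prefix so that transcript files do not start new jobs.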

How we built it
Three Python 3.12 Lambdas:

StoreConn-Voice (WS connect/disconnect)
Transcribe Trigger (start job)
Bedrock Intent (parse transcript → Bedrock → broadcast)
Minimal SPA: plain HTML/JS; no front-end framework.
Prompt-engineered Bedrock to output strict JSON and whitelisted selectors.
Live-reloading Amplify hosting for rapid UX tweaking.
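
The strict-JSON prompt and selector whitelist can be sketched roughly as follows. This is a simplified illustration rather than our production prompt: the selector names, alias entries, and action list are hypothetical placeholders, and the Bedrock call itself is left out so the example runs on its own.

```python
import json

# Hypothetical whitelist, aliases, and actions; the real values are not shown here.
ALLOWED_SELECTORS = {"#nav-book", "#nav-home", "#search-input"}
SELECTOR_ALIASES = {"#book": "#nav-book", "#home": "#nav-home"}
ALLOWED_ACTIONS = {"click", "navigate", "input"}

# Literal JSON braces are doubled so str.format leaves them alone;
# only {selectors} and {transcript} are substituted.
PROMPT_TEMPLATE = (
    "Return ONLY one JSON object, with no prose, shaped like "
    '{{"action":"click","selector":"#nav-book"}}.\n'
    "Allowed selectors: {selectors}\n"
    "User said: {transcript}"
)


def build_prompt(transcript):
    """Fill the template with the whitelist and the transcribed command."""
    return PROMPT_TEMPLATE.format(
        selectors=", ".join(sorted(ALLOWED_SELECTORS)), transcript=transcript
    )


def parse_intent(model_text):
    """Return a validated intent dict, or None if the reply is unusable."""
    try:
        intent = json.loads(model_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(intent, dict):
        return None
    action = intent.get("action")
    selector = intent.get("selector")
    if not isinstance(action, str) or not isinstance(selector, str):
        return None
    # Map near-miss selectors onto real ones before checking the whitelist.
    selector = SELECTOR_ALIASES.get(selector, selector)
    if action not in ALLOWED_ACTIONS or selector not in ALLOWED_SELECTORS:
        return None
    return {**intent, "selector": selector}
```

Validating the reply in code means a hallucinated selector is dropped instead of being broadcast to the page, whatever the prompt says.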

Challenges we ran into
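Unescaped {} in a Python .format prompt template killed Bedrock calls (KeyError "action"); the literal JSON braces had to be escaped.
API Gateway WS management URL vs. client WSS URL: mixing them up produced 404s with a doubled “@connections” in the path.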
Bedrock sometimes hallucinated selectors; solved with an alias map and stricter prompt.
Transcribe async adds ~45 s latency; streaming wasn’t available in free tier.
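
For the management-URL mix-up, a small helper along these lines avoids the doubled “@connections”. It is a sketch with a made-up API id and stage, based on the fact that the boto3 apigatewaymanagementapi client appends /@connections/{connectionId} to its endpoint_url on its own, so the endpoint should be just the https form of the client URL.

```python
from urllib.parse import urlsplit


def management_endpoint(client_wss_url):
    """Map a client wss:// URL to the https:// endpoint for the management API."""
    parts = urlsplit(client_wss_url)
    if parts.scheme != "wss":
        raise ValueError("expected a wss:// URL")
    path = parts.path.rstrip("/")
    # The SDK adds /@connections/{connectionId} itself, so strip it if present.
    marker = "/@connections"
    if marker in path:
        path = path[: path.index(marker)]
    return f"https://{parts.netloc}{path}"


# Hypothetical usage inside a Lambda:
#   client = boto3.client("apigatewaymanagementapi",
#                         endpoint_url=management_endpoint(WSS_URL))
#   client.post_to_connection(ConnectionId=connection_id, Data=payload)
```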

Accomplishments we’re proud of

Zero servers—full pipeline idles at $0.18/mo (S3 + Dynamo storage).
Works on any SPA without code changes—just drop in app.js.
Live demo navigates between tabs and fills forms with voice only.
Added TTL pruning → no zombie WebSocket IDs, no manual clean-up.
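
The TTL pruning can be sketched like this. The table name, the expiresAt attribute name, and the two-hour lifetime are assumptions for the example (two hours matches API Gateway's maximum WebSocket connection duration); the table parameter exists only so the handler can be run without AWS. DynamoDB TTL expects the attribute to hold a Unix epoch time in seconds and must be enabled on the table for that attribute.

```python
import time

TABLE_NAME = "VoiceNavConnections"  # hypothetical table name
TTL_SECONDS = 2 * 60 * 60  # assumed lifetime for a stored connection


def connection_item(connection_id, now=None):
    """Build the DynamoDB item, including the epoch-seconds TTL attribute."""
    now = int(time.time()) if now is None else int(now)
    return {"connectionId": connection_id, "expiresAt": now + TTL_SECONDS}


def handler(event, context, table=None):
    """Handle $connect / $disconnect; the table can be injected for testing."""
    ctx = event["requestContext"]
    if table is None:
        import boto3  # provided by the Lambda Python runtime

        table = boto3.resource("dynamodb").Table(TABLE_NAME)
    if ctx["routeKey"] == "$connect":
        table.put_item(Item=connection_item(ctx["connectionId"]))
    elif ctx["routeKey"] == "$disconnect":
        table.delete_item(Key={"connectionId": ctx["connectionId"]})
    return {"statusCode": 200}
```

TTL deletion is not instantaneous, so the explicit delete on $disconnect still matters; TTL is the safety net for connections that never send a clean disconnect.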

What we learned

Bedrock’s Claude-3 is shockingly good at structured JSON if you remind it in every prompt.
API Gateway WebSockets + Dynamo TTL = nearly effortless real-time fan-out.
Small UX touches (on-screen log overlay, auto-WS reconnect) make demos rock-solid.
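
The fan-out itself is only a short loop. The sketch below keeps AWS out of the picture: send is any callable that delivers bytes to one connection, and StaleConnection is a hypothetical exception introduced for the example. In a real Lambda, send would wrap the apigatewaymanagementapi client's post_to_connection call and translate its GoneException into StaleConnection.

```python
import json


class StaleConnection(Exception):
    """Hypothetical signal that a connectionId is no longer open."""


def fan_out(send, connection_ids, intent):
    """Send the intent to every connection; return the ids that turned out stale."""
    payload = json.dumps(intent).encode("utf-8")
    stale = []
    for connection_id in connection_ids:
        try:
            send(connection_id, payload)
        except StaleConnection:
            # Collect the id so the caller can delete it from the table.
            stale.append(connection_id)
    return stale
```

Returning the stale ids lets the caller remove them right away instead of waiting for TTL to expire them.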

What’s next for VoiceNavAI
Transcribe Streaming to cut first-response time to <3 s.
Multilingual commands (update prompt + language-auto-detect).
ARIA role introspection: auto-generate selector whitelist per page.
Chrome extension wrapper to inject app.js on any site, no dev changes required.

Built With

amazon-amplify
amazon-dynamodb
amazon-web-services
amplify
api
apigateway
aws-cdk
bedrock
css
html
iam
javascript
lambda
python
s3
transcribe

Submitted to

AWS Lambda Hackathon

Created by

Worked on the requirements and process, ensuring that the product is fit for purpose for the target end users' needs.

Bonaventure Okhuoya
I served as our project coordinator + lead developer:
• Team orchestration – set up our Teams + GitHub workflow, ran daily stand-ups, and kept the roadmap on track.
• Back-end lead – contributed to most of the Python for the three Lambda functions (connection store, Transcribe trigger, Bedrock intent router) and their unit tests.
• AWS infrastructure – stood up S3, DynamoDB, API Gateway (WebSocket), and IAM roles/policies.
• Networking & troubleshooting – resolved CORS / WebSocket 4xx issues and optimized cold-start times.
• Code review & mentoring – reviewed every PR, paired with teammates on front-end integration, and documented the architecture.

While I handled the core serverless plumbing, every teammate shaped features, UX, and demo polish—this was very much a group win.

Michael Uanikehi
DevOps and cloud infrastructure. Collaborating on and building solutions that scale and serve the social good.
I worked on sections of the system design, deployed the resources, and wrote the backend logic for the API Gateway and Lambda function that process the input.

William Obiana
Isreal Urephu
Habeeb Aminu

Updates

Michael Uanikehi started this project — Jun 30, 2025 07:03 AM EDT
