CyBuddy

Inspiration

In 2007, a 16-year-old in Andhra Pradesh, India wanted to study science after his Class 10 board exams. India's Central Board of Secondary Education refused — blind students, the rule said, could only study arts. He fought the board, won the right to study STEM, and topped his class. When he applied to IIT, BITS, and every other major Indian engineering institute, every single one rejected him because he was blind.

His name is Srikanth Bolla. He went on to become the first international blind student at MIT Sloan, founded Bollant Industries in 2012 (which now employs hundreds of people with disabilities and was backed by Ratan Tata), and was mentored as a teenager by Dr. APJ Abdul Kalam — the late former President of India and one of the country's most celebrated scientists.

Srikanth's own words on the rejections cut deep:

"If IIT didn't want me, I didn't want IIT either."

Srikanth made it to MIT. Most students like him don't — and the barriers aren't just in elite institutions, they're in the small daily friction of campus life. Reading a dense Canvas page when you have dyslexia. Catching the right CyRide when the screen at the stop is too small to read. Parsing a wall of professor emails when ADHD makes it impossible to find the one that actually has a deadline. Juggling Canvas, Outlook, CyRide, and Workday — four different apps, four different logins, four different ways of being overwhelmed.

Students aren't failing campus. Campus is failing students.

I built CyBuddy for everyone Srikanth's story stands for, and for every Iowa State student who just wants their day to be less hard.

What it does

CyBuddy is a voice-first, screen-reader-native campus companion built on three accessibility surfaces:

1. Just ask Cy. Tap the floating mic on any tab. Cy answers questions grounded in your real data — "What's due this week?", "Anything important in my inbox?", "What CyRide buses are running right now?", "What do I need to do in Workday?" The backend pulls live context from Canvas, Outlook, CyRide, and Workday in parallel before every reply, so Cy never makes things up.

2. Vision. A dedicated tab with a live camera preview and an auto-describe loop. Point your phone at anything — a CyRide bus pulling up, a syllabus page, a building entrance — and Cy describes it out loud every few seconds. Tap Start to begin, Stop when done, or Describe once for a single look. Built for blind and low-vision students who currently get pushed into "observer" roles in lab sections and out of independent campus navigation.

3. Plain-English mode. Inside the voice assistant, ask Cy to explain a confusing email or syllabus and Cy rewrites it in clear, scannable language — built for dyslexia, ADHD, and ESL students.

Every screen, every button, every state has accessibilityLabel and accessibilityHint props for iOS VoiceOver and Android TalkBack. The Vision description panel uses an accessibilityLiveRegion so screen-reader users hear new descriptions announced automatically without re-focusing.
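A minimal sketch of how those props might be assembled, written as plain TypeScript so it runs without React Native installed. The prop names are React Native's; the helper functions, labels, and hint strings are hypothetical and not taken from the project. Note that React Native documents accessibilityLiveRegion as an Android prop, so an iOS build would likely also need something like AccessibilityInfo.announceForAccessibility to get the same effect under VoiceOver.

```typescript
// Hypothetical helpers: only the prop names come from React Native.
interface ButtonA11yProps {
  accessible: true;
  accessibilityRole: "button";
  accessibilityLabel: string; // what the control is
  accessibilityHint: string;  // what happens when it is activated
}

function buttonA11y(label: string, hint: string): ButtonA11yProps {
  return {
    accessible: true,
    accessibilityRole: "button",
    accessibilityLabel: label,
    accessibilityHint: hint,
  };
}

interface LiveRegionProps {
  accessibilityLiveRegion: "none" | "polite" | "assertive";
  accessibilityLabel: string;
}

// "polite" asks the screen reader to announce changes without interrupting
// whatever it is currently speaking.
function descriptionPanelA11y(latestDescription: string): LiveRegionProps {
  return {
    accessibilityLiveRegion: "polite",
    accessibilityLabel: latestDescription,
  };
}

// Assumed label and hint text for the floating mic button.
const micProps = buttonA11y("Ask Cy", "Starts listening for your question");
// In a component this would be spread onto the control, for example:
//   <Pressable {...micProps} onPress={startListening} />
```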

How I built it

Frontend — React Native (Expo SDK 54), TypeScript, expo-router. Tab navigation with a floating voice button that's reachable from every screen. Live camera via expo-camera, image resizing via expo-image-manipulator (downscaled to 1024px JPEG before upload to keep vision-token cost down), audio playback via expo-av.
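The 1024px downscale can be sketched as a small pure function. This assumes "1024px" means the longest edge is capped at 1024 while preserving aspect ratio, which the writeup does not spell out; the function name and Size type are hypothetical.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale a frame so its longest edge is at most maxEdge, keeping aspect ratio.
// Frames that are already small enough are returned unchanged.
function fitWithin(original: Size, maxEdge: number = 1024): Size {
  const longest = Math.max(original.width, original.height);
  if (longest <= maxEdge) {
    return { width: original.width, height: original.height };
  }
  const scale = maxEdge / longest;
  return {
    width: Math.round(original.width * scale),
    height: Math.round(original.height * scale),
  };
}

// A 4032x3024 camera frame would be resized to 1024x768 before upload.
const target = fitWithin({ width: 4032, height: 3024 });
```

The resulting width and height would then be handed to expo-image-manipulator as a resize action, with JPEG chosen as the save format.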

Voice reasoning — Moonshot Kimi K2.6 on Cloudflare Workers AI, called through a personal proxy worker so the API token never ships in the mobile bundle. K2.6 is a frontier-scale reasoning model with 262k context, multi-turn tool calling, and vision input — overkill for "what's due tomorrow," exactly right for "rewrite this dense paragraph for me."

Vision — Same Kimi K2.6 model, called multimodally with image_url content blocks. The backend resizes the image as a safety measure, and the system prompt is tuned for "describe to a blind student — focus on what helps them act, read any visible text aloud verbatim."
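A rough sketch of what a multimodal request body with image_url content blocks could look like. It assumes the proxy accepts OpenAI-style chat messages and a base64 data URL; the function name, default question, and the exact wording of the system prompt are placeholders, not the project's own.

```typescript
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatMessage {
  role: "system" | "user";
  content: string | ContentBlock[];
}

// Paraphrase of the prompt described above, not the author's exact text.
const VISION_SYSTEM_PROMPT =
  "Describe this to a blind student. Focus on what helps them act, " +
  "and read any visible text aloud verbatim.";

// Build the messages array for one camera frame (already resized to JPEG).
function buildVisionMessages(
  jpegBase64: string,
  question: string = "What is in front of me?"
): ChatMessage[] {
  return [
    { role: "system", content: VISION_SYSTEM_PROMPT },
    {
      role: "user",
      content: [
        { type: "text", text: question },
        {
          type: "image_url",
          image_url: { url: "data:image/jpeg;base64," + jpegBase64 },
        },
      ],
    },
  ];
}
```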

Speech-to-text — Whisper Large v3 Turbo on Groq (~164× realtime, sub-second transcripts).

Text-to-speech — @cf/myshell-ai/melotts on Cloudflare Workers AI, also through the proxy worker. Free tier, no per-voice paywall.

Backend — Node.js + TypeScript + Express. OAuth wrappers for Microsoft Graph (Outlook), Canvas LMS, Workday, plus a live CyRide feed. JWT auth with a graceful auto-logout when stored tokens stop verifying.
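On the client side, the graceful auto-logout could be wired roughly as follows. This is a sketch under assumptions: the names setAuthFailureHandler and handleResponseStatus are hypothetical, and the real app calls its handler from an API interceptor rather than a bare function.

```typescript
type AuthFailureHandler = () => void;

// Registered once by the auth layer; null until the app has mounted.
let authFailureHandler: AuthFailureHandler | null = null;

function setAuthFailureHandler(handler: AuthFailureHandler | null): void {
  authFailureHandler = handler;
}

// Called for every API response. A 401 or 403 means the stored token no
// longer verifies, so the app should fall back to the login screen.
function handleResponseStatus(status: number): void {
  if ((status === 401 || status === 403) && authFailureHandler !== null) {
    authFailureHandler();
  }
}

// Example wiring: flip a flag that the router watches.
const authState = { isAuthenticated: true };
setAuthFailureHandler(() => {
  authState.isAuthenticated = false;
});
```

Keeping the handler as a settable callback means the networking code does not need to import the auth state directly.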

Auth context grounding — every voice query triggers a parallel Promise.allSettled over Outlook (top important emails + upcoming events), Canvas (upcoming assignments), CyRide (live active routes), and Workday (notifications + action items). The result is formatted into a === LIVE CONTEXT === block injected into the system prompt before Kimi K2.6 sees the user's question.
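The grounding step can be sketched as below. The fetcher functions, the one-line-per-source layout, and the "unavailable" fallback are assumptions for illustration; only the use of Promise.allSettled and the === LIVE CONTEXT === header come from the description above.

```typescript
type Source = "Outlook" | "Canvas" | "CyRide" | "Workday";

// Turn settled results into a text block for the system prompt.
// A failed source is reported as unavailable instead of failing the reply.
function formatLiveContext(
  sources: Source[],
  results: PromiseSettledResult<string>[]
): string {
  const lines = results.map((result, i) =>
    result.status === "fulfilled"
      ? sources[i] + ": " + result.value
      : sources[i] + ": unavailable"
  );
  return ["=== LIVE CONTEXT ==="].concat(lines).join("\n");
}

// Run every fetcher in parallel. allSettled never rejects, so one slow or
// broken integration cannot block the other three.
async function gatherLiveContext(
  fetchers: Record<Source, () => Promise<string>>
): Promise<string> {
  const sources = Object.keys(fetchers) as Source[];
  const results = await Promise.allSettled(sources.map((s) => fetchers[s]()));
  return formatLiveContext(sources, results);
}
```

Using allSettled rather than Promise.all is the key design choice here: Promise.all would reject the whole batch as soon as any single source failed.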

Challenges I ran into

Kimi K2.6 is a reasoning model. First voice replies came back with content: null and finish_reason: "length" — the model burned all 220 output tokens on internal reasoning_content before writing a single visible word. Bumped max_tokens to 2048 for chat and 4096 for vision. The reasoning is invisible to the user; it just needs headroom.
ElevenLabs free tier blocks library voices via API (paid_plan_required). Built the whole TTS path against ElevenLabs Rachel before discovering this. Pivoted to Cloudflare's melotts through the existing AI proxy worker — same architecture, free, decent quality, works for every demo.
JWT secret rotation silently broke the dashboard. When I introduced an .env file mid-build, every stored token went stale and every authenticated endpoint returned 403. The mobile app just sat there with a blank dashboard. Fixed by wiring an authFailureHandler callback that the API interceptor calls on 401/403 to flip isAuthenticated back to false, so the app actually routes to the login screen.
Expo Go can't load Picovoice — the wake-word native module needs a custom dev client, which would have eaten too much of my remaining hackathon time. Shipped without "Hey Cy" wake word; the voice assistant is one tap away from every tab via a floating mic, and Vision auto-runs in a continuous loop. Wake word stays in the codebase, ready for the standalone build.
The mock data leaked. First voice replies contained "this is sample data because Microsoft auth isn't set up yet" — Kimi was reading my literal placeholder strings out loud. Rewrote the seeded emails to look like real student emails (graduation check, professor schedule change, registration window) and added a hard system-prompt rule: never describe data as sample/mock/dev/demo.
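For the first challenge in this list, a small guard like the following could flag replies where the model spent its whole budget on reasoning. The response shape is assumed to be OpenAI-style, and the type and function names are hypothetical; the 2048 and 4096 budgets are the ones quoted above.

```typescript
interface ChatChoice {
  message: { content: string | null; reasoning_content?: string };
  finish_reason: string;
}

// Output budgets from the writeup: reasoning tokens count against these too.
const MAX_TOKENS = { chat: 2048, vision: 4096 };

// True when generation stopped at the token limit before any visible text
// was produced, which is the symptom described above.
function ranOutWhileReasoning(choice: ChatChoice): boolean {
  const hasVisibleText =
    choice.message.content !== null && choice.message.content.length > 0;
  return choice.finish_reason === "length" && !hasVisibleText;
}
```

A check like this makes the failure loud in logs instead of surfacing as a silent empty reply in the voice assistant.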

What I learned

Accessibility tech wins or loses on trust under pressure, not feature count. A blind student running between class and a CyRide pickup can't tolerate a "sorry, I missed that" from their assistive tool.
The 2025-era multimodal models are just good enough for this. Two years ago Vision needed a research lab; now Kimi K2.6 reads "Iowa State Library" off a sign and says it back to you.
For solo hackathon work, free-tier provider stacking is its own skill. Each provider blocks something different — ElevenLabs paywalls voices, Groq paywalls volume, OpenAI paywalls cents-per-call. Cloudflare Workers AI was the unlock here.
Most importantly: blind students don't need a different curriculum. They need the existing curriculum to stop assuming sight.

What's next

Real Microsoft / Canvas / Workday OAuth (currently dev-bypass with mock data). Architecture is in place — just needs the credentials.
Custom "Hey Cy" wake word via Picovoice — already coded, just needs a standalone iOS/Android build to load the native module.
Conversation mode — once you tap the voice button, keep listening and responding in a loop until you close the screen. No more re-tapping.
Pilot with the ISU Student Accessibility Services office.
A Srikanth shouldn't have to leave India — or any country — to do science.

Built With

android
anthropic
canvas-lms
claude
elevenlabs
expo.io
groq
ios
microsoft-graph
node.js
oauth
openai
react-native
twilio
typescript
whisper

Updates

Sunil Kumar started this project — May 02, 2026 02:07 PM EDT
