## Inspiration
I grew up as my family's translator. If you're an immigrant kid, you know exactly what that means. You're eight years old sitting at the kitchen table, your mom slides a letter across to you, and suddenly you're trying to explain what "co-payment obligation" means in a language you learned from cartoons.
I did this for years. George did too. We never talked about it because honestly, everyone we knew did the same thing.
My family missed a Medicaid renewal because the notice was four pages of legal English and the deadline was buried on page three. Nobody caught it. By the time I found out, we had to start the whole process over. George had almost the same thing happen. His parents got a medical letter, thought they understood it, and his insurance lapsed. Missed paperwork, missed deadline, months of coverage gone.
25 million people in the US live in households where no adult speaks English well. Their kids translate starting at age 8. Researchers call it "language brokering" and it's a documented harm. Google Translate can translate words.
ChatGPT can explain things if you know how to prompt it. But neither of them remembers that Mom got a similar letter last month. Neither can call the office for her. And neither speaks to her in a voice she trusts.
We built Orision because our families deserved better. And so do millions of others.
## What it does
You take a photo of a government letter. That's it. That's the starting point.
Orision takes that photo, cleans it up so every word is readable, pulls out the text, checks for sensitive info like SSNs and redacts it before it goes anywhere, then explains the whole thing in your parent's language. Plain words, no jargon. It highlights the deadline, the amount owed, what they actually need to do.
Then it reads it out loud in a voice that sounds like family, not a robot.
And if your parent needs to call the office about it? One button. The call connects with live translation so your mom can actually speak to the person on the other end, in her own language, without waiting for you to come home from
school.
Every document gets saved with vector embeddings. So the next time a letter comes from the same agency, Orision already knows the context. "This is your third letter from Medicaid, the last one said your copay was $0, this one is
saying it changed."
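That "already knows the context" step is a similarity lookup over past letters. A minimal sketch of how such a per-family memory query could be built with MongoDB Atlas vector search follows; the collection, field, and index names here are illustrative, not Orision's actual schema.

```python
# Sketch of the "document memory" lookup, assuming an Atlas collection
# with an "embedding" field and a vector index named "doc_index"
# (all names are illustrative).

def build_context_query(query_vector, family_id, limit=5):
    """Build a MongoDB Atlas $vectorSearch aggregation pipeline that
    finds the most similar past letters for one family."""
    return [
        {
            "$vectorSearch": {
                "index": "doc_index",          # hypothetical index name
                "path": "embedding",           # field holding the vector
                "queryVector": query_vector,
                "numCandidates": 100,          # ANN candidate pool
                "limit": limit,
                "filter": {"family_id": family_id},  # memory stays per family
            }
        },
        {"$project": {"agency": 1, "summary": 1, "received_at": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]

pipeline = build_context_query([0.1] * 768, family_id="fam_123")
```

With pymongo, running this would look like `db.documents.aggregate(pipeline)`; the top hits give the "your last letter said your copay was $0" context.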
It supports Medicaid, USCIS, IRS, DMV, school, medical, lease, and car insurance documents. It works in 10+ languages from the moment you open the app. It's not a translator. It's the support system immigrant families were never
given.
## How we built it
We designed everything in Figma first because we knew if the UI wasn't dead simple, our parents wouldn't use it. Big buttons, big text, high contrast, mobile-first. Then we built it with the Cloudinary React AI Starter Kit, React 19, Vite, TypeScript, Tailwind, and shadcn/ui. Figma designs: https://docs.google.com/document/d/198Q0_-Q1vidwyIsa7sXKbO-HxPtW1FtSrxeu7BwrqKI/edit?usp=sharing
The backend is FastAPI in Python. When a photo comes in:
- Cloudinary enhances the image. Auto-orient, sharpen, boost contrast, convert to grayscale. We ran A/B tests and the enhanced version consistently gave better OCR results than the raw photo.
- Gemini 2.5 Flash Vision extracts the text.
- Gemma 3 immediately scans for PII. Social Security numbers, account numbers, dates of birth. Everything gets redacted before the text goes anywhere else.
- Four Fetch.ai agents on Agentverse take it from there. One parses the structure. One searches past documents in MongoDB Atlas using vector search for context. One drafts the explanation. One translates it. They're discoverable
through ASI:One and callable via OmegaClaw.
- ElevenLabs generates audio with a cloned voice so it sounds warm and familiar.
- Twilio handles live phone calls with Media Streams over WebSocket for bidirectional translated audio.
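The ordering of those steps is the load-bearing part: redaction has to sit between OCR and everything else. Here's a minimal sketch of that sequence with every service stubbed out; none of these stubs are the real Cloudinary, Gemini, Gemma, Fetch.ai, or ElevenLabs calls.

```python
# Minimal sketch of the pipeline ordering described above. Every
# service call is a stand-in; the point is the sequence, especially
# that PII redaction runs immediately after OCR and before any other
# external call.
import re

def enhance(image_bytes):          # stand-in for Cloudinary enhancement
    return image_bytes

def ocr(image_bytes):              # stand-in for Gemini 2.5 Flash Vision
    return "Your SSN 123-45-6789 copay changed."

def redact_pii(text):              # stand-in for the Gemma 3 PII pass
    return re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[REDACTED]", text)

def run_agents(text):              # stand-in for the four Fetch.ai agents
    return f"Plain-language explanation of: {text}"

def synthesize(text):              # stand-in for ElevenLabs TTS
    return b"audio"

def process_letter(image_bytes):
    text = ocr(enhance(image_bytes))
    safe_text = redact_pii(text)   # first step after OCR, no exceptions
    explanation = run_agents(safe_text)
    return explanation, synthesize(explanation)

explanation, audio = process_letter(b"photo")
```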
We built an MCP server so the entire pipeline is callable as a tool from Claude Desktop. You can type "explain this government letter" in Claude and it uses Orision under the hood.
Auth runs through Supabase with email and Google OAuth. Everything is deployed on Vultr Cloud Compute behind Caddy for automatic HTTPS at orision.us.
## Challenges we ran into
The reverse proxy almost broke us. We deployed behind Caddy and Twilio kept failing to connect. Took us way too long to realize that behind the proxy, the server thought its own URL was http://localhost:8000 instead of
https://orision.us. Twilio needs the real public URL to open a WebSocket. Simple fix once we found it, one environment variable, but finding it cost us hours.
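The fix boils down to building every Twilio-facing URL from an explicitly configured public base URL instead of whatever host the server sees behind the proxy. A sketch, with `PUBLIC_BASE_URL` as our own convention rather than anything Twilio or Caddy requires:

```python
# Derive Twilio's Media Streams WebSocket URL from a configured public
# base URL, not the request's own host (which is localhost behind the
# proxy). The variable name PUBLIC_BASE_URL is illustrative.
import os

def media_stream_url():
    base = os.environ.get("PUBLIC_BASE_URL", "http://localhost:8000")
    # Twilio needs a wss:// URL, so swap the scheme of the public URL.
    ws_base = base.replace("https://", "wss://").replace("http://", "ws://")
    return f"{ws_base}/twilio/stream"

os.environ["PUBLIC_BASE_URL"] = "https://orision.us"
url = media_stream_url()
```

Without the variable set, the default falls back to the localhost URL that was causing the failures, which is exactly why the bug was so quiet.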
PII ordering mattered more than we expected. Early on, the OCR text was getting sent to the translation service before PII redaction ran. That meant someone's Social Security number was briefly passing through an external API. We
caught it in testing and restructured the whole pipeline so Gemma redaction is always the first step after OCR. No exceptions.
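One way to make "no exceptions" structural rather than a convention is to guard every outbound call so it refuses text that still looks like it contains PII. This is a defensive sketch, not Orision's exact code; the real scan is Gemma 3, with patterns like this one as a safety net.

```python
# Guard external calls against unredacted PII. A decorator rejects any
# text still matching an obvious SSN pattern, so a pipeline-ordering
# bug fails loudly instead of leaking data. All names are illustrative.
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

class PIILeakError(RuntimeError):
    pass

def pii_safe(fn):
    """Refuse to forward text that still looks like it contains an SSN."""
    def wrapper(text, *args, **kwargs):
        if SSN.search(text):
            raise PIILeakError("unredacted PII would leave the server")
        return fn(text, *args, **kwargs)
    return wrapper

@pii_safe
def translate(text):               # stand-in for the translation service
    return f"translated: {text}"

ok = translate("Your copay changed to $40.")
try:
    translate("SSN 123-45-6789 on file.")
    leaked = True
except PIILeakError:
    leaked = False
```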
When four agents are passing messages in sequence and something goes wrong, figuring out which agent dropped the ball is painful. We added fallbacks at every handoff. If the context agent times out, the drafter still produces an
explanation, just without historical context.
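That handoff fallback is easy to express with a timeout around the context lookup. A sketch, with timings and function names that are illustrative rather than our agents' real interfaces:

```python
# Sketch of the handoff fallback: if the context agent times out, the
# drafter still runs, just without historical context.
import asyncio

async def context_agent():
    await asyncio.sleep(1.0)       # simulate a hung or slow agent
    return "last letter said copay was $0"

async def draft_explanation(text, context=None):
    prefix = f"(context: {context}) " if context else ""
    return f"{prefix}explanation of {text}"

async def explain(text):
    try:
        ctx = await asyncio.wait_for(context_agent(), timeout=0.05)
    except asyncio.TimeoutError:
        ctx = None                 # degrade gracefully, don't fail
    return await draft_explanation(text, ctx)

result = asyncio.run(explain("Medicaid notice"))
```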
Every error message, every loading state, every button label had to work in 10+ languages. We couldn't just build it in English and translate later. The app cycles through languages on the welcome screen before you even pick one,
because we couldn't assume the user could read "Select your language" in English.
## Accomplishments that we're proud of
The moment we pointed my mom's phone at a real Medicaid letter and she heard the explanation read back to her in her own language. She looked at me and said "why didn't this exist before?" That's when we knew we built the right thing.
The full pipeline actually works end to end. Photo to explanation to voice to phone call. It's not a mockup. It's live at orision.us and you can try it right now.
We got four independent Fetch.ai agents coordinating reliably, registered on Agentverse, discoverable on ASI:One, and callable through OmegaClaw. That's a real multi-agent system, not a wrapper around one API.
The Cloudinary A/B test showed measurable OCR improvement. Enhanced images consistently extracted more accurate text than raw photos. We have the data to prove it.
We built an MCP server that makes the entire pipeline available as a tool inside Claude Desktop. That felt like the future.
And honestly, we're proud that we built something our own families can actually use. Not a demo. Not a prototype. Something real.
## What we learned
Image preprocessing matters more than model choice. We spent time comparing OCR models, but the real accuracy gain came from Cloudinary's enhancement pipeline. Sharpen the image, fix the orientation, bump the contrast. Suddenly the
same model extracts twice as much usable text.
Building for people who aren't tech-savvy forces you to be a better engineer. No one in our target audience is going to read an error toast and retry with different parameters. Either it works or it doesn't. That constraint made
everything simpler and more reliable.
Agent orchestration is powerful but fragile. The modularity and discoverability you get from Fetch.ai is worth it, but debugging distributed message passing between four agents is a completely different skill than debugging a monolith.
And we learned that this problem is way bigger than us. 25 million people. We built Orision in a weekend, but the problem it's solving has been there our whole lives.
## What's next for Orision
Expand document types to cover W-2s, immigration court notices, utility bills, and medical EOBs.
Add family sharing so multiple family members can contribute to the same document memory. When my brother scans a letter at home, I can see it from campus.
Build an SMS mode for parents who don't have smartphones. Text a photo to a number, get the explanation back as a voice message.
Partner with legal aid organizations to add action-step guidance. Not just "here's what this letter says" but "here's exactly what you need to do, and here's the number to call."
And eventually, make it so no kid has to be the translator anymore.
## Built With
- asi:one
- caddy
- cloudinary
- docker
- react
- elevenlabs
- fastapi
- fetch.ai-agentverse
- figma
- framer-motion
- gemini-2.5-flash
- gemma-3
- godaddy
- model-context-protocol
- mongodb-atlas
- python
- shadcn/ui
- supabase
- tailwind-css
- twilio
- typescript
- vite
- vultr
