Inspiration

Everyone has felt it: a hospital bill, an eviction notice, a benefits letter, or discharge instructions lands in front of you — dense, intimidating, and written for the institution that sent it, not for you. In that moment people don't need a 40-page explainer. They need someone calm beside them who says "here's what this actually means, and here's exactly what to do next."

We built ENVIS to be that someone — a crisis-to-action translator that turns the scariest piece of paper in your life into clear, plain-language steps, without making you feel small for not understanding it.

What it does

You drop in a document — a photo, a PDF, or pasted text — and ENVIS:

  • Reads it like a detective and gives a calm, structured breakdown: the bottom line, what stands out (flagged 🔵 key / 🟡 watch-out / 🟢 good), a line-by-line verdict on charges, what's missing, and your exact next moves.
  • Annotates the original image — OCR + AI highlight the real lines on your document so you can hover and see what each one means.
  • Localizes everything — it infers the country, currency, and laws from the document and judges costs/rights against local norms (e.g. Indian hospital schemes like Ayushman Bharat vs. U.S. charity care).
  • Knows your rights — a research agent looks up the institution's policies and your protections.
  • Acts for you — it drafts the email/letter to send, builds reminders and to-dos for real deadlines, finds support near you and benefits you may qualify for, and exports a shareable one-pager (PDF + QR).
  • Meets you where you are: replies can be translated into 10 languages, with a larger-text accessibility mode and a "Night Calm" theme.

Crucially, ENVIS explains and suggests — it never makes a legal, medical, or financial decision for you. You stay in control.

How we built it

  • Frontend: Next.js (App Router) + TypeScript + Tailwind, with GSAP for the calm, breathing micro-interactions.
  • AI advisor: OpenAI GPT-5 family behind a heavily engineered system prompt that forces reasoning over the actual numbers instead of generic checklists, plus tool-calling to create reminders/to-dos.
  • Documents: Tesseract.js runs OCR in the browser; a second model maps each line to a plain-language meaning and an importance color to drive the highlights.
  • Data & auth: Supabase (Postgres + Auth) with row-level security so every user's threads, messages, reminders, and profile stay private.
  • Agentic research: a LangGraph agent powers the "My rights" briefing.
  • Extras: MapTiler for the 3D "support near you" map, and client-side PDF/QR generation for the one-pager.
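
To make the "Documents" step concrete, here is a minimal sketch of how OCR lines and model annotations might be joined into highlights. It is an illustration under stated assumptions, not the ENVIS implementation: the `OcrLine`, `Annotation`, and `Highlight` shapes, the `alignHighlights` helper, and the token-overlap (Jaccard) matching with a `minScore` cutoff are all hypothetical. The bounding-box fields are only loosely modeled on the kind of line box an OCR engine returns.

```typescript
// Hypothetical shapes: the real ENVIS types are not shown in the writeup.
type Importance = "key" | "watch" | "good";
interface Box { x0: number; y0: number; x1: number; y1: number }
interface OcrLine { text: string; bbox: Box }
interface Annotation { quote: string; meaning: string; importance: Importance }
interface Highlight extends Annotation { bbox: Box }

// Lowercased alphanumeric tokens, so OCR casing and punctuation noise matter less.
function tokens(s: string): Set<string> {
  return new Set(s.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0));
}

// Jaccard similarity: shared tokens divided by the size of the union.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 || b.size === 0) return 0;
  let shared = 0;
  a.forEach((t) => { if (b.has(t)) shared++; });
  return shared / (a.size + b.size - shared);
}

// Attach each annotation to its best-matching OCR line; drop weak matches
// so a highlight never lands on unrelated text.
function alignHighlights(lines: OcrLine[], notes: Annotation[], minScore = 0.5): Highlight[] {
  const out: Highlight[] = [];
  for (const note of notes) {
    const q = tokens(note.quote);
    let bestIndex = -1;
    let bestScore = 0;
    for (let i = 0; i < lines.length; i++) {
      const score = jaccard(q, tokens(lines[i].text));
      if (score > bestScore) { bestIndex = i; bestScore = score; }
    }
    if (bestIndex >= 0 && bestScore >= minScore) {
      out.push({ ...note, bbox: lines[bestIndex].bbox });
    }
  }
  return out;
}
```

The threshold is the main design choice in this sketch: a higher `minScore` loses some highlights on noisy scans, while a lower one risks coloring the wrong line.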

We also score how much simpler we made the text, using a Flesch-style reading-ease score shown before → after:

$$\text{ease} = 206.835 - 1.015\left(\frac{\text{words}}{\text{sentences}}\right) - 84.6\left(\frac{\text{syllables}}{\text{words}}\right)$$
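
As a rough illustration, the formula above can be computed like this. The sentence splitting and the vowel-group syllable counter are simple heuristics assumed for the sketch (the writeup does not say how ENVIS counts syllables), and `countSyllables` and `readingEase` are hypothetical names.

```typescript
// Heuristic syllable count: number of vowel groups, ignoring one trailing "e".
// This is an approximation assumed for the sketch, not a dictionary lookup.
function countSyllables(word: string): number {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length === 0) return 0;
  const groups = w.replace(/e$/, "").match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 0);
}

// Flesch-style reading ease: higher means easier to read.
function readingEase(text: string): number {
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const words = text.split(/\s+/).filter((w) => /[a-zA-Z]/.test(w));
  if (sentences.length === 0 || words.length === 0) return 0;
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);
  return (
    206.835 -
    1.015 * (words.length / sentences.length) -
    84.6 * (syllables / words.length)
  );
}
```

Calling `readingEase` on the original document text and again on the plain-language rewrite gives the two numbers for a before → after comparison.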

Challenges we ran into

  • "Is it stuck?" latency. At high reasoning effort, GPT-5 emits its first token only after finishing a long thinking phase. We were also forcing tool calls on the first turn, which added a second full model pass before any text appeared. We fixed it by streaming the analysis token-by-token, tuning reasoning effort for a fast first token, and moving reminder/to-do creation to a non-blocking pass after the reply renders. Time-to-first-word dropped from a minute-plus of dead air to a few seconds of live, streaming text behind a document-aware progress trace.
  • Aligning OCR with AI. Tesseract's noisy line boxes had to line up with the model's analysis so highlights landed on the right text — lots of merging, filtering, and importance-tagging.
  • Localization without assumptions. Getting the model to reason about local prices, laws, and aid programs — and to clearly signal confidence when estimating — instead of defaulting to U.S.-centric advice.
  • Big payloads & privacy. Base64 document images were too large for ordinary DB columns, and RLS meant being deliberate about exactly what each request could read or write.
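
The latency fix in the first bullet comes down to ordering: render tokens as they arrive, then start side effects without waiting on them. A minimal sketch of that pattern follows. `TokenSource`, `SideEffect`, and `replyThenAct` are hypothetical names, and the token source stands in for whatever streaming client is actually used. No real model API is called here.

```typescript
// A token source pushes tokens to a callback and resolves with the full reply.
type TokenSource = (onToken: (token: string) => void) => Promise<string>;
// A side effect (e.g. creating reminders) runs only after the reply is complete.
type SideEffect = (fullReply: string) => Promise<void>;

async function replyThenAct(
  source: TokenSource,
  render: (token: string) => void,
  sideEffects: SideEffect[],
): Promise<{ reply: string; pending: Promise<void> }> {
  // 1. Stream: every token is rendered as soon as it arrives.
  const reply = await source(render);

  // 2. Start side effects but do not await them here, so the caller gets the
  //    reply immediately. A failing side effect is swallowed in this sketch;
  //    a real app would log or surface it.
  const pending = Promise.all(
    sideEffects.map((fn) => fn(reply).catch(() => undefined)),
  ).then(() => undefined);

  return { reply, pending };
}
```

The caller can show the reply right away and use `pending` only to update the reminders/to-do UI once the background pass settles.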

What we learned

  • With reasoning models, perceived speed is a product decision: streaming, effort tuning, and decoupling side effects matter as much as the answer itself.
  • The hardest part of "AI for vulnerable moments" isn't the model — it's the tone, the responsible-use guardrails, and the accessibility that make someone feel seen, clear, capable, and in control.

What's next

Streaming the follow-up conversation, more document types and languages, and offline-friendly OCR so ENVIS works even on a shaky connection.
