Inspiration
While volunteering in Vietnam and reading with children, we noticed some kids would skip or stumble over words in ways that looked like small mistakes. Teachers and parents often assumed inattention or carelessness. In many developing countries such as Vietnam, dyslexia and other reading differences are under-recognized, so struggling readers may not get the right support early.
That gap motivated WhatTheFont: a browser extension that does two things at once. It can make the web easier to read (typography, spacing, calmer visuals) for people who benefit from those adjustments, and it can help raise awareness by approximating how fragmented reading can feel. We wanted something practical for learners and a conversation starter for families and educators.
What it does
- Assist mode: dyslexia-friendly typography (e.g. Lexend), wider line, letter, and word spacing, larger body type, a warm background tint, an eye-guiding text gradient, and reduced clutter from common ad regions.
- Simulator: approximates unstable decoding so peers and caregivers can feel why “just read carefully” is not enough.
- On-page AI: highlight text to summarize (Gemini) or to read aloud (ElevenLabs).
- Images to text: upload screenshots or photos, or use live camera capture, for text recognition and simplified reading output (the recognition pipeline lives in the extension's options/extraction flow).
- Personal dashboard: individualized insights so users can see habits and progress over time (reading supports used, engagement patterns, and improvement trends).
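The assist-mode adjustments above amount to a stylesheet applied by a content script. A minimal sketch, with option names and values that are illustrative rather than the extension's actual settings:

```typescript
// Illustrative options; the real extension's settings may differ.
interface AssistOptions {
  fontFamily: string;      // e.g. the Lexend family mentioned above
  letterSpacingEm: number; // extra letter spacing, in em
  wordSpacingEm: number;   // extra word spacing, in em
  lineHeight: number;      // unitless line height
}

// Build a CSS string that widens spacing and swaps the typeface.
function buildAssistCss(o: AssistOptions): string {
  return [
    `body, body * {`,
    `  font-family: "${o.fontFamily}", sans-serif !important;`,
    `  letter-spacing: ${o.letterSpacingEm}em !important;`,
    `  word-spacing: ${o.wordSpacingEm}em !important;`,
    `  line-height: ${o.lineHeight} !important;`,
    `}`,
  ].join("\n");
}

// A content script could then inject it:
// const style = document.createElement("style");
// style.textContent = buildAssistCss({ fontFamily: "Lexend",
//   letterSpacingEm: 0.05, wordSpacingEm: 0.15, lineHeight: 1.8 });
// document.head.append(style);
```

Keeping the styles in one generated string makes toggling assist mode off a matter of removing a single `<style>` element.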
How we built it
The extension runs alongside ordinary websites: when you turn on assist or simulate, it adjusts how the page looks and feels in the browser. When you select text, a small bar appears so you can summarize or listen without leaving the site; those actions go through the extension so API keys stay off the page you’re visiting.
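That flow can be sketched as a small message router: the content script forwards the selected text, and the background service worker, the only place the API keys live, dispatches to the right call. The message shape and handler names here are hypothetical:

```typescript
// Hypothetical message protocol; the real extension's may differ.
type Msg = { type: "summarize" | "speak"; text: string };

// Pure dispatch function, kept free of chrome.* so it can be unit-tested.
async function route(
  msg: Msg,
  handlers: {
    summarize: (t: string) => Promise<string>;
    speak: (t: string) => Promise<string>;
  }
): Promise<string> {
  switch (msg.type) {
    case "summarize":
      return handlers.summarize(msg.text);
    case "speak":
      return handlers.speak(msg.text);
  }
}

// Wired up in the background service worker (keys stay out of page context):
// chrome.runtime.onMessage.addListener((msg: Msg, _sender, sendResponse) => {
//   route(msg, { summarize: callGemini, speak: callElevenLabs })
//     .then(sendResponse);
//   return true; // keep the message channel open for the async reply
// });
```

Because the page only ever sees the selection bar's UI, the host site has no way to read the keys or the API responses in transit.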
Your choices (like assist on/off) are saved so they still apply the next time you open a tab. In the extension’s own settings / tools area, you can store keys and use upload or camera capture to read text out of images (screenshots, photos, or a live frame). Under the hood we used Chrome’s current extension format (Manifest V3), Vite, and TypeScript so the project stayed fast to iterate on during the hackathon.
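Persisting those choices typically means merging whatever the extension's storage returns with defaults, so a fresh install (or a newly added setting) never yields an undefined value. A sketch with illustrative field names:

```typescript
// Illustrative preference shape; the real extension's fields may differ.
interface Prefs {
  assist: boolean;
  simulate: boolean;
  fontScale: number;
}

const DEFAULT_PREFS: Prefs = { assist: false, simulate: false, fontScale: 1.0 };

// Merge a possibly-partial stored object over the defaults, so every
// field is always defined even if it was never saved.
function withDefaults(stored: Partial<Prefs>): Prefs {
  return { ...DEFAULT_PREFS, ...stored };
}

// Usage with the extension storage API:
// const stored = await chrome.storage.sync.get("prefs");
// const prefs = withDefaults(stored.prefs ?? {});
// await chrome.storage.sync.set({ prefs: { ...prefs, assist: true } });
```

The spread-over-defaults pattern also makes adding a new setting later backwards-compatible: old saved objects simply pick up the new default.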
Challenges we ran into
- Gemini sometimes returned empty or incomplete answers, especially once newer models and token budgets behaved differently from what we had first assumed.
- Google’s stack for text recognition on uploaded and live images took extra integration and tuning (quality, latency, and handling messy real-world photos).
Accomplishments that we're proud of
- One tool combining accommodations, empathy simulation, and AI reading support.
- Keys and network calls kept in the extension layer, not injected into arbitrary pages.
- Image upload + live capture UX scaffolded alongside reading modes.
What we learned
- Reading friction is spatial and rhythmic (spacing and line feel matter as much as font choice).
- MV3 async messaging and moving APIs reward defensive, user-visible error handling.
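The second lesson can be made concrete with a small defensive wrapper (a hypothetical helper, not the extension's actual code): retry a flaky async call and return a user-visible fallback instead of a blank result.

```typescript
// Retry a flaky async call up to `retries` extra times; if every attempt
// fails or comes back empty, resolve to a caller-supplied fallback so the
// UI always has something to show.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries: number,
  fallback: T
): Promise<T> {
  for (let i = 0; i <= retries; i++) {
    try {
      const result = await fn();
      if (result != null) return result; // guard against empty responses
    } catch {
      // swallow and retry; the fallback is returned after the last attempt
    }
  }
  return fallback;
}
```

Wrapping the summarize call this way turns an empty model response into a visible "could not summarize this selection" message rather than a silently empty bar.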
What's next for WhatTheFont
Polish assist and read-aloud for more sites, languages, and classrooms.
Built With
- chrome
- css
- elevenlabs
- enter-pro
- google-cloud-vision
- google-fonts
- google-gemini
- mongodb
- typescript
- vite