Inspiration
I kept missing job opportunities. Not because I wasn't qualified — but because I couldn't bring myself to fill out another application on my phone. Tiny fields, endless scrolling, and those dreaded open-ended questions: "Why do you want this role?" I'd open the form on the subway, stare at it, and close the tab. Too much friction, not enough time. I built Clara because I was tired of losing opportunities to form fatigue.
What it does
Clara is a mobile-first AI form-filling companion. Snap a screenshot, upload a PDF, or paste a URL — Clara uses Gemini Vision to read any form, extract the fields, and guide you through filling it with a simple conversation. Struggling with an open-ended question? Clara drafts an answer based on your profile and the role, and you just approve or tweak. It remembers your info, so the next form takes seconds. It even generates resumes and cover letters on the spot.
How we built it
- Gemini 2.5 Flash with Vision for form field extraction via bounding box detection
- Gemini TTS for voice output with barge-in support
- 3-layer smart prefill engine: learned aliases → Gemini semantic matching → keyword fallback
- Flask backend (~3000 lines) handling chat, vision, document generation
- Vanilla JS SPA (~3800 lines) for a fast, mobile-first experience
- Google Cloud Run for deployment, Firestore for profiles/sessions, Cloud Storage for documents
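As a rough illustration of the 3-layer prefill idea listed above, here is a minimal sketch, not Clara's actual code. The class name `PrefillEngine`, the `KEYWORDS` table, and the `semantic_matcher` callback are all hypothetical; the callback stands in for the Gemini semantic-matching layer so the example runs without any API call.

```python
from typing import Callable, Optional

# Hypothetical signature for layer 2: given a form label and the available
# profile keys, return the best key or None. In a real app this would wrap
# a model call; here it is just an injectable function.
SemanticMatcher = Callable[[str, list[str]], Optional[str]]

# Hypothetical keyword table for the last-resort layer.
KEYWORDS = {
    "phone": ("phone", "mobile", "cell"),
    "email": ("email", "e-mail"),
    "name": ("name",),
}


def normalize(label: str) -> str:
    """Lowercase a label and drop common decoration such as ':' and '*'."""
    cleaned = label.lower().replace(":", " ").replace("*", " ")
    return " ".join(cleaned.split())


class PrefillEngine:
    def __init__(self, semantic_matcher: Optional[SemanticMatcher] = None):
        self.aliases: dict[str, str] = {}  # normalized label -> profile key
        self.semantic_matcher = semantic_matcher

    def learn(self, label: str, profile_key: str) -> None:
        """Record a user correction so the same label matches next time."""
        self.aliases[normalize(label)] = profile_key

    def match(self, label: str, profile: dict[str, str]) -> Optional[str]:
        key = normalize(label)
        # Layer 1: aliases learned from earlier corrections.
        learned = self.aliases.get(key)
        if learned in profile:
            return learned
        # Layer 2: semantic matching (assumed to be model-backed).
        if self.semantic_matcher is not None:
            guess = self.semantic_matcher(label, list(profile))
            if guess in profile:
                return guess
        # Layer 3: plain keyword fallback.
        for profile_key, words in KEYWORDS.items():
            if profile_key in profile and any(w in key for w in words):
                return profile_key
        return None

    def prefill(self, labels: list[str], profile: dict[str, str]) -> dict[str, str]:
        """Return label -> value for every label that could be matched."""
        filled = {}
        for label in labels:
            profile_key = self.match(label, profile)
            if profile_key is not None:
                filled[label] = profile[profile_key]
        return filled
```

In this sketch a label like "Cell Number:" falls through to the keyword layer, while an unusual label only matches after `learn` has recorded a correction for it. Ordering the layers from cheapest and most trusted to least specific is one plausible design; the real engine may weigh them differently.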
Challenges we faced
- Field matching ambiguity: Forms label fields inconsistently ("Phone" vs "Mobile" vs "Cell"). We solved this with a 3-layer matching system that learns from user corrections.
- Open-ended answer coaching: Getting the AI to draft answers that sound like you, not generic fluff, required careful prompting with profile context.
- Mobile-first UX: Designing a chat interface that feels natural while filling complex forms required constant iteration.
- PDF annotation: Rendering filled answers as an overlay on the original form image meant positioning each answer from the bounding box coordinates returned by Gemini Vision.
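The overlay step in the PDF annotation challenge comes down to converting model-reported boxes into pixel positions on the form image. The sketch below assumes the boxes arrive as `[ymin, xmin, ymax, xmax]` on a 0 to 1000 normalized scale; that format is an assumption here and should be checked against what the model actually returns. `PixelBox` and `to_pixel_box` are hypothetical names, not part of Clara or any library.

```python
from typing import NamedTuple, Sequence


class PixelBox(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def to_pixel_box(
    box: Sequence[float],
    image_width: int,
    image_height: int,
    scale: int = 1000,
) -> PixelBox:
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel coordinates.

    Assumes the origin is the top-left corner of the image and that each
    coordinate lies in the range 0..scale.
    """
    ymin, xmin, ymax, xmax = box
    if not (0 <= xmin <= xmax <= scale and 0 <= ymin <= ymax <= scale):
        raise ValueError(f"box out of range or inverted: {list(box)}")
    return PixelBox(
        left=round(xmin * image_width / scale),
        top=round(ymin * image_height / scale),
        right=round(xmax * image_width / scale),
        bottom=round(ymax * image_height / scale),
    )
```

For an 800 by 1200 pixel form image, `to_pixel_box([100, 250, 150, 750], 800, 1200)` yields a box from (200, 120) to (600, 180), which is where the filled answer would be drawn. Rejecting inverted or out-of-range boxes early keeps a bad detection from silently placing text off the page.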
What we learned
- Gemini Vision is remarkably good at understanding form structure from screenshots — even messy, hand-designed PDFs.
- The real barrier to form completion isn't typing — it's the mental load of open-ended questions. AI coaching removes that friction.
- A learning loop (remembering field→profile mappings) dramatically improves the experience over time.
What's next for Clara
- Browser extension: Fill forms directly on any website
- Clara Profile API: Let third parties auto-fill with user permission
- Expanded document generation: Tax forms, visa applications, legal paperwork
Built With
- api
- beautifulsoup4
- css
- flask
- fpdf2
- gemini-tts
- gemini-vision
- google-cloud
- google-cloud-firestore
- google-cloud-run
- google-gemini-api
- html
- javascript
- python
- speech
- web