Gallery screenshots:
- Web - Checklists Page Calendar View
- Web - Checklists Page
- Web - Main Page
- Web - Semantic Search powered by Gemini Embedding
- Web - Analysis Result powered by Gemini 3: Checklist
- Web - Analysis Result powered by Gemini 3: Summary and Cultural Glossary
- Usage of Gemini 3
- H5 - Checklists Page Calendar View
- H5 - Main Page 2
- H5 - Checklists Page
- H5 - Document Details Page
- H5 - Main Page
Inspiration
Germany is a modern economy running on an analog backend. For the 14M+ foreign residents, the mailbox can be a daily source of anxiety—because without language and cultural context, it’s hard to tell what a letter is, how urgent it is, and what happens if you ignore it.
According to Expat Insider 2024 surveys, 67% of expats find German difficult to learn. Living here ourselves, we realized the core problem isn't just translation; it's context and consequences. A standard translator can read the words on a letter, but it won't tell you whether it's a mandatory fee, a contract renewal trap, a payment reminder (Mahnung), or just marketing dressed up as an invoice.
We built KlarDocs (from the German "klar," meaning "clear") to turn German paperwork into clarity and next steps.
What it does
KlarDocs is an AI copilot for German paperwork: it turns every scanned document into an organized, searchable record with actionable tasks.
- Explains, Not Just Translates: Instead of literal translation, it adds cultural/procedural context (e.g., what a Nebenkostenabrechnung actually means in German renting).
- Action Checklist + Deadlines: It extracts obligations, due dates, amounts, and IBAN/payment references, then converts them into a trackable checklist and calendar-ready items.
- Cross-Lingual Semantic Search: Users can search in their native language (e.g., "Internet Contract 2024") and find the right German document (e.g., "Telekom_Vertrag.pdf") - even if the keywords don't match.
- Auto-Organization: It categorizes documents (Tax, Housing, Legal) and flags urgency (High, Medium, Low).
How we built it
We built KlarDocs on the Google Cloud ecosystem and Gemini 3 Pro.
- Google Document AI: We use Google DocAI for enterprise-grade OCR and layout analysis to extract raw text from scanned PDFs and photos.
- Gemini 3 Pro Reasoning: We use Gemini 3 Pro for its advanced reasoning capabilities and native structured output. We feed it the OCR text and enforce a strict JSON schema, so every response arrives in the format our frontend expects instead of one the model invented, and the model reliably distinguishes between a "marketing offer" (low urgency) and a "payment reminder" (high urgency).
- Gemini Embeddings: We generate vector embeddings for every document using Gemini. This allows us to perform cosine similarity searches in PostgreSQL (via pgvector), enabling the cross-lingual search feature.
- Tech Stack: Built with Next.js + FastAPI and deployed on Google Cloud Run. We use Supabase for Auth and the PostgreSQL database.
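To make the search step concrete, here is a minimal sketch of cosine-similarity ranking in plain Python. The vectors are made-up three-dimensional stand-ins, not real Gemini embeddings, and `cosine_similarity` and `rank_documents` are hypothetical helpers. In KlarDocs the comparison runs inside PostgreSQL via pgvector rather than in application code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return the top_k (filename, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors for illustration only (assumption: real embeddings have
# hundreds or thousands of dimensions and come from the embedding model).
DOCS = {
    "Telekom_Vertrag.pdf": [0.9, 0.1, 0.0],
    "Nebenkostenabrechnung.pdf": [0.1, 0.9, 0.1],
    "Werbung.pdf": [0.0, 0.2, 0.9],
}

# Stand-in for the embedding of the query "Internet Contract 2024".
query = [0.8, 0.2, 0.1]
results = rank_documents(query, DOCS, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because the query and the documents live in the same vector space, the ranking depends on meaning rather than shared keywords, which is what lets an English query find a German filename.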
Challenges we ran into
The Multimodal vs. Modular Dilemma (Token Efficiency): Initially, we tried feeding raw documents directly to Gemini 3's multimodal input. The results were impressive, but accuracy depended heavily on the quality of the scan. Token consumption and analysis time were also a problem: we offer a "re-analyze" function, and each re-analysis would re-process the same document from scratch. We therefore pivoted to a two-stage pipeline: Google Document AI first performs specialized OCR to "normalize" the input, and the extracted text is then passed to Gemini 3 Pro. This architectural shift reduced token usage, lowered latency, and let Gemini focus purely on reasoning rather than decoding pixels.
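The two-stage idea can be sketched as follows. This is an illustration under stated assumptions, not our production code: `TwoStagePipeline`, `fake_ocr`, and `fake_analyze` are hypothetical names, the stubs stand in for Document AI and Gemini 3 Pro, and caching the OCR text by file hash is one possible way to avoid repeating stage 1 on re-analysis.

```python
import hashlib
from typing import Callable, Dict


class TwoStagePipeline:
    """Stage 1: OCR (cached per file). Stage 2: text-only analysis."""

    def __init__(self,
                 run_ocr: Callable[[bytes], str],
                 analyze_text: Callable[[str], dict]):
        self._run_ocr = run_ocr
        self._analyze_text = analyze_text
        self._ocr_cache: Dict[str, str] = {}  # SHA-256 of file -> OCR text

    def extract_text(self, file_bytes: bytes) -> str:
        key = hashlib.sha256(file_bytes).hexdigest()
        if key not in self._ocr_cache:
            # Only the first request for a given file pays for OCR.
            self._ocr_cache[key] = self._run_ocr(file_bytes)
        return self._ocr_cache[key]

    def analyze(self, file_bytes: bytes) -> dict:
        # The model only ever sees normalized text, never pixels.
        return self._analyze_text(self.extract_text(file_bytes))


# Stubs standing in for the real OCR and LLM calls (hypothetical).
calls = {"ocr": 0, "llm": 0}


def fake_ocr(file_bytes: bytes) -> str:
    calls["ocr"] += 1
    return file_bytes.decode("utf-8")


def fake_analyze(text: str) -> dict:
    calls["llm"] += 1
    return {"chars": len(text)}


pipeline = TwoStagePipeline(fake_ocr, fake_analyze)
letter = b"Mahnung: Bitte zahlen Sie 150,50 EUR."
first = pipeline.analyze(letter)
second = pipeline.analyze(letter)  # "re-analyze": OCR comes from the cache
print(calls)
```

In this sketch a re-analysis repeats only the text-based model call, so the expensive document decoding happens once per file.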
The "Marketing vs. Reality" Problem: German companies often send marketing letters that look like invoices (e.g., "You have €500 credit!"). Standard LLMs often extracted this as a "refund." We had to lean heavily on Gemini 3's reasoning capabilities to analyze the intent of the document, not just the numbers, and correctly classify these letters as "Marketing" rather than "Financial Documents."
Strict JSON Adherence: With earlier models, getting consistent JSON for our "Action Checklist" (dates, priority, boolean flags) was hit-or-miss. Gemini 3's native structured output support largely resolved this: our frontend can now render UI components directly from the AI response, with significantly improved consistency.
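As an illustration of what a strict checklist contract can look like on the consuming side, here is a minimal validation sketch using only the Python standard library. The field names (`title`, `due_date`, `priority`, `requires_payment`), the `parse_checklist` helper, and the sample payload (including the year in the date) are assumptions made for this example, not our actual schema.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

ALLOWED_PRIORITIES = {"High", "Medium", "Low"}


@dataclass
class ChecklistItem:
    title: str
    due_date: Optional[date]  # None when the letter states no deadline
    priority: str
    requires_payment: bool


def parse_checklist(raw: str) -> List[ChecklistItem]:
    """Parse a model response and reject anything off-schema."""
    data = json.loads(raw)
    items = []
    for entry in data["items"]:
        if entry["priority"] not in ALLOWED_PRIORITIES:
            raise ValueError(f"unknown priority: {entry['priority']!r}")
        if not isinstance(entry["requires_payment"], bool):
            raise ValueError("requires_payment must be a boolean")
        due = entry.get("due_date")
        items.append(ChecklistItem(
            title=str(entry["title"]),
            # fromisoformat raises ValueError on malformed dates
            due_date=date.fromisoformat(due) if due else None,
            priority=entry["priority"],
            requires_payment=entry["requires_payment"],
        ))
    return items


# Made-up response for illustration.
SAMPLE = """
{"items": [
  {"title": "Pay utility back payment of EUR 150.50",
   "due_date": "2025-01-31",
   "priority": "High",
   "requires_payment": true}
]}
"""

items = parse_checklist(SAMPLE)
print(items[0])
```

Validating at the boundary like this means the UI can trust every field type, and any response that drifts from the contract fails loudly instead of rendering a broken checklist.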
Accomplishments that we're proud of
- Cross-Language Search: Seeing the Semantic Search work for the first time was amazing. Searching for "extra costs for apartment" and instantly getting the "Betriebskostenabrechnung" document proved that we truly broke the language barrier.
- Seeing the Analysis Result: We successfully processed a complex Nebenkostenabrechnung and had the AI correctly summarize: "You owe €150.50 because your heating usage increased, payable by Jan 31st." This is the core value proposition realized.
What we learned
- Reasoning > Translation: For this use case, translation is a commodity. The real value lies in the model's ability to act as a consultant—reasoning through deadlines and legal implications.
- Structure is Important: For AI to be integrated into a real software product, structured output (JSON) is a critical feature.
- The "Retention Burden": We learned through user research that "losing" a document is as big a fear as "not understanding" it, validating our decision to build a full document management system (DMS) rather than just a translation tool.
What's next for KlarDocs: AI Copilot for German Paperwork
Short-term:
- Beta Launch & User Validation: Our immediate next step is to put KlarDocs into the hands of real users in the Berlin expat community. We plan to launch a closed Beta to gather feedback, ensuring we are solving the right problems before scaling.
Mid-term:
- RAG-Enhanced Document Understanding: We want to research how to better leverage Gemini 3 to improve document analysis quality. Specifically, we plan to explore how Retrieval-Augmented Generation (RAG) could enhance Gemini's knowledge of Germany-specific documents and of bureaucratic terminology that requires cultural context beyond literal translation.
- Granular Document Explanation: We want to explore how to provide more detailed, section-by-section explanations of documents. By leveraging the layout analysis data from Google Document AI's OCR, we could potentially highlight and explain specific regions of a document (e.g., a particular table row or clause). The exact product form is still being explored.
Long-term:
- Digital Mailbox Integration: Germany has several digital mailbox services (e.g., Caya, Dropscan, POSTSCAN by Deutsche Post) that receive physical mail on behalf of users and scan it for online access. We want to explore partnerships with these services to create a seamless end-to-end experience—from receiving and scanning mail to understanding it and knowing what to do next.