Inspiration

dwata is a product I have thought about many times, out of frustration with how scattered my invoices, bills, discussions, project/work updates, travel history and so many other things are. I have been meaning to attempt this idea using LLMs for data extraction, particularly small models like Gemini 3 Flash or even smaller. I can imagine people using dwata locally and connecting

What it does

dwata reads emails (Gmail via OAuth2, other providers via username/password), stores them all locally, and supports search and financial data extraction. For financial data extraction, dwata currently reads only the email subject and body, not attachments. The extraction uses regex and runs locally. The regex patterns are built with Google Gemini 3 Flash.
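To make the regex-based extraction concrete, here is a minimal sketch of applying one saved pattern to an email's subject and body. It is written in TypeScript for brevity (the real extractor is in Rust), and the `FinancialRecord` shape, the `extractRecord` function and the sample pattern are all assumptions for illustration, not dwata's actual schema or code.

```typescript
// Hypothetical shape of one extracted record; dwata's real schema is not shown here.
interface FinancialRecord {
  vendor: string;
  currency: string;
  amount: number;
}

// Example of the kind of pattern a model might build for one sender's receipts.
// Capture groups: 1 = currency, 2 = amount, 3 = vendor.
const receiptPattern =
  /Your payment of ([A-Z]{3}) (\d+(?:\.\d{2})?) to (.+?) was successful/;

// Apply a saved pattern to the subject and body only (attachments are never read).
function extractRecord(
  pattern: RegExp,
  subject: string,
  body: string
): FinancialRecord | null {
  const match = pattern.exec(subject + "\n" + body);
  if (match === null) return null;
  const currency = match[1];
  const amount = match[2];
  const vendor = match[3];
  if (currency === undefined || amount === undefined || vendor === undefined) {
    return null;
  }
  return { vendor, currency, amount: Number(amount) };
}

const record = extractRecord(
  receiptPattern,
  "Payment receipt",
  "Hello, Your payment of USD 42.50 to Example Hosting was successful. Thank you."
);
console.log(record);
```

Once a pattern like this exists for a group of similar emails, every email in the group can be processed locally without another model call, which is where the cost and speed benefits come from.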

How we built it

dwata was built entirely for this contest using coding agents (Claude Code, opencode, Codex). I wanted to focus on the small model Gemini 3 Flash, and the regex extraction is tuned specifically for this model. There is only one AI agent for now. The backend and extractor are in Rust, using Actix Web, SQLite, Tokio and friends. All user credentials are stored in the OS native keychain. The frontend is in TypeScript, SolidJS, Tailwind CSS and daisyUI.

My first focus was the core email API server, the email downloader and the GUI. Then I focused on the AI agent that builds a regex for a given email. Candidate emails (those that can be processed via regex) are found using a simple detection algorithm: group by sender, check if all emails in the group have content with close word-edit lengths, then sort groups by number of emails.

When the user wants to extract a pattern for an email group, one sample email is sent to the AI agent. The agent is given a system prompt that sets the context and only two tools/functions: test pattern and save pattern. This narrow setup is what lets a model like Gemini 3 Flash work so well.
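The candidate detection steps described above can be sketched roughly as follows. This is an illustration in TypeScript, not dwata's Rust implementation: it assumes "close word-edit lengths" means a small word-level edit distance relative to email length, and the names `findCandidateGroups`, `maxRelativeDistance` and `minGroupSize`, as well as the default threshold of 0.3, are hypothetical.

```typescript
interface Email {
  sender: string;
  body: string;
}

interface CandidateGroup {
  sender: string;
  emails: Email[];
}

const words = (text: string): string[] =>
  text.split(/\s+/).filter((w) => w.length > 0);

// Levenshtein distance computed over words instead of characters.
function wordEditDistance(a: string[], b: string[]): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

function findCandidateGroups(
  emails: Email[],
  maxRelativeDistance = 0.3, // assumed threshold, not taken from dwata
  minGroupSize = 2
): CandidateGroup[] {
  // Step 1: group by sender.
  const bySender = new Map<string, Email[]>();
  for (const email of emails) {
    const group = bySender.get(email.sender) ?? [];
    group.push(email);
    bySender.set(email.sender, group);
  }

  // Step 2: keep groups where every email stays close to the first one.
  const candidates: CandidateGroup[] = [];
  bySender.forEach((group, sender) => {
    if (group.length < minGroupSize) return;
    const reference = words(group[0].body);
    const allClose = group.every((email) => {
      const current = words(email.body);
      const longest = Math.max(reference.length, current.length, 1);
      return wordEditDistance(reference, current) / longest <= maxRelativeDistance;
    });
    if (allClose) candidates.push({ sender, emails: group });
  });

  // Step 3: largest groups first.
  return candidates.sort((x, y) => y.emails.length - x.emails.length);
}

const sampleEmails: Email[] = [
  { sender: "bank@example.com", body: "Your card was charged USD 10.00 at Cafe. Thank you for banking with us." },
  { sender: "friend@example.com", body: "Are we still on for lunch tomorrow?" },
  { sender: "shop@example.com", body: "Order 1001 has shipped. Total paid: EUR 30.00" },
  { sender: "bank@example.com", body: "Your card was charged USD 25.50 at Bookshop. Thank you for banking with us." },
  { sender: "friend@example.com", body: "Here are the photos from the trip, hope you like them" },
  { sender: "shop@example.com", body: "Order 1002 has shipped. Total paid: EUR 12.00" },
  { sender: "bank@example.com", body: "Your card was charged USD 7.25 at Taxi. Thank you for banking with us." },
];

const groups = findCandidateGroups(sampleEmails);
console.log(groups.map((g) => g.sender));
// prints [ 'bank@example.com', 'shop@example.com' ]
```

Templated senders such as banks and shops survive the similarity check, while free-form personal email does not, so only groups where a single regex is likely to work are offered to the agent.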

Challenges we ran into

Tweaking the financial data structure, with the goal of extracting as much as I can now while preparing to do much more in the future, was a challenge. I also started working on extracting projects and tasks, and realized it would be too much work to design agents for each. So I cut the other features and focused on financial data and on making the software for it very efficient.

Accomplishments that we're proud of

It works, is cost efficient and is fast!

What we learned

dwata is built on a lot of ideas and experiments I have been doing over the last 2 years. I am so happy that it all came together into a product. The contest taught me to focus on user-facing features and to deliver something useful and usable.

What's next for dwata

dwata will process more, much, much more. All locally, using small models. Access to Google's data is critical since that is where most users are: Gmail, Google Drive, even Photos in the future. Running this as a self-hosted, no-maintenance product on Google Cloud is an option I have in mind. Opening this up to multi-user teams is also on the list.
