AuditArk

Inspiration

AuditArk was inspired by the everyday struggle of manually processing receipt data. Teams often spend hours typing totals, vendor names, and dates into spreadsheets, and small mistakes can cause major reporting issues later. We wanted to build a practical desktop tool that automates receipt extraction while still giving users full control to review and correct data before final reports.

What it does

AuditArk lets users upload batches of receipts, extract structured fields with OCR, review and edit the resulting records, and generate clean exports for reporting. It is designed for offline-first use, so data stays secure on the user's own machine and remains available without a network connection.

How we built it

We built AuditArk as a Tauri desktop app with a React and TypeScript frontend for batch management and record review, and a Python backend service (FastAPI) for OCR, parsing, and reporting logic. The backend runs OCR on receipt images with RapidOCR, converts the recognized text into structured entries, and stores the results in a local SQLite database. We also added export workflows so users can move validated data directly into reporting pipelines.
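To make the text-to-record step concrete, here is a minimal Python sketch of turning OCR text lines into a structured entry and saving it to SQLite. It is an illustration under stated assumptions, not AuditArk's actual parser: the `ReceiptRecord` fields, the `parse_receipt` and `save_record` helpers, the regular expressions, and the `receipts` table layout are all hypothetical, and the sketch assumes the vendor name is the first non-empty line.

```python
import re
import sqlite3
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ReceiptRecord:
    # Hypothetical record shape; the real schema may differ.
    vendor: Optional[str]
    date: Optional[str]
    total: Optional[float]


# Assumed formats: ISO dates or slash-separated dates, and a "total" line
# followed by an amount. Thousands separators are not handled here.
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{2,4})\b")
TOTAL_RE = re.compile(r"\btotal\b[^\d]*(\d+(?:[.,]\d{2})?)", re.IGNORECASE)


def parse_receipt(lines: Iterable[str]) -> ReceiptRecord:
    """Convert OCR text lines into a record; missing fields stay None."""
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    vendor = cleaned[0] if cleaned else None  # assumption: vendor is on top
    date = None
    total = None
    for ln in cleaned:
        if date is None:
            m = DATE_RE.search(ln)
            if m:
                date = m.group(1)
        m = TOTAL_RE.search(ln)
        if m:
            # The last "total" line wins; "Subtotal" does not match \btotal\b.
            total = float(m.group(1).replace(",", "."))
    return ReceiptRecord(vendor, date, total)


def save_record(conn: sqlite3.Connection, rec: ReceiptRecord) -> int:
    """Insert the record into a local table and return its row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS receipts ("
        "id INTEGER PRIMARY KEY, vendor TEXT, date TEXT, total REAL)"
    )
    cur = conn.execute(
        "INSERT INTO receipts (vendor, date, total) VALUES (?, ?, ?)",
        (rec.vendor, rec.date, rec.total),
    )
    conn.commit()
    return cur.lastrowid
```

Fields the parser cannot find are left as `None` rather than guessed, which fits a workflow where users review and correct records before export.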

Challenges we ran into

OCR inconsistency across different receipt layouts, print quality, and image angles
Packaging backend dependencies into a stable desktop build
Balancing extraction speed with accuracy and editability
Designing a workflow that is fast for bulk processing but still transparent and trustworthy

Accomplishments that we're proud of

Built an end-to-end receipt workflow from upload to export
Created a usable correction flow so users can confidently validate extracted data
Delivered an offline-capable desktop setup with integrated backend processing
Improved reliability of structured extraction for real, messy receipts

What we learned

We learned that production OCR is not just about model quality; it also depends on strong fallback logic, validation workflows, and user experience. We also learned that packaging and deployment are core engineering challenges in desktop products, not just final steps. Measuring extraction quality with a single consistent metric, the F1 score, helped us prioritize practical improvements:

$$ F_1 = \frac{2PR}{P + R} $$

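where P is precision (the share of extracted values that are correct) and R is recall (the share of true values that were extracted correctly).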
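As a worked example of the formula, the following Python sketch computes precision, recall, and F1 over extracted receipt fields. The counting convention is an assumption, not something stated in the writeup: a field counts as a true positive only on an exact match, a wrong value counts as both a false positive and a false negative, and a missing value counts as a false negative. The `field_f1` name and the dictionary record shape are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[object]]


def field_f1(predicted: List[Record], expected: List[Record]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over all fields of paired records."""
    tp = fp = fn = 0
    for pred, gold in zip(predicted, expected):
        for field in set(pred) | set(gold):
            p, g = pred.get(field), gold.get(field)
            if p is not None and p == g:
                tp += 1  # extracted and correct
            else:
                if p is not None:
                    fp += 1  # extracted but wrong or spurious
                if g is not None:
                    fn += 1  # true value missed or extracted incorrectly
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


predicted = [
    {"vendor": "ACME", "total": 10.8, "date": None},
    {"vendor": "Bolt", "total": 5.0, "date": "2026-04-16"},
]
expected = [
    {"vendor": "ACME", "total": 10.8, "date": "2026-04-15"},
    {"vendor": "Bolt", "total": 5.5, "date": "2026-04-16"},
]
p, r, f1 = field_f1(predicted, expected)
```

Here four of the six true fields are extracted correctly, one extracted value is wrong, and one is missing, so P = 4/5, R = 4/6, and F1 = 8/11.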

What's next for AuditArk

Improve extraction for edge-case and low-quality receipts
Expand vendor normalization and category intelligence
Add richer analytics and reporting templates
Introduce regression benchmarks to continuously track OCR and parsing quality
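One way the vendor normalization item above could work is fuzzy matching of OCR output against a list of known vendor names. The Python sketch below is only an illustration: it uses the standard library's `difflib` as a stand-in for rapidfuzz so that it runs without extra dependencies, and the `CANONICAL_VENDORS` list, the `normalize_vendor` function, and the 0.8 cutoff are all hypothetical.

```python
import difflib
from typing import List, Optional

# Hypothetical canonical vendor list; a real one would come from user data.
CANONICAL_VENDORS = ["Walmart", "Target", "Home Depot"]


def normalize_vendor(
    raw: str,
    canonical: List[str] = CANONICAL_VENDORS,
    cutoff: float = 0.8,
) -> Optional[str]:
    """Map a raw OCR vendor string to a canonical name, or None if unsure."""
    lookup = {name.lower(): name for name in canonical}
    # Lowercase and collapse whitespace before comparing.
    cleaned = " ".join(raw.lower().split())
    matches = difflib.get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None
```

Returning `None` below the cutoff, instead of forcing the nearest match, leaves uncertain vendor names for manual review.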

Built With

bun
fastapi
numpy
onnx-runtime
opencv
pillow
python
rapidfuzz
rapidocr
react
rust
sqlite
tailwind-css
tauri
typescript
uvicorn
vite

Updates

Alok Nath started this project — Apr 16, 2026 01:20 PM EDT
