Inspiration

Every month, we deal with a pile of paper and email receipts for expenses like groceries, dining out, and travel. Manually entering these into spreadsheets for budgeting is tedious and error-prone, so I wanted to build something that automates the entire process using serverless architecture and AI. That’s how ReceiptSense was born: an app that extracts data from uploaded receipts and categorizes it intelligently using AWS services.

What it does

ReceiptSense lets users upload images of receipts and:

  • Automatically extracts details using Amazon Textract.
  • Normalizes vendor names with Amazon Bedrock (Claude 3).
  • Classifies vendor categories (e.g., groceries, fuel, dining).
  • Saves and displays all receipts in a searchable, filterable table.
  • Visualizes monthly spending with a dynamic pie chart.
  • Exports all data as a CSV file.
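To make the pie chart and CSV export features concrete, here is a minimal sketch of how per-category totals and a CSV string could be produced from stored receipt records. The record keys (date, vendor, category, total) and both function names are assumptions for illustration, not the app's actual schema or code.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, List, Sequence


def category_totals(records: Iterable[dict]) -> Dict[str, Decimal]:
    """Sum the 'total' of each record per 'category' (hypothetical keys)."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for record in records:
        # Decimal avoids float rounding drift when adding currency amounts.
        totals[record["category"]] += Decimal(str(record["total"]))
    return dict(totals)


def to_csv(
    records: List[dict],
    columns: Sequence[str] = ("date", "vendor", "category", "total"),
) -> str:
    """Serialize records to CSV text, ignoring keys not listed in columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

The totals feed a chart directly as label/value pairs, and the CSV text could be written to S3 and shared through a download link.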

How I built it

I used the AWS SAM framework to build and deploy the backend, which includes:

Lambda Functions:

  • processor.py – Handles Textract results, normalization, classification.
  • presign.py – Generates pre-signed S3 URLs.
  • query.py – Filters records by vendor and month.
  • export_csv.py – Generates downloadable CSV links.
  • delete_receipt.py – Deletes records from DynamoDB and S3.
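As a rough illustration of the kind of work processor.py does, the sketch below pulls a few fields out of a Textract response and applies fallbacks for missing ones. It assumes the response has the shape returned by Textract's AnalyzeExpense operation (ExpenseDocuments containing SummaryFields); the writeup does not say which Textract API the app calls. The function names and the fallback values are hypothetical.

```python
from typing import Any, Dict, Optional

# Assumed mapping from our own field names to AnalyzeExpense summary field types.
FIELD_TYPES = {
    "vendor": "VENDOR_NAME",
    "total": "TOTAL",
    "date": "INVOICE_RECEIPT_DATE",
}


def extract_summary_fields(response: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Return vendor, total, and date, or None for any field not detected."""
    found: Dict[str, str] = {}
    for document in response.get("ExpenseDocuments", []):
        for field in document.get("SummaryFields", []):
            field_type = field.get("Type", {}).get("Text")
            value = field.get("ValueDetection", {}).get("Text", "").strip()
            # Keep the first non-empty value seen for each field type.
            if field_type and value and field_type not in found:
                found[field_type] = value
    return {name: found.get(ftype) for name, ftype in FIELD_TYPES.items()}


def with_fallbacks(fields: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Fill in placeholder values (hypothetical choices) for missing fields."""
    return {
        "vendor": fields["vendor"] or "UNKNOWN",
        "total": fields["total"] or "0.00",
        "date": fields["date"],  # left as None so the caller can decide
    }
```

Keeping extraction and fallbacks in separate functions makes it easy to test each against saved Textract responses without calling AWS.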

AWS Services:

  • AWS Lambda – Core logic and serverless functions.
  • Amazon Textract – Extracts data from receipt images.
  • Amazon Bedrock – Uses Claude 3 to normalize vendor names and classify categories.
  • Amazon DynamoDB – Stores receipt records.
  • Amazon S3 – Stores uploaded receipt files and the web frontend.
  • Amazon API Gateway – Triggers Lambda for API endpoints.
  • CloudWatch – Logging and debugging.

The frontend is a static S3 website using HTML, CSS, and JavaScript.

Challenges I ran into

  • Textract Output Parsing: Had to navigate inconsistent results and build fallback logic for missing fields.
  • Vendor Normalization: Many receipts listed the same vendor in different formats (e.g., “COSTCO” and “COSTCO\nWHOLESALE”); I solved this with Claude and regex.
  • Filtering by Month: DynamoDB queries efficiently only on key attributes and doesn’t support arbitrary ad hoc queries, so I switched to in-memory filtering in Lambda.
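The regex side of vendor normalization and the in-memory month filter might look roughly like the sketch below. The alias table, the function names, and the assumption that dates are stored as ISO strings (YYYY-MM-DD) are all illustrative; in the app itself, Claude handles the cases that simple rules cannot.

```python
import re
import string
from typing import Dict, List

# Hypothetical alias table mapping cleaned names to a canonical vendor name.
VENDOR_ALIASES: Dict[str, str] = {
    "COSTCO WHOLESALE": "Costco",
    "COSTCO": "Costco",
}


def normalize_vendor(raw: str) -> str:
    """Collapse whitespace, drop store numbers, and apply known aliases."""
    # Newlines and repeated spaces become a single space.
    cleaned = re.sub(r"\s+", " ", raw).strip().upper()
    # Remove a trailing store number such as "#482".
    cleaned = re.sub(r"\s*#\s*\d+$", "", cleaned)
    # Unknown vendors fall back to simple word capitalization.
    return VENDOR_ALIASES.get(cleaned, string.capwords(cleaned))


def filter_by_month(records: List[dict], month: str) -> List[dict]:
    """Keep records whose ISO date string starts with 'YYYY-MM'."""
    return [r for r in records if (r.get("date") or "").startswith(month)]
```

For example, normalize_vendor("COSTCO\nWHOLESALE") and normalize_vendor("costco wholesale #482") both resolve to the same canonical name. In-memory filtering like this is simple but reads every record first, so it suits small per-user datasets.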

Accomplishments that I'm proud of

  • Built a full-featured serverless receipt manager with real-time AI categorization.
  • Implemented a clean, minimal UI with useful features like CSV export, delete, filters, and graphs.
  • Everything is hosted serverlessly: the frontend on S3 and the backend on Lambda, with zero active servers.

What I learned

  • How to integrate Amazon Bedrock with Claude 3 for text processing tasks like normalization and classification.
  • The power (and quirks) of DynamoDB querying and Lambda function chaining.
  • Managing async operations and modal UI updates in a minimalist JavaScript frontend.

What's next for ReceiptSense

  • User Authentication – Add Cognito or Google Sign-In to support multiple users.
  • Recurring Spend Insights – Use ML to highlight recurring monthly charges or spending anomalies.
  • Claude Prompt Tuning – Make categorization more robust using examples and system prompts.