The project was inspired by the large pricing gap between consumer-grade and enterprise-grade Excel automation tools. Most enterprise solutions charge over $2,000 per seat, leaving no accessible option for individuals or small teams. To solve this, I built my own Excel generation model that delivers enterprise-level precision at consumer-level cost.

At its core, the system uses a token-efficient embedding and agent architecture for spreadsheet understanding. The model compresses large workbooks into structured schema summaries (rows, columns, formulas, and dependencies), which lets the generator reason over the data with minimal tokens. This design keeps generation cost-effective while preserving accuracy and contextual awareness.
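To make the compression idea concrete, here is a minimal sketch of how a sheet could be reduced to a per-column summary. It is an illustration under my own assumptions, not the project's actual schema: the names `ColumnSummary`, `summarize_sheet`, and `render`, and the choice of "kind" labels, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ColumnSummary:
    letter: str    # Excel column letter, e.g. "A"
    header: str    # text of the header cell
    kind: str      # "number", "text", "formula", "mixed", or "empty"
    n_values: int  # count of non-empty body cells


def column_letter(index: int) -> str:
    """Convert a 0-based column index to an Excel letter (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def classify(value) -> str:
    """Label a single cell value (hypothetical three-way classification)."""
    if isinstance(value, str) and value.startswith("="):
        return "formula"
    if isinstance(value, (int, float)):
        return "number"
    return "text"


def summarize_sheet(rows):
    """Summarize a sheet given as a list of rows; the first row is the header."""
    header, body = rows[0], rows[1:]
    summaries = []
    for i, name in enumerate(header):
        values = [r[i] for r in body if i < len(r) and r[i] is not None]
        kinds = {classify(v) for v in values}
        if not kinds:
            kind = "empty"
        elif len(kinds) == 1:
            kind = next(iter(kinds))
        else:
            kind = "mixed"
        summaries.append(ColumnSummary(column_letter(i), str(name), kind, len(values)))
    return summaries


def render(summaries) -> str:
    """Render the summary as one compact line suitable for a prompt."""
    return "; ".join(f"{s.letter}:{s.header}[{s.kind} x{s.n_values}]" for s in summaries)


rows = [
    ["Item", "Qty", "Price", "Total"],
    ["A", 2, 1.5, "=B2*C2"],
    ["B", 3, 2.0, "=B3*C3"],
]
summary_line = render(summarize_sheet(rows))
```

The point of the sketch is that the prompt grows with the number of columns rather than the number of cells, which is where the token savings would come from.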

The generation process runs in multiple deterministic passes:

1. Schema extraction and summarization

2. Block-level fetch planning

3. Recipe generation with validation of references and formulas

Running the passes in a fixed order keeps cell-level operations correct and results reproducible, even in complex financial models.
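As an illustration of the validation step in the last pass, the sketch below checks that every cell reference in a generated formula falls inside the sheet's known used range. The recipe format (a list of dicts with `cell` and `formula` keys) and the function names are assumptions made for this example, not the project's real data model.

```python
import re

# Matches simple A1-style references such as B2, $B$2, or AA10.
# Limitation of this sketch: sheet-qualified references and named ranges
# are not handled, and a function name ending in digits (e.g. LOG10)
# would be misread as a cell reference.
CELL_REF = re.compile(r"\$?([A-Z]{1,3})\$?([0-9]+)")


def col_to_index(letters: str) -> int:
    """Convert an Excel column letter to a 1-based index (A -> 1, AA -> 27)."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def invalid_refs(formula: str, max_col: int, max_row: int):
    """Return the references in `formula` that fall outside the used range."""
    bad = []
    for col, row in CELL_REF.findall(formula):
        if col_to_index(col) > max_col or not (1 <= int(row) <= max_row):
            bad.append(col + row)
    return bad


def validate_recipe(recipe, max_col: int, max_row: int):
    """Map each target cell to its out-of-range references (empty dict = valid)."""
    errors = {}
    for step in recipe:
        bad = invalid_refs(step["formula"], max_col, max_row)
        if bad:
            errors[step["cell"]] = bad
    return errors


recipe = [
    {"cell": "D2", "formula": "=B2*C2"},
    {"cell": "D3", "formula": "=SUM(B2:B3)+Z99"},
]
# Assume the used range is A1:D10 (4 columns, 10 rows).
errors = validate_recipe(recipe, max_col=4, max_row=10)
```

Because the check is pure string and integer logic, it gives the same answer on every run, which fits the deterministic-pass design described above.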

Technical Challenges

Office.js Integration: Creating a stable Excel add-in that bridges local workbooks with a FastAPI backend.

Schema Compression: Designing a compact yet expressive schema to represent thousands of cells.

Structural Anchoring: Implementing a method that detects contextual deltas between cells to identify key “anchors,” improving context relevance and dependency tracking.
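One possible reading of the anchoring idea, sketched very loosely: walk down a column and mark a cell as an anchor whenever its kind differs from the cell before it, since those transitions tend to mark headers, data blocks, and total rows. The `kind` labels and the `find_anchors` function are hypothetical, and the real delta measure may well be richer than a type change.

```python
def kind(value) -> str:
    """Classify a cell value into a coarse kind (assumed labels)."""
    if value is None or value == "":
        return "empty"
    if isinstance(value, str) and value.startswith("="):
        return "formula"
    if isinstance(value, (int, float)):
        return "number"
    return "text"


def find_anchors(column):
    """Return (index, kind) for each non-empty cell whose kind differs from the previous cell."""
    anchors = []
    prev = "empty"
    for i, value in enumerate(column):
        current = kind(value)
        if current != prev and current != "empty":
            anchors.append((i, current))
        prev = current
    return anchors


column = ["Revenue", 100, 120, 130, "=SUM(A2:A4)", None, "Costs", 40, 50]
anchors = find_anchors(column)
```

In this toy column the anchors land on the two section labels, the start of each numeric block, and the total formula, which are the cells a generator would most likely need as context.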

What I Learned

Building this system taught me how to balance semantic compression, deterministic execution, and real-time LLM orchestration. These are the key ingredients for bringing high-fidelity Excel intelligence to consumer users.

Built With

  • excel-javascript-api
  • fastapi
  • langchain
  • langgraph
  • office.js
  • openai-gpt-5
  • pydantic
  • react
  • vite