About the Project

Inspiration

This project was inspired by a very practical problem: searching large, confidential technical documents.
Engineering standards and specifications often span hundreds or thousands of pages, and finding the exact clause that answers a question can be slow, error-prone, and frustrating. Existing “chat with your document” tools are usually cloud-based, opaque, and unsafe for sensitive material. I wanted a system that works entirely locally, prioritizes traceability over fluency, and behaves more like an internal engineering tool than a chatbot.


What I Built

I built a fully on-prem Retrieval-Augmented Generation (RAG) system for mechanical engineering specifications.

At a high level, the system:

  • Parses structured PDFs into hierarchical sections and paragraph-level clauses
  • Indexes the content using hybrid retrieval:
    • BM25 keyword search via SQLite FTS5
    • Semantic vector search using sentence embeddings
  • Returns exact source paragraphs with section paths and page numbers
  • Optionally uses a local LLM (Llama 3 via Ollama) to:
    • Suggest improved search queries
    • Generate strictly grounded summaries with explicit citations

The frontend exposes this through a simple search interface with buttons for Search, Suggest, and Summary, plus direct links to open the source PDF at the correct page.


What I Learned

This project taught me several important lessons:

  • Parsing matters more than models: Retrieval quality depended far more on correct paragraph reconstruction, section hierarchy, and noise removal than on LLM choice.
  • Vector search alone is not enough for standards and specifications; exact terms, numbering, and “shall” clauses require traditional keyword search.
  • Grounding and citations are critical for trust. A system that can say “not found in the document” is more valuable than one that always answers.
  • Running LLMs locally introduces operational concerns (process lifecycle, availability, graceful degradation) that cloud APIs hide.
  • Simple, explicit system design often beats complex “agentic” abstractions for reliability and auditability.
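The "graceful degradation" lesson above can be made concrete with a small availability probe. Ollama's local server does expose a `GET /api/tags` endpoint on its default port, but the function names, response shape, and fallback policy below are illustrative assumptions, not the project's exact code.

```python
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address

def llm_available(timeout=1.0):
    """Probe Ollama's /api/tags endpoint; return False instead of raising."""
    try:
        with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False

def summarize_or_fallback(clauses, call_llm=None):
    """Return an LLM summary when possible; otherwise degrade gracefully
    to the raw retrieved clauses so search keeps working offline."""
    if call_llm is None or not llm_available():
        return {"summary": None,
                "clauses": clauses,
                "note": "LLM offline; showing retrieved clauses only"}
    return {"summary": call_llm(clauses), "clauses": clauses}
```

The key design point is that retrieval never depends on the LLM: when the model is down, users still get exact clauses with citations, just no summary.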

Challenges

Some of the main challenges included:

  • PDF structure variability: Headings, wrapped lines, headers, and footers required custom heuristics to avoid polluting the text.
  • FTS5 integration details: Correctly joining full-text search results back to the original rows via rowid was subtle but essential.
  • Preventing hallucinations: The LLM had to be constrained to only summarize retrieved content and fail safely when evidence was missing.
  • Local LLM availability: The system needed to function even when the LLM was offline, requiring clear fallback paths.
  • End-to-end integration: Connecting parsing, indexing, retrieval, LLM inference, API endpoints, and frontend interactions into a coherent, demoable system. The frontend was built with the Gemini 3 API in Gemini Studio.
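The FTS5 rowid join mentioned above looks roughly like this: the FTS5 virtual table is populated with an explicit `rowid` matching the source table's primary key, so BM25-ranked hits can be joined back to their section paths and page numbers. Table names, columns, and sample rows here are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clauses(
        id INTEGER PRIMARY KEY, section TEXT, page INTEGER, body TEXT);
    CREATE VIRTUAL TABLE clauses_fts USING fts5(body);
    INSERT INTO clauses(section, page, body) VALUES
        ('4.2.1', 37, 'Pressure relief valves shall be tested annually.'),
        ('4.2.2', 38, 'Gaskets may be reused if undamaged.');
    -- Mirror each clause into FTS5, keeping rowid aligned with clauses.id
    INSERT INTO clauses_fts(rowid, body) SELECT id, body FROM clauses;
""")

def keyword_search(query, limit=10):
    """Join FTS5 hits back to the source rows via rowid, ranked by BM25."""
    return conn.execute(
        """SELECT c.section, c.page, c.body
           FROM clauses_fts
           JOIN clauses c ON c.id = clauses_fts.rowid
           WHERE clauses_fts MATCH ?
           ORDER BY bm25(clauses_fts)
           LIMIT ?""",
        (query, limit),
    ).fetchall()
```

The subtlety is that FTS5 only knows its own `rowid`; keeping it identical to the source table's primary key is what makes the join back to section paths and page numbers trivial.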

Outcome

The result is a compact but realistic internal tool: a private, auditable search engine for technical standards.
It demonstrates how modern retrieval and language models can be applied responsibly in engineering and compliance contexts—without sacrificing security, correctness, or traceability.
