Inspiration
Software contracts are broken.
Every year, companies lose billions on scope creep because the Sales team promised X, the Legal team wrote Y, and the Engineering team built Z. We realized that humans are terrible at cross-referencing 200-page PDFs against technical specifications. We asked: What if we could deploy a "Shadow Council" of AI agents to read every line, find every loophole, and negotiate the fix before a single line of code is written?
What it does
SpecGap is an AI-powered "Shadow Council" for contract and specification review. Instead of a single LLM summary, it simulates a multi-round deliberation between three specialized agents:
- The Legal Hawk: Scans for liability traps and IP risks.
- The Tech Lead: Checks formatting, feasibility, and architectural gaps.
- The CFO: Analyzes financial leverage and pricing models.
Users upload a PDF/DOCX (tech spec, proposal, or contract), and SpecGap delivers:
- Conflict Detection: Finds where the "Proposal" contradicts the "Tech Spec".
- Gap Analysis: Highlights missing requirements (e.g., "Security section missing in SLA").
- The "Patch Pack": Automatically generates legal addendums, spec updates, and negotiation emails to fix the problems it found.
How we built it
We built a sophisticated Backend-as-a-Service (BaaS) using FastAPI and Google Gemini 3 Flash:
Multi-Agent Workflow: We used LangGraph to engineer a 3-round consensus mechanism. Agents don't just output text; they "debate" (cross-check) each other’s findings to reduce hallucinations.
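To make the cross-checking idea concrete, here is a minimal, framework-free sketch of a three-round deliberation. It is not our LangGraph graph: the `StubAgent` class, the canned claims, and the "substring appears in the document" check are hypothetical stand-ins for real LLM calls, and the propose / review / keep-what-peers-endorse split of the rounds is an assumption made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


class StubAgent:
    """Hypothetical stand-in for an LLM-backed agent."""

    def __init__(self, name: str, claims: List[str]):
        self.name = name
        self.claims = claims  # canned output in place of a model response

    def propose(self, document: str) -> Set[str]:
        return set(self.claims)

    def endorses(self, document: str, claim: str) -> bool:
        # Stand-in verification: the claimed phrase must appear in the text.
        return claim.lower() in document.lower()


def deliberate(document: str, agents: List[StubAgent]) -> List[str]:
    # Round 1: every agent proposes findings independently.
    proposals: Dict[str, Set[str]] = {a.name: a.propose(document) for a in agents}

    # Round 2: each agent reviews the claims raised by its peers.
    endorsers: Dict[str, Set[str]] = defaultdict(set)
    for owner, claims in proposals.items():
        for claim in claims:
            for reviewer in agents:
                if reviewer.name != owner and reviewer.endorses(document, claim):
                    endorsers[claim].add(reviewer.name)

    # Round 3: keep only claims endorsed by a majority of the peer agents.
    needed = (len(agents) - 1) // 2 + 1
    return sorted(c for c, names in endorsers.items() if len(names) >= needed)
```

In this toy setup, a claim that no peer can verify against the document is dropped, which is the effect the "debate" is meant to have on hallucinated findings.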
Smart Document Processing:
- Chunking: We implemented a map-reduce algorithm to handle 200+ page documents, splitting by paragraph boundaries to preserve context.
- Sanitization: A custom security layer scrubs regex-matched prompt injection attacks (like "Ignore previous instructions") before the LLM ever sees the text.
- Robust parsing: We built a resilient JSON extractor that repairs malformed LLM outputs, ensuring the API never crashes on a bad generation.
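The sanitization and chunking steps above can be sketched roughly as follows. This is an illustrative approximation, not the project's actual modules: the pattern list, the `[REDACTED]` placeholder, the function names, and the `max_chars` budget (characters rather than tokens) are all assumptions.

```python
import re
from typing import List

# Hypothetical patterns; a real deployment would need a broader, maintained list.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.IGNORECASE),
    re.compile(r"disregard\s+(the\s+)?system\s+prompt", re.IGNORECASE),
]


def sanitize(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace regex-matched injection phrases before text reaches the LLM."""
    for pattern in INJECTION_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def chunk_by_paragraph(text: str, max_chars: int = 2000) -> List[str]:
    """Pack whole paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept intact as its own chunk
    rather than being split mid-paragraph.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Splitting only on blank lines means a clause is never cut in half, at the cost of uneven chunk sizes.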
Architecture: The specialized agents (Legal, Tech, Finance) run in parallel using asyncio for speed, with intelligent rate-limiting to handle API quotas gracefully.
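A minimal sketch of running agents concurrently under a concurrency cap, using only the standard library. The `run_agents` and `make_stub_agent` names and the `max_concurrent` value are hypothetical, and the stub agent stands in for a real model call.

```python
import asyncio
from typing import Awaitable, Callable, Dict

AgentFn = Callable[[str], Awaitable[str]]


def make_stub_agent(label: str) -> AgentFn:
    """Hypothetical agent; a real one would call the model API here."""
    async def agent(document: str) -> str:
        await asyncio.sleep(0)  # yield control, as a network call would
        return f"{label}: reviewed {len(document)} chars"
    return agent


async def run_agents(
    document: str, agents: Dict[str, AgentFn], max_concurrent: int = 2
) -> Dict[str, str]:
    # The semaphore caps how many agent calls are in flight at once.
    limiter = asyncio.Semaphore(max_concurrent)

    async def guarded(name: str, agent: AgentFn):
        async with limiter:
            return name, await agent(document)

    pairs = await asyncio.gather(*(guarded(n, a) for n, a in agents.items()))
    return dict(pairs)
```

Calling `asyncio.run(run_agents(text, {...}))` returns one result per agent name. Lowering `max_concurrent` trades speed for fewer simultaneous requests against the quota.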
Challenges we ran into
The Context Window Trap: Uploading a large government contract immediately hit token limits. We solved this by building a custom chunker.py that summarizes sections individually before merging them, maintaining global context without blowing up the context window.
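The summarize-then-merge idea can be sketched as a small map-reduce loop. This is an assumption-laden illustration rather than the contents of chunker.py: `map_reduce_summary`, `first_sentence`, and the `max_chars` budget are hypothetical, and `first_sentence` is a trivial stand-in for an LLM summarization call.

```python
from typing import Callable, List


def first_sentence(text: str) -> str:
    """Toy summarizer used in place of a model call."""
    return text.split(".")[0].strip() + "."


def map_reduce_summary(
    chunks: List[str], summarize: Callable[[str], str], max_chars: int = 4000
) -> str:
    # Map: summarize every chunk on its own.
    partials = [summarize(chunk) for chunk in chunks]
    merged = "\n".join(partials)

    # Reduce: while the merged text is over budget, summarize pairs of
    # partial summaries until it fits or only one summary remains.
    while len(merged) > max_chars and len(partials) > 1:
        grouped = ["\n".join(partials[i:i + 2]) for i in range(0, len(partials), 2)]
        partials = [summarize(group) for group in grouped]
        merged = "\n".join(partials)
    return merged
```

Each reduce pass roughly halves the number of partial summaries, so the loop always terminates, though detail is lost with every pass.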
The "Yes Man" Problem: Early versions of the AI just agreed with the contract. We had to prompt-engineer "adversarial personalities" (e.g., telling the Legal agent to be "paranoid about liability") to get useful critiques.
Rate Limiting: Managing concurrent agent execution on the Gemini API triggered 429 errors. We implemented an exponential backoff strategy and staggered agent execution to maximize throughput without hitting limits.
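A minimal sketch of exponential backoff with jitter. The `RateLimitError` class is a hypothetical placeholder for whatever exception the API client raises on a 429, and the retry count and delays are illustrative defaults, not the values we shipped. Staggered start times are not shown.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's HTTP 429 exception."""


async def call_with_backoff(
    call: Callable[[], Awaitable[T]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel agents do not retry in lockstep.
            await sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

The `sleep` parameter exists only so the waits can be replaced in tests. In normal use the default `asyncio.sleep` applies.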
Accomplishments that we're proud of
Security First: We built a dedicated sanitizer.py that neutralizes prompt injection attacks, a critical feature for enterprise adoption that most hackathon projects skip.
Self-Healing JSON: Our parsing logic can detect and fix common LLM formatting errors (like trailing commas), making the backend incredibly stable.
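As an illustration of the trailing-comma repair, here is a small sketch using only the standard library. The `extract_json` name and the exact repair steps are assumptions. It handles only surrounding chatter and trailing commas, not every malformed output.

```python
import json
import re
from typing import Any


def extract_json(raw: str) -> Any:
    """Pull the outermost JSON object out of an LLM reply and repair it."""
    # Drop any prose or code-fence markers around the object.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    candidate = raw[start:end + 1]

    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        # Remove commas that sit directly before a closing brace or bracket.
        # Caveat: this would also alter such a sequence inside a string value.
        repaired = re.sub(r",\s*([}\]])", r"\1", candidate)
        return json.loads(repaired)
```

If the repaired text still fails to parse, `json.JSONDecodeError` propagates, so a caller would need to catch it and retry or return an error response.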
The "Patch Pack": We didn't just want to find problems; we wanted to fix them. Generating the actual legal addendum text feels like magic.
What we learned
We learned that AI is better at finding gaps than humans, but worse at context. By forcing the AI to "roleplay" specific experts (Legal vs. Tech) and then cross-referencing their outputs, we achieved a much higher accuracy rate than a single "Summarize this" prompt. We also learned the hard way that efficient token management is the difference between a 10-second response and a crash.
What's next for SpecGap
Tinder for Contracts: Building a frontend where users can "Swipe Left" to reject a clause or "Swipe Right" to accept a fix.
JIRA Integration: Automatically turning "Tech Gaps" into JIRA tickets for the engineering team.
Enterprise Memory: Storing past audits so the AI learns a company's specific risk tolerance over time.