LGTM: Legal Governance & Trust Monitor
Inspiration
In the fast-paced world of modern software development, legal compliance is often an afterthought—until it's too late. We've seen brilliant projects derailed by accidental "License Bombs" (like a GPL-3.0 header in a proprietary repo) or unintended PII leaks. We were inspired to build a tool that makes legal safety as automatic as a unit test. We wanted to transform "LGTM" from a casual "Looks Good To Me" into a verified, data-driven "Legal Governance & Trust Monitor."
What it does
LGTM is an autonomous AI Auditor that lives in your GitLab environment. Every time a Merge Request opens, the LGTM agent springs into action:
- Scans Diffs: Intelligently filters out binary noise to find source code changes.
- Identifies Risks: Detects copyleft licenses, hardcoded secrets, and PII using Context-Injected RAG.
- Rule-Cite Library: Cites a versioned library of 9 professional legal briefs (RC-01 to RC-09) to ground its advice in actual law, not hallucinations.
- Calculates Risk Scores: Uses a weighted probability model to determine a final compliance score.
- Autonomous Remediation: When high-risk violations are found, LGTM doesn't just flag them—it opens a "Remediation Proposal" MR suggesting a compliant code swap.
- Legal Memory (Audit Trail): Persists every decision as a signed, queryable record in .lgtm/records/, ensuring a legally defensible audit trail for every release.
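As a rough illustration of the Legal Memory idea, here is a minimal sketch of signing and persisting an audit record. The write-up does not specify the signing scheme or record layout, so HMAC-SHA256, the field names (`mr_iid`, `rule`, `score`), and the `mr-<iid>.json` file name are all assumptions made for this example; only the `.lgtm/records/` directory comes from the description above.

```python
import hashlib
import hmac
import json
from pathlib import Path


def sign_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with an HMAC-SHA256 signature attached (assumed scheme)."""
    # Canonical JSON (sorted keys, fixed separators) so equal content always signs identically.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {**record, "signature": signature}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the signature over everything except the signature field and compare."""
    unsigned = {k: v for k, v in signed.items() if k != "signature"}
    expected = sign_record(unsigned, key)["signature"]
    return hmac.compare_digest(expected, signed.get("signature", ""))


def write_record(signed: dict, root: Path) -> Path:
    """Persist a signed record under <root>/.lgtm/records/ (file name is hypothetical)."""
    records_dir = root / ".lgtm" / "records"
    records_dir.mkdir(parents=True, exist_ok=True)
    path = records_dir / f"mr-{signed['mr_iid']}.json"
    path.write_text(json.dumps(signed, indent=2, sort_keys=True), encoding="utf-8")
    return path
```

Because each record is plain JSON on disk, it stays queryable with ordinary tools, and any later edit to a stored field makes `verify_record` return `False`.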
How we built it
We built LGTM using a modern, agentic stack:
- AI Brain: Gemini 2.5 Flash provided the high-speed reasoning required for complex legal interpretation.
- Dual-Retrieval RAG: We implemented a custom RAG architecture that retrieves both Public Briefs (Rule-Cite) and Project Precedents (Legal Memory) before delivering counsel.
- GitLab Integration: Built a robust REST API toolset for direct interaction with GitLab Merge Requests, Diffs, and Issues.
- Smart Filtering: Implemented a keyword-based priority scan to handle large repositories with 600+ file changes.
- Math Modeling: Our Risk Score ($R$) is calculated based on weighted categories: $$R = \min(100, \sum_{i=1}^{n} w_i \cdot c_i)$$ where $w_i$ is the severity weight and $c_i$ is the confidence of detection.
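The risk score formula above can be sketched in a few lines of Python. The `Finding` class, the example weights, and the assumption that each confidence $c_i$ lies in $[0, 1]$ are illustrative choices, not values taken from the project.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Finding:
    category: str      # e.g. "copyleft", "secret", "pii" (hypothetical labels)
    weight: float      # w_i: severity weight for the category
    confidence: float  # c_i: detection confidence, assumed to lie in [0, 1]


def risk_score(findings: Iterable[Finding]) -> float:
    """R = min(100, sum of w_i * c_i over all findings)."""
    total = sum(f.weight * f.confidence for f in findings)
    return min(100.0, total)


# Example with made-up weights: 60 * 0.9 + 40 * 0.5 = 74
example = [
    Finding("copyleft", weight=60, confidence=0.9),
    Finding("pii", weight=40, confidence=0.5),
]
score = risk_score(example)
```

The `min(100, ...)` cap keeps the score on a fixed 0 to 100 scale even when several severe findings stack up in a single Merge Request.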
Challenges we ran into
The biggest technical hurdle was the "GitLab Access Barrier." We initially planned to use a beta MCP endpoint, but encountered persistent 403 Forbidden errors. Instead of giving up, we pivoted in real time and built a comprehensive direct-API fallback that ended up being faster and more flexible. We also faced the "Noise Problem": GitLab Merge Requests in real-world repos are often cluttered with IDE metadata and binary artifacts. We solved this by building a multi-stage filtering pipeline that lets the AI ignore the "garbage" and focus on the code that matters.
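A multi-stage filter of the kind described here might look like the sketch below. It assumes each change is a dict with `new_path` and `diff` keys, which mirrors the shape of entries in GitLab's Merge Request diff responses; the suffix and directory lists and the three stages are hypothetical choices for illustration, not the project's actual rules.

```python
from pathlib import PurePosixPath

# Hypothetical noise lists; a real deployment would tune these per repository.
BINARY_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".pdf", ".zip", ".jar", ".so", ".dll", ".exe"}
NOISE_DIRS = {".idea", ".vscode", "node_modules", "dist", "build", "__pycache__"}


def is_noise_path(path: str) -> bool:
    """True for binary-looking suffixes or paths inside IDE/build directories."""
    p = PurePosixPath(path)
    if p.suffix.lower() in BINARY_SUFFIXES:
        return True
    return any(part in NOISE_DIRS for part in p.parts)


def filter_changes(changes: list[dict]) -> list[dict]:
    """Keep only changes that look like reviewable source code."""
    kept = []
    for change in changes:
        path = change.get("new_path", "")
        diff = change.get("diff", "")
        # Stage 1: drop by path (IDE metadata, build output, binary suffixes).
        if not path or is_noise_path(path):
            continue
        # Stage 2: drop entries with no textual diff to inspect.
        if not diff.strip():
            continue
        # Stage 3: drop diffs containing NUL bytes, a common sign of binary content.
        if "\x00" in diff:
            continue
        kept.append(change)
    return kept
```

Running the cheap path checks first means most noise is discarded before any diff text is examined, which matters when a Merge Request touches hundreds of files.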
Accomplishments that we're proud of
- Real-time Pivot: Turning a blocked integration into a working, custom REST toolset in just a few hours.
- Accuracy: The agent doesn't just find keywords; it understands the implications of a license, such as the copyleft risks of GPL-3.0.
- Defensibility: Every report includes a "signed" reasoning record, turning a technical lint check into a legally defensible audit record.
- Seamless Integration: The reports look like they were written by a human legal expert, yet they appear seconds after a push.
What we learned
We learned that context is king. An AI that sees every file in a repository is overwhelmed; an AI that sees only the right files is a genius. Refining our filtering logic taught us how to optimize LLM context windows for maximum signal-to-noise ratio. We also learned that the most valuable AI tools are the ones that work within existing developer workflows (like the MR comment section) rather than forcing developers into a new dashboard.
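One way to picture the context-window lesson is a keyword-based priority scan that ranks changes and then fills a fixed budget. The keyword table, its weights, and the use of a character count as a stand-in for tokens are all assumptions made for this sketch.

```python
# Hypothetical keyword weights; higher means more likely to matter for a legal review.
PRIORITY_KEYWORDS = {
    "license": 3, "copyright": 3, "gpl": 3,
    "api_key": 2, "password": 2, "secret": 2,
    "email": 1, "ssn": 1,
}


def priority(change: dict) -> int:
    """Sum the weights of every keyword found in the path or diff text."""
    text = (change.get("new_path", "") + "\n" + change.get("diff", "")).lower()
    return sum(weight for keyword, weight in PRIORITY_KEYWORDS.items() if keyword in text)


def select_for_context(changes: list[dict], max_chars: int) -> list[dict]:
    """Greedily pick the highest-priority changes that fit in a character budget."""
    ranked = sorted(changes, key=priority, reverse=True)  # stable: ties keep input order
    selected, used = [], 0
    for change in ranked:
        size = len(change.get("diff", ""))
        if used + size > max_chars:
            continue  # too large for the remaining budget; try smaller changes
        selected.append(change)
        used += size
    return selected
```

With a tight budget, a `LICENSE` change mentioning the GPL is selected ahead of an unrelated README edit, so the model spends its context on the files most likely to carry risk.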
What's next for LGTM (Legal Governance & Trust Monitor)
- Multi-Cloud Deployment: Moving from local execution to a fully managed Google Cloud Run webhook service.
- Semantic License Search: Moving beyond keywords to identify "copied-from-Stack-Overflow" code that might carry hidden license obligations.
- Interactive Remediation: Allowing developers to chat with the LGTM agent directly in the MR comments to ask for advice on how to fix a violation.
- Enterprise Dashboard: A centralized view for Legal teams to monitor compliance across thousands of repositories in real-time.
LGTM: Making legal compliance as simple as a git push.
Built With
- docker
- dotenv
- fastapi
- firestore-scaffolded
- gemini-2.5-flash
- gitlab-rest-api
- google-cloud-run
- google-cloud
- google-genai-sdk
- httpx
- javascript
- python
- rag
- react+vite-dashboard
- vertex-ai
- yaml
