The Autonomous AI Agent

LGTM: Legal Governance & Trust Monitor

Inspiration

In the fast-paced world of modern software development, legal compliance is often an afterthought—until it's too late. We've seen brilliant projects derailed by accidental "License Bombs" (like a GPL-3.0 header in a proprietary repo) or unintended PII leaks. We were inspired to build a tool that makes legal safety as automatic as a unit test. We wanted to transform "LGTM" from a casual "Looks Good To Me" into a verified, data-driven "Legal Governance & Trust Monitor."

What it does

LGTM is an autonomous AI Auditor that lives in your GitLab environment. Every time a Merge Request opens, the LGTM agent springs into action:

Scans Diffs: Intelligently filters out binary noise to find source code changes.
Identifies Risks: Detects copyleft licenses, hardcoded secrets, and PII using Context-Injected RAG.
Rule-Cite Library: Cites a versioned library of 9 professional legal briefs (RC-01 to RC-09) to ground its advice in actual law, not hallucinations.
Calculates Risk Scores: Uses a weighted probability model to determine a final compliance score.
Autonomous Remediation: When high-risk violations are found, LGTM doesn't just flag them—it opens a "Remediation Proposal" MR suggesting a compliant code swap.
Legal Memory (Audit Trail): Persists every decision as a signed, queryable record in .lgtm/records/, ensuring a legally defensible audit trail for every release.
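
The write-up does not say how records are signed, so the following is only a minimal sketch of one way a tamper-evident record could be produced, assuming an HMAC-SHA256 signature over canonical JSON. The function names, the record fields (`mr_iid`, `rule`, `score`), and the file naming are all hypothetical; only the `.lgtm/records/` directory comes from the description above.

```python
import hashlib
import hmac
import json
from pathlib import Path


def sign_record(record: dict, key: bytes) -> dict:
    """Return a copy of `record` with an HMAC-SHA256 signature attached."""
    # Canonical JSON (sorted keys, fixed separators) so equal content signs identically.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {**record, "signature": signature}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the signature over everything except `signature` and compare."""
    body = {k: v for k, v in signed.items() if k != "signature"}
    expected = sign_record(body, key)["signature"]
    return hmac.compare_digest(expected, str(signed.get("signature", "")))


def write_record(signed: dict, repo_root: Path) -> Path:
    """Persist a signed record under .lgtm/records/ (file naming is an assumption)."""
    records_dir = repo_root / ".lgtm" / "records"
    records_dir.mkdir(parents=True, exist_ok=True)
    path = records_dir / f"mr-{signed['mr_iid']}.json"
    path.write_text(json.dumps(signed, indent=2, sort_keys=True), encoding="utf-8")
    return path


# Hypothetical record; "RC-03" is used purely as a placeholder rule ID.
example = sign_record({"mr_iid": 7, "rule": "RC-03", "score": 80}, key=b"demo-key")
```

A shared-secret HMAC only proves that a record was not altered by someone without the key. A record meant to be checked by third parties would more likely need an asymmetric signature.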

How we built it

We built LGTM using a modern, agentic stack:

AI Brain: Gemini 2.5 Flash provided the high-speed reasoning required for complex legal interpretation.
Dual-Retrieval RAG: We implemented a custom RAG architecture that retrieves both Public Briefs (Rule-Cite) and Project Precedents (Legal Memory) before delivering counsel.
GitLab Integration: Built a robust REST API toolset for direct interaction with GitLab Merge Requests, Diffs, and Issues.
Smart Filtering: Implemented a keyword-based priority scan to handle large repositories with 600+ file changes.
Math Modeling: Our Risk Score ($R$) sums a confidence-weighted severity for each detected finding and caps the total at 100: $$R = \min(100, \sum_{i=1}^{n} w_i \cdot c_i)$$ where $w_i$ is the severity weight and $c_i$ is the detection confidence of the $i$-th finding.
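
To make the risk formula concrete, here is a small worked example. The category names, the weights, and the assumption that confidences lie in [0, 1] are illustrative choices, not values taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    category: str
    weight: float      # w_i: severity weight (illustrative value)
    confidence: float  # c_i: detection confidence, assumed to lie in [0, 1]


def risk_score(findings: list[Finding]) -> float:
    """R = min(100, sum of w_i * c_i over all findings)."""
    total = sum((f.weight * f.confidence for f in findings), 0.0)
    return min(100.0, total)


findings = [
    Finding("copyleft-license", weight=60, confidence=0.9),  # contributes 54.0
    Finding("hardcoded-secret", weight=50, confidence=0.8),  # contributes 40.0
    Finding("pii", weight=30, confidence=0.5),               # contributes 15.0
]
print(risk_score(findings))  # raw sum is 109.0, so the cap yields 100.0
```

The cap means that several medium findings can saturate the score just as a single severe one would, so a report would probably also need to list the individual contributions.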

Challenges we ran into

The biggest technical hurdle was the "GitLab Access Barrier." We initially planned to use a beta MCP endpoint, but we hit persistent 403 Forbidden errors. Instead of giving up, we pivoted in real time and built a direct REST API fallback that ended up being faster and more flexible.

We also faced the "Noise Problem": GitLab Merge Requests in real-world repos are often cluttered with IDE metadata and binary artifacts. We solved this by building a multi-stage filtering pipeline that lets the AI ignore the "garbage" and focus on the code that matters.
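
A multi-stage filter of this kind might look roughly like the sketch below. It assumes each change is a dict with `new_path` and `diff` keys, as in the entries GitLab returns for a Merge Request's changes. The deny-lists, the function names, and the order of the stages are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical deny-lists; a real deployment would tune these per repository.
BINARY_EXTENSIONS = {".png", ".jpg", ".gif", ".pdf", ".zip", ".jar", ".exe", ".so", ".class"}
IDE_DIRECTORIES = {".idea", ".vscode", ".settings"}
IDE_FILENAMES = {".DS_Store", "Thumbs.db", ".project", ".classpath"}


def is_ide_metadata(path: str) -> bool:
    """True if the file is a known IDE/OS artifact or lives in an IDE directory."""
    p = PurePosixPath(path)
    return p.name in IDE_FILENAMES or any(part in IDE_DIRECTORIES for part in p.parts)


def is_binary(change: dict) -> bool:
    """Heuristic: binary by file extension, or a NUL byte in the diff text."""
    ext = PurePosixPath(change["new_path"]).suffix.lower()
    return ext in BINARY_EXTENSIONS or "\x00" in change.get("diff", "")


def filter_changes(changes: list[dict]) -> list[dict]:
    """Apply the stages in order and return only reviewable source changes."""
    stage1 = [c for c in changes if not is_ide_metadata(c["new_path"])]  # drop IDE noise
    stage2 = [c for c in stage1 if not is_binary(c)]                     # drop binaries
    stage3 = [c for c in stage2 if c.get("diff", "").strip()]            # drop empty diffs
    return stage3


sample = [
    {"new_path": "src/app.py", "diff": "@@ -1 +1 @@\n-a\n+b\n"},
    {"new_path": ".idea/workspace.xml", "diff": "@@ -1 +1 @@\n-x\n+y\n"},
    {"new_path": "assets/logo.png", "diff": ""},
    {"new_path": "docs/empty.md", "diff": ""},
]
kept = [c["new_path"] for c in filter_changes(sample)]
```

Keeping each stage as a separate pass makes it easy to log how many files every stage removed, which helps when tuning the deny-lists.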

Accomplishments that we're proud of

Real-time Pivot: Turning a blocked integration into a working, custom REST toolset in just a few hours.
Accuracy: The agent doesn't just find keywords; it understands the implications of a license, such as the copyleft risks of GPL-3.0.
Defensibility: Every report includes a "signed" reasoning record, turning a technical lint check into a legally defensible audit record.
Seamless Integration: The reports look like they were written by a human legal expert, yet they appear seconds after a push.

What we learned

We learned that context is king. An AI that sees every file in a repository is overwhelmed; an AI that sees only the right files is a genius. Refining our filtering logic taught us how to fill the LLM context window with as much signal and as little noise as possible. We also learned that the most valuable AI tools are the ones that work within existing developer workflows (like the MR comment section) rather than forcing developers into a new dashboard.
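
One way to picture the "right files only" idea is a keyword-priority ranking combined with a size budget, sketched below. The keyword list, the character-based budget, and the greedy selection are assumptions made for illustration and may differ from what LGTM does.

```python
# Hypothetical keywords that hint at license, secret, or PII risk.
PRIORITY_KEYWORDS = ("license", "copyright", "gpl", "api_key", "secret", "password", "token")


def priority(change: dict) -> int:
    """Count keyword occurrences in the file path and diff text (case-insensitive)."""
    text = (change["new_path"] + "\n" + change.get("diff", "")).lower()
    return sum(text.count(keyword) for keyword in PRIORITY_KEYWORDS)


def select_for_context(changes: list[dict], max_chars: int) -> list[dict]:
    """Greedily keep the highest-priority diffs that fit within a character budget."""
    ranked = sorted(changes, key=priority, reverse=True)  # stable for equal priorities
    selected, used = [], 0
    for change in ranked:
        size = len(change.get("diff", ""))
        if used + size > max_chars:
            continue  # skip diffs that would overflow the budget
        selected.append(change)
        used += size
    return selected


candidates = [
    {"new_path": "README.md", "diff": "x" * 50},
    {"new_path": "LICENSE", "diff": "GPL " * 10},
    {"new_path": "config.py", "diff": "API_KEY = 'abc'"},
]
chosen = [c["new_path"] for c in select_for_context(candidates, max_chars=60)]
```

A character budget is only a rough proxy for tokens. A production version would probably count tokens with the model's own tokenizer.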

What's next for LGTM (Legal Governance & Trust Monitor)

Multi-Cloud Deployment: Moving from local execution to a fully managed Google Cloud Run webhook service.
Semantic License Search: Moving beyond keywords to identify "copied-from-Stack-Overflow" code that might carry hidden license obligations.
Interactive Remediation: Allowing developers to chat with the LGTM agent directly in the MR comments to ask for advice on how to fix a violation.
Enterprise Dashboard: A centralized view for Legal teams to monitor compliance across thousands of repositories in real-time.

LGTM: Making legal compliance as simple as a git push.

Built With

docker
dotenv
fastapi
firestore-scaffolded
gemini-2.5-flash
gitlab-rest-api
google-cloud-run
google-cloud
google-genai-sdk
httpx
javascript
python
rag
react+vite-dashboard
vertex-ai
yaml

Updates

Fadi Abbas started this project — May 13, 2026 04:47 PM EDT
