Inspiration
We’ve all clicked "I Agree" without reading the fine print. Companies know this and hide aggressive data policies behind dense "legalese" that acts as a barrier to understanding. We realized that simply summarizing a document isn't enough, because summaries often miss the nuance. We built LexiGuard to act as a decoder ring: it hunts down the "Hidden Keywords" (sneaky legal terms like indemnification, binding arbitration, and perpetual license) that companies use to trick users, and translates them into human-readable language.
What it does
LexiGuard is an AI Legal Agent that translates "Legalese" into "Human":
Finds the Hidden Keywords: It scans documents for specific trigger words—legal traps that are often buried in walls of text.
Contextual Analysis: It doesn't just spot the keyword; it checks how it's used. (e.g., Is "data sharing" for shipping a package? Or for selling to advertisers?)
Human-Readable Translation: It rewrites these complex clauses into 8th-grade English. Instead of saying "User indemnifies platform," it says "You have to pay their legal bills if they get sued."
Compliance Scoring: It cross-references these keywords against a DigitalOcean-hosted Knowledge Base of GDPR and CCPA laws to assign a simple 0-10 safety score.
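To make the first two steps concrete, here is a minimal sketch of keyword spotting plus a context check. LexiGuard itself does this with an AI agent; this version uses plain string matching purely for illustration. The keyword list, the cue lists, and the function names are all assumptions and are not taken from the project.

```javascript
// Hypothetical sample of "Hidden Keywords" (the real list has 50+ terms).
const HIDDEN_KEYWORDS = [
  "indemnif",
  "binding arbitration",
  "perpetual license",
  "data sharing",
  "third-party",
];

// Hypothetical context cues: is the clause about running the service,
// or about making money from the user's data?
const COMMERCIAL_CUES = ["advertis", "marketing", "sell", "sold"];
const FUNCTIONAL_CUES = ["shipping", "deliver", "payment", "fulfil"];

// Split text into rough sentences so each hit carries its own context.
function splitSentences(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter(Boolean);
}

// Label how a keyword is being used in one sentence.
function classifyContext(sentence) {
  const lower = sentence.toLowerCase();
  if (COMMERCIAL_CUES.some((cue) => lower.includes(cue))) return "commercial";
  if (FUNCTIONAL_CUES.some((cue) => lower.includes(cue))) return "functional";
  return "unclear";
}

// Return every keyword hit together with its sentence and context label.
function findHiddenKeywords(text) {
  const hits = [];
  for (const sentence of splitSentences(text)) {
    const lower = sentence.toLowerCase();
    for (const keyword of HIDDEN_KEYWORDS) {
      if (lower.includes(keyword)) {
        hits.push({ keyword, context: classifyContext(sentence), sentence });
      }
    }
  }
  return hits;
}
```

Under these assumptions, "data sharing" in a sentence about shipping is labeled functional, while the same phrase in a sentence about selling to advertisers is labeled commercial.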
How we built it
We utilized the DigitalOcean GenAI Platform to create a specialized RAG (Retrieval-Augmented Generation) pipeline:
The Keyword Detector: We engineered a System Prompt that prioritizes a list of 50+ "high-risk" legal keywords (e.g., waiver, third-party, affiliates).
The Knowledge Base: We uploaded a custom legal_definitions.md file to the DigitalOcean agent. This acts as a dictionary, teaching the AI exactly how to translate specific hidden keywords into plain English without hallucinating.
The Model: We used gpt-oss-120b, instructing it to act as a "Translator" rather than a lawyer so that the output stays simple and conversational.
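The three pieces above could be wired together roughly as follows. This is a sketch under stated assumptions, not our exact configuration: the prompt wording, the keyword sample, the AGENT_ENDPOINT and AGENT_ACCESS_KEY environment variables, and the OpenAI-style chat completions path are all placeholders. On DigitalOcean the system prompt may instead live in the agent's own instructions rather than in each request.

```javascript
// Hypothetical sample of the high-risk keyword list (the project uses 50+).
const HIGH_RISK_KEYWORDS = [
  "waiver",
  "third-party",
  "affiliates",
  "indemnification",
  "binding arbitration",
];

// Build a system prompt that prioritizes the keyword list and keeps the
// model in the "Translator" role. The wording is illustrative only.
function buildSystemPrompt(keywords) {
  return [
    "You are a translator, not a lawyer.",
    "Scan the document for these high-risk keywords first: " + keywords.join(", ") + ".",
    "Explain each matching clause in simple, conversational English.",
    "Use only definitions from the attached knowledge base. If a term is not covered, say so.",
  ].join("\n");
}

// Build an OpenAI-style chat request body for the agent.
function buildChatRequest(documentText, keywords = HIGH_RISK_KEYWORDS) {
  return {
    messages: [
      { role: "system", content: buildSystemPrompt(keywords) },
      { role: "user", content: documentText },
    ],
    stream: false,
  };
}

// Assumption: the agent accepts OpenAI-style chat completion requests at
// this path, authenticated with a bearer access key. Both env vars are
// placeholder names.
async function analyzeDocument(documentText) {
  const url = `${process.env.AGENT_ENDPOINT}/api/v1/chat/completions`;
  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.AGENT_ACCESS_KEY}`,
    },
    body: JSON.stringify(buildChatRequest(documentText)),
  });
  if (!response.ok) {
    throw new Error(`Agent request failed: ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

Keeping the request builder separate from the network call makes the prompt easy to inspect and test without contacting the agent.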
Challenges we ran into
False Positives: Initially, the AI flagged every mention of "data" as bad. We had to refine our "Hidden Keyword" logic to differentiate between functional data usage (good) and commercial data selling (bad).
Simplification vs. Accuracy: It was hard to make the AI sound "human" without losing legal accuracy. We solved this by implementing a "Two-Step" prompt: first, extract the legal fact; second, rewrite it for a 12-year-old.
Hallucinations: To stop the AI from inventing fake laws, we restricted its answers strictly to the provided Knowledge Base documents. These files can be found in the attached documents.
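The "Two-Step" prompt and the Knowledge Base restriction can be sketched like this. The prompt wording, the NOT_COVERED sentinel, and the callModel function (any async function that takes a system prompt and user text and returns the model's reply) are assumptions made for illustration, not our production prompts.

```javascript
// callModel is a hypothetical async function:
//   (systemPrompt, userText) => Promise<string>
async function translateClause(clause, callModel) {
  // Step 1: extract the legal fact, restricted to the knowledge base.
  const legalFact = await callModel(
    "State the legal obligation in this clause in one precise sentence. " +
      "Use only the provided knowledge base. " +
      "If it does not cover the clause, reply NOT_COVERED.",
    clause
  );

  // Refuse to translate anything the knowledge base cannot back up.
  if (legalFact.trim() === "NOT_COVERED") {
    return { legalFact: null, plainEnglish: null };
  }

  // Step 2: rewrite the extracted fact for a 12-year-old.
  const plainEnglish = await callModel(
    "Rewrite this sentence so a 12-year-old can understand it. " +
      "Do not add or remove obligations.",
    legalFact
  );

  return { legalFact, plainEnglish };
}
```

Because the second step only ever sees the extracted fact, the simplification cannot pull in details that were not in the first step's output.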
Accomplishments that we're proud of
The "Plain English" Engine: We successfully tuned the agent to take a 500-word liability clause and turn it into a single, understandable sentence: "If you break it, you buy it."
Keyword Extraction: The agent accurately identifies 95% of hidden predatory clauses in our test set of standard EULAs.
Speed: The entire analysis happens in under 5 seconds, making it faster than skimming the first paragraph yourself.
What we learned
Language is a Barrier: The biggest issue with modern tech isn't the technology; it's the language used in contracts. AI is the perfect tool to bridge that gap.
RAG is Essential: You cannot rely on a model's general knowledge for law. Injecting specific definitions for "Hidden Keywords" was crucial for consistent results.
What's next for LexiGuard
Browser Extension: A popup that automatically highlights "Hidden Keywords" in red as you scroll through a webpage.
Multi-Language Support: Translating English legalese into plain Spanish, French, and German to help international users.
"Fix It" Button: An AI agent that not only finds the bad keywords but automatically drafts an email to the company asking to opt-out of those specific terms.
Built With
- agent
- ai
- digitalocean
- javascript