Legal Aid

Inspiration

We’ve all experienced it: facing a wall of legal text in Terms & Conditions and instinctively clicking “I Agree” without reading. These documents are often long, dense, and filled with legal jargon, making it nearly impossible for everyday users to understand what they’re agreeing to.

In an age where online safety and data privacy are increasingly under threat, this poses a serious risk. We were inspired to build a solution that empowers users to make informed decisions by summarizing complex T&Cs into something simple, readable, and actionable.

What it does

Legal Aid is a two-step pipeline that transforms lengthy, jargon-heavy legal agreements into short, understandable summaries. Given any Terms & Conditions input, it:

Extracts the text under the Terms of Service and Terms and Conditions headings.
Feeds that text into a fine-tuned Llama 3.1-8b-instruct model with a prompt to extract negative implications.

The final output is a concise, readable overview of the key points, helping users know what they’re agreeing to before clicking "Accept."
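As an illustration of the second step, here is a minimal sketch of how extracted sentences could be turned into a prompt that asks for negative implications. The template wording, the `build_prompt` name, and the character cap are all assumptions made for this example; the project's actual prompt is not shown in this writeup.

```python
# Hypothetical prompt template: the wording is an assumption,
# not the prompt the project actually uses.
PROMPT_TEMPLATE = (
    "You are reviewing a Terms & Conditions excerpt for an everyday user.\n"
    "List the clauses that could negatively affect the user "
    "(data sharing, fees, liability waivers, and so on) in plain language.\n\n"
    "Excerpt:\n{excerpt}\n\n"
    "Negative implications:"
)


def build_prompt(sentences, max_chars=4000):
    """Join extracted sentences into one excerpt and fill the template."""
    # Drop empty or whitespace-only entries and normalize spacing.
    excerpt = " ".join(s.strip() for s in sentences if s.strip())
    # Crude character-based guard against oversized inputs (an assumption;
    # a real deployment would count model tokens instead).
    excerpt = excerpt[:max_chars]
    return PROMPT_TEMPLATE.format(excerpt=excerpt)
```

The resulting string would then be sent to the model; how that call is made (local weights or a hosted endpoint) is left out here because it depends on the deployment.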

How we built it

We used a two-stage NLP pipeline:

Extractive Summarization
Using Selenium and web scraping, we extract the relevant sentences from the document.
Abstractive Summarization
These extracted sentences are then passed to the Llama model from Hugging Face, which summarizes the T&Cs in a clear, user-friendly format.
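The first stage can be sketched roughly as follows, assuming the page text has already been scraped into a plain string (for example by Selenium). The heading heuristic, the regular expressions, and the `extract_terms_sentences` name are assumptions for illustration, not the project's actual extraction logic.

```python
import re

# Headings we look for; the pattern is an assumption based on the
# section names mentioned in this writeup.
HEADING_PATTERN = re.compile(
    r"terms\s+(of\s+services?|and\s+conditions)", re.IGNORECASE
)
# Split after sentence-ending punctuation followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def extract_terms_sentences(page_text):
    """Return the sentences that follow the first terms heading."""
    lines = page_text.splitlines()
    start = None
    for i, line in enumerate(lines):
        # Heuristic: a short line matching the pattern is treated as a heading.
        if len(line) < 60 and HEADING_PATTERN.search(line):
            start = i + 1
            break
    if start is None:
        return []  # no terms section found on this page
    body = " ".join(l.strip() for l in lines[start:] if l.strip())
    return [s.strip() for s in SENTENCE_END.split(body) if s.strip()]
```

A punctuation-based splitter like this will mis-split abbreviations such as "Inc." and numbered clauses, so a dedicated sentence tokenizer may be a better fit for real legal text.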

Tech Stack:

JavaScript
Manifest V3 Extension
Hugging Face Llama 3.1-8b-instruct model
HTML
CSS
Flask

Challenges we ran into

Token Limitations: Many T&Cs exceed the input token limit for transformer models. We had to preprocess, chunk, or truncate input text while preserving meaning.
Legal Language Complexity: Accurately simplifying legal text without losing critical meaning was a fine balance.
Evaluation: Since T&Cs are often vague or open to interpretation, evaluating the quality of summaries was difficult.
Latency: The summarization process can take time, especially with large inputs or API latency from hosted models.
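For the token-limit challenge, one common approach is to pack whole sentences into chunks that each stay under a budget, so no sentence is cut in half. The sketch below is illustrative only: it approximates tokens with whitespace-separated words, and the `chunk_sentences` name and the default budget of 512 are assumptions rather than values taken from the project.

```python
def chunk_sentences(sentences, max_tokens=512):
    """Greedily pack sentences into chunks under an approximate token budget."""
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        # Whitespace words are a rough proxy for model tokens (an assumption);
        # a real tokenizer usually produces more tokens than words.
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        # A single sentence longer than the budget still gets its own chunk.
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be summarized separately and the partial summaries merged, which trades extra model calls (and latency) for coverage of the full document.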

Accomplishments that we're proud of

Successfully combined extractive and abstractive summarization into a working pipeline.
Leveraged a real-world model to ensure practical, ethical output.
Created summaries that retained meaning, clarity, and privacy-relevant content from real T&Cs.
Helped make legal documents accessible to non-experts, promoting digital literacy and safety.

What we learned

Hands-on experience with natural language processing pipelines combining multiple summarization strategies.
The real-world limitations of deploying transformer models on large documents.
The importance of ethical AI when dealing with privacy, legal, and user-facing applications.
Strategies for handling legal text, which is often intentionally vague or verbose.

What's next for Legal Aid

Build a browser extension or web app for real-time summarization on websites.
Support multi-language T&Cs for broader accessibility.
Add voice summaries for audio accessibility.
Integrate with cybersecurity tools to flag risky or unusual clauses.
Experiment with long-context transformer models such as Longformer or GPT-4-turbo for longer document support.

Built With

css
flask
html
manifestv3extension
node.js
python

Updates

Ishan Shah started this project — May 28, 2025 11:27 AM EDT
