Project Story
Every business deals with contracts, yet most lack the legal expertise to interpret them accurately. We built ContractIQ to democratize legal contract understanding using OpenAI’s GPT-OSS, making advanced contract analysis accessible to everyone.
Contracts are everywhere (sales agreements, NDAs, vendor contracts), but reading them and extracting the key information is time-consuming and error-prone. Our goal was to automate this process while keeping accuracy and interpretability high.
## What it does
ContractIQ instantly analyzes contracts to:
- Extract 20+ critical clauses (termination, liability, warranties, confidentiality, etc.)
- Highlight exact text locations with span-level precision
- Answer complex legal questions about contract terms
- Reach a BERTScore F1 of 0.837 on legal benchmarks (a measure of semantic similarity to reference answers, not exact-match accuracy)
It’s designed for lawyers, startups, and businesses to quickly understand and evaluate contracts without needing a legal team for every review.
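As an illustration of what span-level highlighting can look like, here is a minimal sketch that maps an extracted clause snippet back to character offsets in the contract text. The names `ClauseSpan`, `locate`, and `highlight` are hypothetical and not taken from the ContractIQ code base; the sketch also assumes the model returns the clause text verbatim and that spans do not overlap.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ClauseSpan:
    label: str  # clause category, e.g. "termination"
    start: int  # inclusive character offset into the contract text
    end: int    # exclusive character offset


def locate(text: str, label: str, snippet: str) -> ClauseSpan | None:
    """Map an extracted snippet back to character offsets in the source."""
    start = text.find(snippet)
    if start == -1:
        return None  # snippet not found verbatim (e.g. it was paraphrased)
    return ClauseSpan(label, start, start + len(snippet))


def highlight(text: str, spans: list[ClauseSpan]) -> str:
    """Wrap each span in [[label: ...]] markers.

    Spans are processed right to left so that earlier offsets stay valid
    while markers are inserted. Assumes the spans do not overlap.
    """
    out = text
    for s in sorted(spans, key=lambda s: s.start, reverse=True):
        out = out[:s.start] + f"[[{s.label}: {out[s.start:s.end]}]]" + out[s.end:]
    return out


contract = "Either party may terminate this Agreement on 30 days' written notice."
span = locate(contract, "termination", "may terminate this Agreement")
marked = highlight(contract, [span] if span else [])
print(marked)
```

Storing offsets rather than copied text lets a UI highlight the exact location in the original document, and a `None` result flags extractions that cannot be grounded in the source.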
## How we built it
### Model Architecture
- Base: GPT-OSS (gpt-2-medium variant)
- Fine-tuning: LoRA rank 64, 3000 steps on NVIDIA A100-80GB
- Dataset: 10,000+ samples from CUAD (Contract Understanding Atticus Dataset) and Legal-LAMA
- Framework: Unsloth for faster training, vLLM for inference
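To give a feel for why LoRA keeps fine-tuning cheap, the sketch below counts the trainable parameters a rank-64 adapter adds to a single weight matrix. LoRA freezes a weight of shape `d_out x d_in` and learns two low-rank factors of shapes `d_out x r` and `r x d_in`. The layer width of 1024 is an illustrative assumption; the write-up does not state the model's layer dimensions or which layers were adapted.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix."""
    return d_in * d_out


# Hypothetical square projection of width 1024 (an assumption, see above).
width, rank = 1024, 64
adapter = lora_trainable_params(width, width, rank)
dense = full_params(width, width)

print(adapter, dense, adapter / dense)  # 131072 1048576 0.125
```

The fraction shrinks as layers get wider, since the adapter grows linearly with the width while the dense matrix grows quadratically.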
### Training Pipeline
```python
# Key hyperparameters
config = {
    "learning_rate": 2e-4,
    "lora_rank": 64,
    "batch_size": 4,
    "gradient_accumulation": 4,
    "warmup_steps": 100,
    "max_steps": 3000,
}
```
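A few quantities follow directly from these hyperparameters. The sketch below works them out under two assumptions that the write-up does not confirm: training ran on a single device, and the dataset holds exactly 10,000 samples (the source says "10,000+", so the epoch count is an upper estimate). The `warmup_lr` helper is hypothetical and only models the warmup phase.

```python
config = {
    "learning_rate": 2e-4,
    "batch_size": 4,
    "gradient_accumulation": 4,
    "warmup_steps": 100,
    "max_steps": 3000,
}

# Sequences contributing to each optimizer update (single-device assumption).
effective_batch = config["batch_size"] * config["gradient_accumulation"]

# Total sequences processed over the whole run.
sequences_seen = effective_batch * config["max_steps"]

# Rough number of passes over the data, assuming exactly 10,000 samples.
approx_epochs = sequences_seen / 10_000


def warmup_lr(step: int) -> float:
    """Linear warmup to the peak learning rate.

    The schedule after warmup is not stated in the write-up, so this
    sketch simply holds the peak rate from step 100 onward.
    """
    return config["learning_rate"] * min(1.0, step / config["warmup_steps"])


print(effective_batch, sequences_seen, approx_epochs)  # 16 48000 4.8
```

Under these assumptions the run makes a little under five passes over the data, with the learning rate reaching 2e-4 after the first 100 updates.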
## Evaluation & Results
ContractIQ underwent comprehensive testing across:
- Clause extraction
- Legal knowledge assessment
- Complex reasoning
- Edge case handling
- CUAD benchmark performance
**Key metrics:**
- **Overall Score:** 29.3% (Developing → strong potential)
- **BERTScore F1:** 0.837
- **Edge Case Handling:** 100%
- **Inference Speed:** 25.8 tokens/sec
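For readers unfamiliar with BERTScore, the sketch below shows the core of the metric: each candidate token is greedily matched to its most similar reference token (precision), each reference token to its most similar candidate token (recall), and the two are combined into F1. The 2-D vectors are toy stand-ins for contextual embeddings, and the function name `bertscore_f1` is hypothetical. The real metric uses embeddings from a pretrained model and optionally adds IDF weighting and baseline rescaling, which are omitted here; this is not the evaluation code used for the numbers above.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bertscore_f1(cand: list[list[float]], ref: list[list[float]]) -> tuple[float, float, float]:
    """Return (precision, recall, F1) using greedy max-similarity matching."""
    precision = sum(max(cosine(c, r) for r in ref) for c in cand) / len(cand)
    recall = sum(max(cosine(r, c) for c in cand) for r in ref) / len(ref)
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy token embeddings (assumed values, for illustration only).
candidate = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0], [1.0, 1.0]]

p, r, f1 = bertscore_f1(candidate, reference)
print(round(p, 3), round(r, 3), round(f1, 3))
```

Because the score rewards semantically close tokens rather than exact matches, a high F1 indicates answers that resemble the references in meaning, which is why it can sit well above a stricter overall score.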
---
## Challenges
- **Domain-specific fine-tuning:** Legal language is nuanced, and datasets are small relative to general NLP corpora.
- **Efficiency:** Ensuring fast inference for real-world contracts without sacrificing accuracy.
- **Evaluation:** Crafting benchmarks that reflect practical business and legal scenarios.