Project Story

Every business deals with contracts, yet most lack the legal expertise to interpret them accurately. We built ContractIQ to democratize legal contract understanding using OpenAI’s GPT-OSS, making advanced contract analysis accessible to everyone.
Contracts are everywhere (sales agreements, NDAs, vendor contracts), but reading them and extracting the information that matters is time-consuming and error-prone. Our goal was to automate this process while maintaining high accuracy and interpretability.

What it does

ContractIQ instantly analyzes contracts to:

  • Extract 20+ critical clauses (termination, liability, warranties, confidentiality, etc.)
  • Highlight exact text locations with span-level precision
  • Answer complex legal questions about contract terms
  • Reach a BERTScore F1 of 0.837 (semantic similarity between generated and reference answers) on legal benchmarks

It’s designed for lawyers, startups, and businesses to quickly understand and evaluate contracts without needing a legal team for every review.
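To illustrate what span-level highlighting involves, here is a minimal sketch that maps an extracted clause back to character offsets in the contract text. It is not ContractIQ's actual implementation: the function name `find_clause_spans` and the sample contract are hypothetical, and the matching strategy (whitespace-insensitive, case-insensitive regex search) is an assumption chosen for simplicity.

```python
import re


def find_clause_spans(contract_text, clause_text):
    """Return (start, end) character offsets of each occurrence of clause_text.

    Matching ignores case and treats any run of whitespace (including line
    breaks) as equivalent, since extracted text rarely preserves layout.
    """
    tokens = clause_text.split()
    if not tokens:
        return []
    # Escape each word so punctuation in the clause is matched literally.
    pattern = r"\s+".join(re.escape(token) for token in tokens)
    return [m.span() for m in re.finditer(pattern, contract_text, flags=re.IGNORECASE)]


# Hypothetical contract snippet with a line break inside the clause.
contract = "1. Term.\nEither party may terminate this\nAgreement with 30 days notice."
spans = find_clause_spans(contract, "terminate this agreement")

for start, end in spans:
    print(start, end, repr(contract[start:end]))
```

Returning offsets rather than the matched string lets a UI highlight the exact location in the original document, even when the model's output differs from the source in spacing or capitalization.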

How we built it

Model Architecture

  • Base: GPT-OSS (gpt-2-medium variant)
  • Fine-tuning: LoRA rank 64, 3000 steps on NVIDIA A100-80GB
  • Dataset: 10,000+ samples from CUAD (Contract Understanding Atticus Dataset) and Legal-LAMA
  • Framework: Unsloth for faster training, vLLM for inference
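To give a sense of why LoRA keeps fine-tuning cheap, the sketch below counts the trainable parameters a rank-64 adapter adds to a single weight matrix. The hidden size of 1024 is a hypothetical value for illustration only, not a figure from this project, and `lora_added_params` is a helper defined here rather than part of any library.

```python
def lora_added_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix.

    LoRA freezes the base weight W and learns a low-rank update B @ A,
    where B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_in + d_out)


hidden = 1024  # hypothetical layer width, chosen only for illustration
rank = 64      # LoRA rank used in this project

full_params = hidden * hidden
added_params = lora_added_params(hidden, hidden, rank)
fraction = added_params / full_params

print(f"full matrix: {full_params} params")
print(f"LoRA adapter: {added_params} params ({fraction:.1%} of the full matrix)")
```

The fraction shrinks as the layer gets wider, because the adapter grows linearly with the hidden size while the full matrix grows quadratically.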

Training Pipeline

```python
# Key hyperparameters
config = {
    "learning_rate": 2e-4,
    "lora_rank": 64,
    "batch_size": 4,
    "gradient_accumulation": 4,
    "warmup_steps": 100,
    "max_steps": 3000
}
```
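As a quick check on what these hyperparameters imply, the sketch below derives the effective batch size and the total number of training sequences processed. It assumes a single GPU with no data parallelism, which the writeup does not state explicitly; the derived variable names are introduced here for illustration.

```python
# Hyperparameters copied from the configuration above.
config = {
    "learning_rate": 2e-4,
    "lora_rank": 64,
    "batch_size": 4,
    "gradient_accumulation": 4,
    "warmup_steps": 100,
    "max_steps": 3000,
}

# Gradients from several micro-batches are accumulated before each optimizer
# step, so one step effectively sees batch_size * gradient_accumulation samples
# (assuming one GPU).
effective_batch = config["batch_size"] * config["gradient_accumulation"]

# Total sequences processed over the whole run.
sequences_seen = effective_batch * config["max_steps"]

# Share of the run spent warming up the learning rate.
warmup_fraction = config["warmup_steps"] / config["max_steps"]

print(f"effective batch size: {effective_batch}")
print(f"sequences processed:  {sequences_seen}")
print(f"warmup fraction:      {warmup_fraction:.1%}")
```

With a dataset of 10,000+ samples, this works out to several passes over the data, though the exact number depends on the final dataset size.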

## Evaluation & Results

ContractIQ underwent comprehensive testing across:  

- Clause extraction  
- Legal knowledge assessment  
- Complex reasoning  
- Edge case handling  
- CUAD benchmark performance  

**Key metrics:**  

- **Overall Score:** 29.3% (Developing → strong potential)  
- **BERTScore F1:** 0.837  
- **Edge Case Handling:** 100%  
- **Inference Speed:** 25.8 tokens/sec  
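
For readers unfamiliar with the metric, the following is a simplified sketch of how a BERTScore-style F1 is computed from token embeddings. It uses hand-written 2-D toy vectors instead of real BERT embeddings and omits the optional IDF weighting and baseline rescaling, so it illustrates the formula only and is not the evaluation code used for the numbers above.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def bertscore_f1(candidate_vecs, reference_vecs):
    """Greedy-matching F1 over token embeddings (no IDF, no rescaling)."""
    # Precision: each candidate token is matched to its most similar reference token.
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    # Recall: each reference token is matched to its most similar candidate token.
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy embeddings: one candidate token matches exactly, the other only partially.
reference = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[1.0, 0.0], [1.0, 1.0]]

score = bertscore_f1(candidate, reference)
print(f"toy BERTScore F1: {score:.3f}")
```

Because the score rewards embedding similarity rather than exact string overlap, a high F1 indicates that answers are semantically close to the references, not that every extracted clause is legally correct.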

---

## Challenges

- **Domain-specific fine-tuning:** Legal language is nuanced, and datasets are small relative to general NLP corpora.  
- **Efficiency:** Ensuring fast inference for real-world contracts without sacrificing accuracy.  
- **Evaluation:** Crafting benchmarks that reflect practical business and legal scenarios.

Built With

  • bertscore
  • cuad-dataset
  • gradio
  • hugging-face
  • legal-lama
  • lora
  • nvidia-a100
  • peft
  • python
  • pytorch
  • transformers
  • unsloth
  • vllm