Inspiration
We wanted to move beyond simply counting AI litigation cases and instead understand two deeper questions: where is litigation accelerating, and which legal themes are emerging over time?
The Duke Law AI Litigation Database (DAIL) provides rich case-level data, but geographic patterns and vocabulary shifts are difficult to identify without manually reviewing hundreds of entries. Inspired by the hackathon’s call for an interactive AI litigation map, we built a tool that surfaces these trends clearly and responsibly, helping researchers, policymakers, and the public explore evolving AI-related legal risk without needing to parse raw spreadsheets.
What It Does
AI Litigation Atlas is an interactive US map and analytics dashboard built on real DAIL cases (2018 onward).
It allows users to:
- View a choropleth map colored by:
  - Total cases
  - Acceleration (whether litigation in a state is speeding up or slowing down)
- Filter by:
  - Year range
  - Industry (e.g., Healthcare, Finance, Employment)
  - Claim type (e.g., Bias/Discrimination, Privacy, Copyright/IP)
- Drill into any state to see:
  - Yearly counts
  - Top industries and claim types
  - Sample cases with links to DAIL
- Explore emerging themes using TF-IDF language drift analysis.
- Export the filtered dataset as JSON.
This tool is for exploratory research only and does not provide legal advice.
How We Built It
Data
- Used the official DAIL Case Table Excel export.
- Filtered to US cases from 2018 onward.
- Extracted filing year.
- Normalized jurisdictions into US state codes.
- Derived industry and claim type using transparent keyword-based mapping.
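The transparent keyword-based mapping step could look roughly like the sketch below; the keyword lists and function name are illustrative placeholders, not the project's actual taxonomy:

```python
# Hypothetical sketch of keyword-based industry mapping.
# The keyword lists here are examples, not the real mapping tables.
INDUSTRY_KEYWORDS = {
    "Healthcare": ["hospital", "patient", "medical", "diagnosis"],
    "Finance": ["bank", "credit", "loan", "insurance"],
    "Employment": ["hiring", "employee", "resume", "workplace"],
}

def map_industry(summary: str) -> str:
    """Return the first industry whose keywords appear in a case summary."""
    text = summary.lower()
    for industry, keywords in INDUSTRY_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return industry
    return "Other"
```

Because the rules are plain substring checks, any mapping decision can be traced back to a specific keyword, which is the transparency property described above.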
Backend
- Built with FastAPI (Python).
- Exposes REST endpoints for:
  - US-wide summaries
  - State-level totals
  - Acceleration metric
  - Language drift analysis
- Acceleration = growth in the recent half of the time window minus growth in the older half.
- Drift uses scikit-learn's TfidfVectorizer with unigrams and bigrams.
Frontend
- React + TypeScript + Vite + TailwindCSS
- Map: react-leaflet
- Right-side panel tabs:
  - Overview
  - Filters
  - Emerging Themes
  - Methodology
  - Export
Challenges We Ran Into
- Data normalization: Mixed formats in DAIL Excel required jurisdiction cleanup and category mapping.
- Defining acceleration: Needed an interpretable metric that handled sparse counts and avoided misleading spikes.
- Emerging themes clarity: Required filtering rare terms and showing evidence excerpts.
- Repository setup: Ensured no raw CSV or sensitive data was committed.
Accomplishments We're Proud Of
- Built entirely on 237+ real US DAIL cases, not synthetic data.
- Combined geographic acceleration and semantic drift in one system.
- Delivered an end-to-end solution from raw export to interactive dashboard.
- Included transparent methodology to prevent black-box interpretation.
What We Learned
- Interpretable statistical methods build more trust than complex predictive models in legal contexts.
- Data cleaning and schema mapping require more effort than analytics.
- Vocabulary shifts reflect legal focus changes, not legal outcomes.
- Clear disclaimers are essential when presenting legal analytics.
What's Next for AI Litigation Atlas
- Ingest full DAIL dataset across all jurisdictions.
- Replace keyword-based mapping with supervised or LLM-assisted classification.
- Add time sliders and richer visual trend lines.
- Enable PDF/Excel export for reporting.
- Deploy publicly and consider audit logging for institutional use.
Built With
- fastapi
- leaflet.js
- pandas
- react
- scikit-learn
- tailwindcss
- typescript
- vite