Inspiration
AI is reshaping courtrooms, yet no structured, auditable database of AI litigation exists. GWU's DAIL project had only raw Excel exports, with no way to search, version, or trust the data. We asked: what would a research-grade database look like if provenance were a first-class citizen rather than an afterthought?

What it does
DAIL Forge is a full-stack litigation intelligence platform. Researchers can search the US AI lawsuits in the dataset by court, status, algorithm type, legal issue, or plain-English full-text query. Every edit to the dataset requires a source citation and an editor ID, both logged permanently. Each pipeline run versions the entire dataset as a diff, so you can show exactly what changed between any two points in time.
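To make the versioning idea concrete, here is a minimal sketch of a field-level diff between two dataset snapshots. It assumes each snapshot is an in-memory dictionary keyed by a hypothetical `case_id`; the function name `diff_snapshots` and the output shape are illustrative and are not taken from the DAIL Forge codebase.

```python
from typing import Any

Row = dict[str, Any]
Snapshot = dict[str, Row]  # assumption: rows keyed by a hypothetical case_id


def diff_snapshots(old: Snapshot, new: Snapshot) -> list[dict[str, Any]]:
    """Return added, removed, and modified rows, with per-field old/new values."""
    changes: list[dict[str, Any]] = []
    # Sort the keys so the same two snapshots always yield the same diff.
    for case_id in sorted(old.keys() | new.keys()):
        before, after = old.get(case_id), new.get(case_id)
        if before is None:
            changes.append({"case_id": case_id, "op": "added", "fields": after})
        elif after is None:
            changes.append({"case_id": case_id, "op": "removed", "fields": before})
        else:
            # Keep only the fields whose values differ between the snapshots.
            changed = {
                name: {"old": before.get(name), "new": after.get(name)}
                for name in sorted(before.keys() | after.keys())
                if before.get(name) != after.get(name)
            }
            if changed:
                changes.append({"case_id": case_id, "op": "modified", "fields": changed})
    return changes
```

Because the output is a plain list of dictionaries, it can be serialized to JSON for export, which is one plausible way to keep diffs reproducible.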

How we built it
We built DAIL Forge on PostgreSQL 15 with GIN full-text search indexes, an async FastAPI backend with three API layers (public research, restricted curation, pipeline control), and a zero-dependency single-page frontend. Data is ingested from Excel through a SHA-256 delta loader that writes only the rows that actually changed. Docker Compose ties it all together.
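The delta-loading step can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's loader: rows are plain dictionaries, each row is hashed over a canonical JSON encoding, and the helper names `row_hash` and `select_changed_rows` are hypothetical.

```python
import hashlib
import json
from typing import Any


def row_hash(row: dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(row, sort_keys=True, default=str, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def select_changed_rows(
    incoming: dict[str, dict[str, Any]],
    stored_hashes: dict[str, str],
) -> dict[str, dict[str, Any]]:
    """Return only the rows that are new or whose content hash differs."""
    changed: dict[str, dict[str, Any]] = {}
    for key, row in incoming.items():
        # A missing stored hash means the row is new; a mismatch means it changed.
        if stored_hashes.get(key) != row_hash(row):
            changed[key] = row
    return changed
```

In a setup like this, only the rows returned by `select_changed_rows` would be written, and their new hashes stored alongside them for the next run.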

Challenges we ran into
In the DAIL Excel export, Case_Table and Docket_Table held column-definition metadata rather than actual case records. We had to reverse-engineer stub cases from the foreign-key references inside document rows, which meant building a schema-aware pipeline that handles both structural metadata and real data in the same load.
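One way to picture the stub-case reconstruction is the sketch below. It assumes document rows carry a hypothetical `case_id` foreign key and that a stub needs only the identifier, a flag, and a document count; the real column names and stub fields in the DAIL export may differ.

```python
from typing import Any


def build_stub_cases(
    document_rows: list[dict[str, Any]],
    known_case_ids: set[int],
) -> list[dict[str, Any]]:
    """Create one placeholder case per foreign key that has no real case record."""
    stubs: dict[int, dict[str, Any]] = {}
    for doc in document_rows:
        case_id = doc.get("case_id")  # hypothetical foreign-key column name
        if case_id is None or case_id in known_case_ids:
            continue  # no reference, or the case already exists as real data
        stub = stubs.setdefault(
            case_id, {"case_id": case_id, "is_stub": True, "document_count": 0}
        )
        stub["document_count"] += 1
    # Sort by key so repeated loads produce the same stub ordering.
    return [stubs[key] for key in sorted(stubs)]
```

Marking the rows with `is_stub` would let a later import of real docket data find and replace them.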

Accomplishments that we're proud of
We are proudest of the provenance enforcement layer: the server rejects any edit that arrives without a source citation or written justification. The change log is append-only at the application level. Every snapshot diff is field-level, reproducible, and exportable. The whole audit trail survives any future data import.
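The enforcement rule can be illustrated with a small standard-library sketch. It stands in for validation that would live in the API layer; the class and field names (`ChangeLog`, `ChangeRecord`, `ProvenanceError`) are hypothetical, and the in-memory list is only a placeholder for a database table.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


class ProvenanceError(ValueError):
    """Raised when an edit arrives without the required provenance."""


@dataclass(frozen=True)  # frozen: a record cannot be mutated once logged
class ChangeRecord:
    case_id: str
    field: str
    new_value: Any
    editor_id: str
    source_citation: Optional[str]
    justification: Optional[str]
    recorded_at: datetime


class ChangeLog:
    """Append-only at the application level: no update or delete methods exist."""

    def __init__(self) -> None:
        self._records: list[ChangeRecord] = []

    def record_edit(
        self,
        case_id: str,
        field: str,
        new_value: Any,
        editor_id: str,
        source_citation: Optional[str] = None,
        justification: Optional[str] = None,
    ) -> ChangeRecord:
        if not editor_id.strip():
            raise ProvenanceError("editor_id is required")
        # Reject the edit unless a citation or a written justification is present.
        if not (source_citation or "").strip() and not (justification or "").strip():
            raise ProvenanceError("a source citation or written justification is required")
        record = ChangeRecord(
            case_id, field, new_value, editor_id,
            source_citation, justification, datetime.now(timezone.utc),
        )
        self._records.append(record)
        return record

    @property
    def records(self) -> tuple[ChangeRecord, ...]:
        return tuple(self._records)  # read-only view for callers
```

Putting the check in the code path that writes the log, rather than in the frontend, means no client can bypass it.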

What we learned
Research-grade data isn't just about accuracy; it's about traceability. A fact without a source is a liability. Building the citation requirement into the API contract, rather than leaving it as a UI suggestion, made the difference between a polished tool and a trustworthy one.

What's next for DAIL Forge: Research grade AI Litigation Backend
Ingest real PACER docket data to replace stub cases. Add an AI-assisted curation copilot that suggests edits with auto-generated citations. Expand to track EU AI Act enforcement actions alongside US litigation. Open the public research API to GWU affiliates and partner institutions.
