Inspiration

Imagine fleeing your home country, only to be handed a dense, 50-page legal decision dictating your future. Case grew out of that simple problem: refugee claimants and their supporters often face legal jargon that is impossible to decode quickly. We wanted to make Canadian refugee law accessible by turning complex RPD/RAD case law into clear, structured guidance that people can actually use.

What it does

Case is a RAG-powered legal assistant for Canadian refugee law. A user describes their situation in plain language, and Case returns:

  • A plain-language explanation of the legal issue
  • Relevant case law pulled directly from the dataset
  • An evidence checklist to help prepare their claim
  • Source citations for every claim to ensure transparency
  • A clear disclaimer stating "This is not legal advice"
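The response shape above can be sketched as a small schema; the field and class names here are illustrative, not the exact production format:

```python
from dataclasses import dataclass, field

@dataclass
class Citation:
    case_name: str  # decision identifier the claim is grounded in (illustrative)
    excerpt: str    # passage supporting the claim

@dataclass
class CaseResponse:
    explanation: str                  # plain-language summary of the legal issue
    relevant_cases: list[Citation] = field(default_factory=list)
    evidence_checklist: list[str] = field(default_factory=list)
    disclaimer: str = "This is not legal advice."
```

Keeping the disclaimer as a fixed default means no response can ship without it.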

How we built it

We built Case as a full-stack system:

  • Frontend: Next.js + TypeScript + Tailwind for a clean, intuitive user experience.
  • Backend: a single FastAPI endpoint (/ask) that orchestrates retrieval and generation.
  • Retrieval pipeline: LangChain + Snowflake Cortex Search for highly accurate semantic document retrieval.
  • Data: the A2AJ Canadian case law dataset, filtered to RPD/RAD decisions only, processing [Insert Number] records.
  • Preparation flow: Cleaned the data, chunked it into overlapping text segments, loaded it into Snowflake, and indexed it in a Cortex Search service.
  • Generation: Gemini is prompted to produce a structured JSON response grounded only in the retrieved documents.
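The chunking step in the preparation flow can be sketched as overlapping character windows; the chunk size and overlap values below are illustrative, not the ones we tuned to:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split a decision into overlapping windows so that context
    spanning a chunk boundary is retrievable from either side."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

In practice we used LangChain's splitters rather than hand-rolling this, but the overlap idea is the same: each chunk repeats the tail of the previous one so a passage cut by a boundary still matches a query.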

Challenges we ran into

  • Grounding vs. fluency: Keeping responses helpful while strictly preventing uncited or hallucinated claims, which we managed through strict system prompting.
  • Data quality: Cleaning, filtering, and shaping raw legal decisions into retrieval-friendly chunks required iterative testing to get the segment overlap just right.
  • Schema consistency: Enforcing one reliable response format across varied user questions, a hurdle we cleared by strictly enforcing JSON schema generation in Gemini.
  • End-to-end integration: Connecting frontend, backend, retrieval, and generation reliably under hackathon time constraints meant we had to test modularly before our final merge.
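The schema-consistency guard can be sketched as a validation layer between the model and the frontend, assuming the model replies with a JSON string; the required keys here are illustrative:

```python
import json

# Fields every response must carry (illustrative names)
REQUIRED_KEYS = {"explanation", "relevant_cases", "evidence_checklist", "disclaimer"}

def parse_model_output(raw: str) -> dict:
    """Parse the model's JSON reply and reject any response missing
    required fields, so the frontend always receives one shape."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"response missing fields: {sorted(missing)}")
    return data
```

Rejected responses can be retried with the schema restated in the prompt, which is how we kept the format stable across varied questions.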

Accomplishments that we're proud of

We didn't just build a generic chatbot; we built a trustworthy legal information tool with strict safety constraints for a high-impact domain. Processing a raw legal corpus into production-style retrieval data was a massive hurdle, and successfully setting up Snowflake Cortex Search to validate semantic retrieval shows this architecture can work in the real world. Designing a citation-first response format establishes a strong base for genuine utility.

What we learned

  • Domain-specific RAG quality depends heavily on data prep and chunking strategy.
  • Strong prompt constraints and output schemas are critical for trust in legal contexts.
  • Retrieval quality and metadata design directly influence user confidence.
  • “Not legal advice” isn’t enough on its own; every response needs transparent citations and clear boundaries.

What’s next for Case

  • Finish and harden the /ask flow end-to-end (frontend ↔ backend ↔ retrieval ↔ generation).
  • Add stronger citation validation and fallback handling for low-retrieval-confidence queries.
  • Improve UX for follow-up questions and document/evidence organization.
  • Expand beyond RPD/RAD over time while preserving strict grounding guarantees.
  • Add evaluation/benchmarking (accuracy, citation coverage, hallucination rate, and usefulness with real user scenarios).
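The planned citation validation could look something like the sketch below: cross-check every citation in the answer against the IDs of the documents actually retrieved, and refuse or regenerate when anything is unaccounted for. The function name and ID scheme are hypothetical.

```python
def find_unsupported_citations(answer_citations: list[str],
                               retrieved_ids: set[str]) -> list[str]:
    """Return citations that do NOT appear among the retrieved documents.
    A non-empty result means the answer may be hallucinated and should
    be regenerated or replaced with a low-confidence fallback message."""
    return [c for c in answer_citations if c not in retrieved_ids]
```

This also gives a natural evaluation metric: citation coverage is just the fraction of answers for which this list comes back empty.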
