Project Story

What Inspired Us

Millions of people rely on public systems for stability, whether for educational funding, housing support, or emergency disaster services. However, these systems are chronically fragmented, heavily bureaucratic, and hard to navigate, especially for people under stress. We observed that students and families across regions such as Malawi, Zambia, and the United States frequently miss out on life-changing support simply because the eligibility criteria are buried in dense, confusing jargon.

Our mission was to build the AI-Powered Benefits Navigator: a platform that translates complex rules into plain language, acting as a personalized digital caseworker to turn systemic confusion into clarity and actionable next steps.

How We Built It

The platform is engineered as a full-stack Next.js application, prioritizing speed, accessibility, and precision. We deliberately avoided a lazy "open-ended chat" interface. Instead, we built a dynamic Conversational Triage Engine. Users state their problem in plain English, and the AI generates a rapid-fire, context-aware multiple-choice UI to gracefully collect the missing pieces of their profile.

Our backend is built on a resilient, two-stage Retrieval-Augmented Generation (RAG) architecture using OpenAI models via Azure/GitHub inference endpoints:

  1. Stage-One Retrieval & Query Expansion: We use Pinecone as our vector database. When a user submits a prompt, we embed the text and apply metadata filtering. Because users might type "USA" while the database holds "United States", we built a custom Query Expansion and Soft Fallback engine so that differently worded inputs still match the stored tags. The search scores the proximity between the user's vectorized profile $U$ and each program constraint vector $P$ using cosine similarity:

$$S_C(U, P) = \frac{\sum_{i=1}^{n} U_i P_i}{\sqrt{\sum_{i=1}^{n} U_i^2} \sqrt{\sum_{i=1}^{n} P_i^2}}$$

  2. The Dynamic Triage Engine: Using Few-Shot Prompting, we instruct the LLM to cross-reference the retrieved rules against the user's initial statement. It automatically drops questions the user already answered (e.g., if they say "I am a student," it skips enrollment questions) and generates highly empathetic, descriptive multiple-choice questions to fill the gaps.
  3. Stage-Two Context Hydration: Once the assessment concludes, the backend takes the parent_source_id of the winning matches and reaches back into the vector database to extract only the specific application workflows, required documents, and institutional cautions tied to those exact programs. (Note: To accelerate our local testing and model evaluation for this logic, we leveraged the ROCm ecosystem and AMD GPU infrastructure.)
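
The stage-one retrieval described above can be sketched roughly as follows. This is a minimal in-memory illustration, not our production code and not the Pinecone client API: the `ProgramChunk` type, the alias tables, and the `retrieve` function are hypothetical names chosen for the example. It shows alias-based query expansion, the cosine similarity score $S_C(U, P)$, and a soft fallback that relaxes the category filter when the strict filter returns nothing.

```typescript
// Hypothetical types and data: none of these names come from the project itself.
type ProgramChunk = {
  id: string;
  country: string;     // canonical country name stored as metadata
  category: string;    // canonical category tag, e.g. "education_support"
  embedding: number[]; // program constraint vector P
};

// Assumed alias tables mapping user phrasing to the tags stored in the index.
const COUNTRY_ALIASES: Record<string, string> = {
  usa: "United States",
  us: "United States",
  "united states": "United States",
};
const CATEGORY_ALIASES: Record<string, string> = {
  "education loans": "education_support",
  education_support: "education_support",
};

// Query expansion: normalize free text to a canonical tag, or undefined if unknown.
function canonicalize(value: string, aliases: Record<string, string>): string | undefined {
  return aliases[value.trim().toLowerCase()];
}

// Cosine similarity S_C(U, P) = (U . P) / (|U| |P|); returns 0 for a zero vector.
function cosineSimilarity(u: number[], p: number[]): number {
  if (u.length !== p.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normU = 0;
  let normP = 0;
  for (let i = 0; i < u.length; i++) {
    dot += u[i] * p[i];
    normU += u[i] * u[i];
    normP += p[i] * p[i];
  }
  const denom = Math.sqrt(normU) * Math.sqrt(normP);
  return denom === 0 ? 0 : dot / denom;
}

// Strict filter first; if it returns nothing, drop the category filter (soft fallback).
function retrieve(
  userVector: number[],
  country: string,
  category: string,
  index: ProgramChunk[],
  topK = 3,
): { matches: ProgramChunk[]; usedFallback: boolean } {
  const wantCountry = canonicalize(country, COUNTRY_ALIASES);
  const wantCategory = canonicalize(category, CATEGORY_ALIASES);

  // Sort a copy by descending similarity to the user vector and keep the top K.
  const rank = (chunks: ProgramChunk[]): ProgramChunk[] =>
    [...chunks]
      .sort(
        (a, b) =>
          cosineSimilarity(userVector, b.embedding) -
          cosineSimilarity(userVector, a.embedding),
      )
      .slice(0, topK);

  const strict = index.filter(
    (c) => c.country === wantCountry && c.category === wantCategory,
  );
  if (strict.length > 0) return { matches: rank(strict), usedFallback: false };

  const relaxed = index.filter((c) => c.country === wantCountry);
  return { matches: rank(relaxed), usedFallback: true };
}
```

Returning a `usedFallback` flag is a design choice made for this sketch: it would let the UI tell the user that the results come from a broadened search rather than an exact category match.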

Challenges We Faced

  • The 8,000 Token Limit Wall: Our database returns massive amounts of rich data (application steps, institutional cautions, legal rules). Passing all of this into gpt-4o instantly triggered 413 Payload Too Large errors. We solved this by implementing strict top-K slicing and aggressively instructing the LLM to "ruthlessly summarize" the data into concise, 10-word, mobile-friendly bullet points, ensuring we stayed safely within context limits while preserving UI elegance.
  • Brittle Database Filters: Initially, if a user requested "education loans" but the database chunk was tagged "education_support", the system threw a "No Programs Found" error. We engineered an alias-mapping and fallback routing system so the AI dynamically adjusts its search constraints if the strict filters return empty.
  • Responsible AI & Over-Reliance: Our primary ethical challenge was preventing the AI from definitively telling a vulnerable user "you qualify," which could lead to disastrous real-world consequences if a nuance was misinterpreted. We mitigated this by hardcoding strict guardrails. The system forces conditional framing ("You may qualify") and specifically injects mandatory Human Verification Points into the UI, ensuring the user always knows exactly which human authority to contact for the final decision.
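
Two of the mitigations above, top-K slicing under a token budget and forced conditional framing, can be illustrated with a short sketch. Everything here is an assumption made for the example rather than our actual implementation: the `RetrievedChunk` and `Assessment` shapes, the four-characters-per-token estimate, and the specific phrases the regular expressions rewrite are all hypothetical.

```typescript
// Hypothetical shapes; the project's real types are not shown in this write-up.
type RetrievedChunk = { text: string; score: number };
type Assessment = { program: string; summary: string; verifyWith: string };

// Rough heuristic (an assumption): about 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the highest-scoring chunks, at most topK, without exceeding the token budget.
function sliceToBudget(
  chunks: RetrievedChunk[],
  topK: number,
  maxTokens: number,
): RetrievedChunk[] {
  const kept: RetrievedChunk[] = [];
  let used = 0;
  const ranked = [...chunks].sort((a, b) => b.score - a.score);
  for (const chunk of ranked) {
    if (kept.length >= topK) break;
    const cost = estimateTokens(chunk.text);
    if (used + cost > maxTokens) continue; // skip chunks that would overflow the budget
    kept.push(chunk);
    used += cost;
  }
  return kept;
}

// Rewrite definitive eligibility claims into conditional framing,
// preserving the capitalization of the leading "You"/"you".
function enforceConditionalFraming(text: string): string {
  return text
    .replace(/\b(y)ou (?:definitely |certainly )?qualify\b/gi, "$1ou may qualify")
    .replace(/\b(y)ou are eligible\b/gi, "$1ou may be eligible");
}

// Refuse to render a result that has no human verification point attached.
function renderAssessment(a: Assessment): string {
  if (a.verifyWith.trim() === "") {
    throw new Error("missing human verification point");
  }
  return `${a.program}: ${enforceConditionalFraming(a.summary)} Confirm with: ${a.verifyWith}.`;
}
```

A post-processing pass like `enforceConditionalFraming` is only a backstop for a fixed set of phrasings; it complements prompt-level instructions rather than replacing them.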

What We Learned

We learned that when applying data science to civic tech, the architecture of the guardrails is just as critical as the intelligence of the model.
