💡 Inspiration
A friend of mine, a first-year PhD student from China, spent her entire first semester overpaying taxes because she didn't know about her FICA exemption. The answer was buried in IRS Publication 519, page 34. She had asked three people and gotten three wrong answers.
That moment stuck with us. International students aren't short on ability; they're navigating a system that was never built for them. Roughly 1.1 million international students are studying in the US, often with no financial orientation. ClearPath is that orientation.
🏗️ How We Built It
We built five specialized AI agents orchestrated by LangGraph, each an expert in a different domain: visa rules, taxes, banking, remittances, and credit building.
A supervisor node routes every user query to the right agent based on keywords. Each agent retrieves relevant chunks from a ChromaDB vector database we built from real IRS publications and USCIS documents, then calls Claude with a domain-specific system prompt. The result is answers grounded in actual regulatory text — not hallucinations.
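As a rough illustration of keyword routing, the sketch below scores each agent by how many of its keywords appear in the query and picks the highest scorer. The keyword lists, the route_query name, and the default agent are assumptions made for this example; the writeup does not list the real ones, and in the project this logic would sit inside the LangGraph supervisor node.

```python
from typing import Dict, List

# Hypothetical keyword table: the real project's keywords are not published.
ROUTES: Dict[str, List[str]] = {
    "visa": ["visa", "f-1", "i-20", "uscis"],
    "tax": ["tax", "fica", "treaty", "1040"],
    "bank": ["bank", "checking", "savings"],
    "remittance": ["remit", "wire", "send money"],
    "credit": ["credit", "score", "secured card"],
}


def route_query(query: str, default: str = "visa") -> str:
    """Return the agent whose keywords match the query most often.

    Matching is plain substring search on the lowercased query, so short
    keywords can over-match; a real router would likely tokenize first.
    The default agent is an assumption, used when nothing matches.
    """
    text = query.lower()
    scores = {
        agent: sum(keyword in text for keyword in keywords)
        for agent, keywords in ROUTES.items()
    }
    best = max(scores, key=scores.get)  # ties go to the first agent listed
    return best if scores[best] > 0 else default
```

Keeping the table as data means a new domain can be added without touching the routing function.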
The Financial Simulator runs the bank and credit agents in parallel with asyncio.gather, feeds both results into a single Claude synthesis prompt, and returns a structured 12-month JSON roadmap in one shot.
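The fan-out and synthesis pattern described above might look roughly like this. The three coroutines are stubs with hypothetical names and signatures: in the real simulator the first two would call the bank and credit agents and the third would send one synthesis prompt to Claude. Only asyncio.gather is taken from the writeup.

```python
import asyncio
from typing import Any, Dict


async def bank_agent(profile: Dict[str, Any]) -> str:
    """Stub for the bank agent (hypothetical signature)."""
    await asyncio.sleep(0.01)  # stands in for retrieval and model latency
    return f"placeholder bank advice for {profile['city']}"


async def credit_agent(profile: Dict[str, Any]) -> str:
    """Stub for the credit agent (hypothetical signature)."""
    await asyncio.sleep(0.01)
    return f"placeholder credit advice for {profile['city']}"


async def synthesize(bank: str, credit: str) -> Dict[str, Any]:
    """Stub for the single synthesis call that returns the roadmap."""
    months = [
        {"month": m, "actions": [bank if m == 1 else credit]}
        for m in range(1, 13)
    ]
    return {"months": months}


async def simulate(profile: Dict[str, Any]) -> Dict[str, Any]:
    # Both agents run concurrently; gather returns results in argument order.
    bank, credit = await asyncio.gather(
        bank_agent(profile), credit_agent(profile)
    )
    return await synthesize(bank, credit)
```

Calling asyncio.run(simulate({"city": "Boston"})) returns a dictionary with a 12-entry "months" list. Because the two agents are awaited together, the wait is close to the slower of the two calls, not their sum.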
For the Bank Finder, we made a deliberate choice to replace the Google Maps API with pre-loaded static branch data for 20 major US university cities — faster, more reliable, and zero ongoing cost.
📚 What We Learned
Prompt engineering is architecture. The difference between a useful agent and a hallucinating one isn't the model; it's the system prompt. Constraining each agent to a specific output format (the Tax Agent always ends with a "next step", the Visa Agent always uses ✅ / ❌ / 📋 sections) transformed vague AI outputs into scannable, actionable answers.
We also learned that RAG grounding is non-negotiable for anything compliance-adjacent. Without it, the Tax Agent would confidently give wrong treaty eligibility answers. With it, responses cite specific IRS publication sections.
🚧 Challenges
Parsing IRS PDFs: Publication 519 uses multi-column layouts that pypdf extracts as jumbled text. We tuned chunk overlap and retrieval filters to make the RAG usable despite the messy source.
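To show what chunk overlap buys with messy extraction, here is a minimal character-based chunker. The function name and the default sizes are illustrative assumptions; the writeup does not state the values the team settled on or whether they split on characters, words, or tokens.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> List[str]:
    """Split text into fixed-size chunks that share `overlap` characters.

    Overlap means a sentence cut at one chunk boundary still appears whole
    in the neighboring chunk, which helps when extraction order is jumbled.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks
```

A larger overlap raises the chance that a retrieved chunk contains a complete rule, at the cost of storing and embedding more near-duplicate text.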
Claude wrapping JSON in markdown: The simulator asks for pure JSON, but Claude occasionally wraps it in a markdown code fence. We wrote a _json_from_text() fallback that uses a regex to extract the first {...} block, so a fenced response still parses instead of failing the request.
Graceful degradation: Every external call can fail. We built three fallback layers: agent-level hardcoded responses, a deterministic simulator fallback, and a MOCK_MODE that bypasses all API calls. The app never shows a blank screen, even with no API keys configured.
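The agent-level layer and the mock switch could be sketched as follows. Every name here (FALLBACK_ANSWERS, GENERIC_FALLBACK, answer, the call_model parameter) is hypothetical, and the canned strings are placeholders; only the idea of hardcoded responses and a MOCK_MODE flag comes from the writeup.

```python
from typing import Callable, Dict

# Placeholder canned responses; the project's real fallback text is not shown.
FALLBACK_ANSWERS: Dict[str, str] = {
    "tax": "Tax help is temporarily unavailable. Please try again shortly.",
}
GENERIC_FALLBACK = "This service is temporarily unavailable. Please try again shortly."


def answer(
    agent: str,
    query: str,
    call_model: Callable[[str, str], str],
    mock_mode: bool = False,
) -> str:
    """Return a live answer when possible and a canned one otherwise."""
    canned = FALLBACK_ANSWERS.get(agent, GENERIC_FALLBACK)
    if mock_mode:
        return canned  # skip the external call entirely
    try:
        return call_model(agent, query)
    except Exception:
        # Any failure (missing key, timeout, bad response) degrades to text.
        return canned
```

In an app, mock_mode would probably be read once from an environment variable at startup, so a demo can run with no keys configured.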