Inspiration

Millions of Americans, including gig workers, immigrants, veterans, single parents, and the unbanked, are told to "build an emergency fund" and "pay off debt" without ever being told which one comes first, or why. The Financial Order of Operations (FOO) exists, but it assumes you have a salary, a bank account, and a credit history. We built PennyPath for everyone that framework quietly ignores.

What It Does

PennyPath takes 8 questions about your real financial situation and returns a mathematically ordered, personalized action plan.

  • A GraphRAG engine traverses a 4,194-entity knowledge graph built from verified CFPB, FEMA, and State Farm documents to find guidance specific to your persona (veteran, gig worker, immigrant, etc.)
  • A deterministic FOO engine orders your steps by actual financial math, because paying off credit card debt at 28% APR always beats earning 0.5% in a savings account
  • Gemini personalizes the language so it feels human
  • A What-If simulator lets you instantly see how your plan changes if you lose your job, pay off a debt, or open a bank account, in about 50ms with no re-query needed
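
The APR comparison in the list above can be made concrete with a minimal sketch. It assumes each candidate action can be reduced to a single guaranteed annual rate (interest avoided or interest earned). The `Action` class and the `order_by_rate` and `yearly_value` functions are hypothetical names for illustration, not PennyPath's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    label: str
    annual_rate: float  # guaranteed yearly return or avoided interest, as a fraction


def order_by_rate(actions):
    """Sort actions so the highest guaranteed annual rate comes first."""
    return sorted(actions, key=lambda a: a.annual_rate, reverse=True)


def yearly_value(action, dollars):
    """Dollars gained (or interest avoided) in one year, ignoring compounding."""
    return dollars * action.annual_rate


actions = [
    Action("Add to savings account", 0.005),  # 0.5% yield
    Action("Pay down credit card", 0.28),     # 28% APR avoided
]
ordered = order_by_rate(actions)
```

Under this simplification, $1,000 put toward the card avoids about $280 of interest in a year, while the same $1,000 in savings earns about $5.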

How We Built It

  • Knowledge Graph: GraphRAG (Microsoft) + Gemini 2.5 Flash extracted 4,194 entities and 8,066 relationships from 26 real documents (CFPB toolkits, FEMA guides, State Farm articles)
  • Graph Traversal: Custom query_graph.py maps user answers to entity clusters and detects personas from which corpus nodes the traversal touches
  • FOO Engine: 10 deterministic rules in foo_engine.py order the steps by financial math, with State Farm products embedded at the mathematically correct position
  • Personalization: A single Gemini 2.5 Flash call rewrites step labels in warm, specific language, echoing the user's own words from their free-text response
  • Frontend: HTML, Tailwind CSS, vis.js, and vanilla JS with hop-by-hop GraphRAG traversal animation, interactive knowledge graph, and What-If diff badges
  • Backend: Python, Flask, NetworkX, and NumPy powering 7 API endpoints. NetworkX handles all graph traversal and Louvain community detection, while NumPy drives the cosine similarity calculations for semantic search
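
The cosine similarity step mentioned for semantic search can be sketched in a few lines of NumPy. This is an illustration under assumptions: the `top_k` helper, the toy 3-dimensional vectors, and the idea that entity embeddings sit in one 2-D array are ours, not details taken from the project.

```python
import numpy as np


def cosine_similarity(query, matrix):
    """Cosine similarity between one query vector and every row of matrix."""
    query = np.asarray(query, dtype=float)
    matrix = np.asarray(matrix, dtype=float)
    denom = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
    denom = np.where(denom == 0.0, 1.0, denom)  # all-zero vectors score 0, not NaN
    return (matrix @ query) / denom


def top_k(query, matrix, k=2):
    """Indices of the k most similar rows, best match first."""
    scores = cosine_similarity(query, matrix)
    return np.argsort(-scores)[:k].tolist()


# Toy embeddings (hypothetical): one row per graph entity.
entity_vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
])
query_vector = np.array([1.0, 0.0, 0.0])
scores = cosine_similarity(query_vector, entity_vectors)
best = top_k(query_vector, entity_vectors, k=2)
```

The indices in `best` could then serve as seed nodes for a graph traversal, though how the project actually connects search results to traversal is not described here.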

Challenges We Ran Into

  • State Farm PDF authentication: Scraping State Farm's Simple Insights articles hit auth walls, which cost us significant time early in the data pipeline and forced us to find a workaround to keep the knowledge graph grounded in real source material
  • GraphRAG indexing time: Mapping 634K characters across 26 documents to entities, relationships, and 21 Louvain communities was the single most time-consuming step. Running on free-tier Gemini keys meant juggling multiple API keys and hitting rate limits constantly, which broke the pipeline mid-run more than once
  • Choosing the right Gemini model: Gemini 2.5 Pro was too slow and expensive for real-time traversal; 2.5 Flash hit the right balance of speed, quality, and cost, but it took several failed runs to land there
  • GraphRAG traversal animation: Making the hop-by-hop path illumination feel smooth (nodes scaling 3.2×, edge glow, breadcrumb trail) in vis.js without the graph layout jumping or flickering was genuinely the hardest frontend problem we faced
  • FOO rule conflicts: Edge cases where two rules applied simultaneously (e.g., savings under $500 AND high APR debt) required careful ordering logic so the engine did not contradict itself
  • Persona overlap: A user who is both a veteran and a single parent touches two entity clusters; making multi-persona traversal additive rather than conflicting took real iteration
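
One way to resolve the rule conflicts described above is to give every rule an explicit priority and let all matching rules fire in that fixed order, so two rules never compete for the same slot. The sketch below is hypothetical: the rule names, thresholds, and priority values are illustrative assumptions, not the contents of foo_engine.py.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    priority: int  # lower number = earlier in the plan
    step: str
    applies: Callable[[dict], bool]


# Illustrative rules only; thresholds and ordering are assumptions.
RULES = [
    Rule(20, "Pay down high-APR debt", lambda p: p["max_apr"] >= 0.20),
    Rule(10, "Build a $500 starter buffer", lambda p: p["savings"] < 500),
    Rule(30, "Grow emergency fund to 3 months",
         lambda p: p["savings"] < 3 * p["monthly_expenses"]),
]


def build_plan(profile):
    """Return every applicable step, ordered by priority (deterministic)."""
    matched = [r for r in RULES if r.applies(profile)]
    return [r.step for r in sorted(matched, key=lambda r: r.priority)]


# Both the low-savings rule and the high-APR rule apply to this profile.
profile = {"savings": 200, "max_apr": 0.28, "monthly_expenses": 1500}
plan = build_plan(profile)
```

Because the order depends only on the priority field, the same profile always produces the same plan, regardless of the order in which rules are declared.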

Accomplishments That We're Proud Of

  • A knowledge graph built entirely from real, verified sources including CFPB toolkits, FEMA emergency guides, and State Farm articles, totaling 26 documents and 634K characters, with zero fabricated financial data. Every entity, relationship, and community in the graph traces back to a document published by a real agency or institution
  • Nine distinct underserved populations served in a single app, each with a genuinely different plan output
  • A What-If simulator that reruns in about 50ms by decoupling the FOO engine from GraphRAG
  • Persona detection from graph traversal, not just self-selection. The graph infers who you are from which nodes your answers touch
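
The decoupling behind the What-If simulator can be sketched as follows: the graph query runs once, its result is kept, and each What-If change reruns only the rule engine. Everything named here (`retrieve_guidance`, `order_steps`, `what_if`, and the profile fields) is a hypothetical stand-in, and the stubbed retrieval does not call GraphRAG.

```python
def retrieve_guidance(profile):
    """Stand-in for the slow graph query; a real system would traverse the graph."""
    retrieve_guidance.calls += 1
    return {"persona": profile.get("persona", "general")}


retrieve_guidance.calls = 0  # call counter, used only to show the query runs once


def order_steps(profile, guidance):
    """Stand-in for the deterministic engine: a pure function of its inputs."""
    steps = []
    if not profile["has_bank_account"]:
        steps.append("Open a bank account")
    if profile["has_debt"]:
        steps.append("Pay down debt")
    steps.append(f"Review {guidance['persona']} resources")
    return steps


def what_if(profile, guidance, **changes):
    """Rerun only the rule engine on a modified copy of the profile."""
    before = order_steps(profile, guidance)
    after = order_steps({**profile, **changes}, guidance)
    return {
        "plan": after,
        "added": [s for s in after if s not in before],
        "removed": [s for s in before if s not in after],
    }


profile = {"persona": "veteran", "has_bank_account": False, "has_debt": True}
guidance = retrieve_guidance(profile)  # slow path, runs once
result = what_if(profile, guidance, has_bank_account=True)
```

The `added` and `removed` lists are the kind of data a frontend could render as diff badges; the original profile is never mutated, so several scenarios can be compared against the same baseline.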

What We Learned

  • GraphRAG is powerful but operationally brutal; rate limits, indexing time, and model selection decisions compound fast under a 48-hour clock
  • Separating the GraphRAG layer, the FOO engine, and the personalization layer into three distinct modules taught us that AI pipelines are most reliable when the LLM is doing the smallest possible job. In our case, Gemini only touches language, never logic
  • Deterministic rule engines and LLMs are stronger together than either alone; the math keeps the output defensible, and the LLM keeps the language approachable and specific to your situation
  • Building for 9 underserved personas meant putting ourselves in each person's shoes before writing a single line of code. Generic financial advice is not just unhelpful for these populations; it can actively mislead them. Responsible financial guidance means knowing when to withhold a recommendation as much as when to make one
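
The "language, never logic" boundary can be enforced in code rather than by convention. In this minimal sketch, the rewriter may only return new labels keyed by step ID; the order always comes from the deterministic plan, and any missing or blank label falls back to the original. `personalize` and `fake_rewriter` are hypothetical names, and the fake rewriter stands in for the Gemini call, whose actual interface is not shown in this writeup.

```python
def personalize(plan, rewriter):
    """Apply rewritten labels without letting the rewriter change order or membership.

    plan: list of (step_id, label) tuples in final order.
    rewriter: callable returning {step_id: new_label}; treated as untrusted.
    """
    try:
        rewritten = rewriter(plan)
    except Exception:
        rewritten = {}  # on any rewriter failure, keep the original labels
    if not isinstance(rewritten, dict):
        rewritten = {}

    result = []
    for step_id, label in plan:
        new_label = rewritten.get(step_id)
        if isinstance(new_label, str) and new_label.strip():
            label = new_label.strip()
        result.append((step_id, label))
    return result


def fake_rewriter(plan):
    """Stand-in for an LLM call: returns an unknown ID and a blank label on purpose."""
    return {
        "debt": "Knock out that 28% card first",
        "invented": "Buy crypto",  # unknown ID: ignored
        "buffer": "   ",           # blank label: ignored
    }


plan = [("buffer", "Build a starter buffer"), ("debt", "Pay down high-APR debt")]
final = personalize(plan, fake_rewriter)
```

With this shape, the worst a misbehaving rewriter can do is leave a label unchanged; it cannot add, drop, or reorder steps.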

What's Next for PennyPath

  • Expand the knowledge graph with state-specific benefit data (Medicaid, SNAP, housing assistance) so recommendations are geographically accurate
  • Multilingual support: Many of our target personas, including recent immigrants and international students, face an additional barrier when financial guidance is available only in English. Supporting more languages is a natural next step, and Gemini makes it feasible
  • Progress tracking: Let users mark steps complete and watch their graph update in real time
  • Open banking integration: Pull real account balances to pre-fill intake answers and make the FOO engine live
