Inspiration

Every year, 600,000 people walk out of U.S. prisons and into a world that wasn't designed to welcome them back. One in 55 Americans has a felony conviction, and each faces a web of federal restrictions and re-entry barriers after release.

The numbers show systematic abandonment:

  • 27% unemployment among formerly incarcerated people, eight times the national average. More than half can't find stable work within their first year out.
  • Formerly incarcerated people are 10x more likely to be homeless. In large cities, up to 50% of people leaving prison have no stable housing waiting for them.
  • Two out of three are rearrested within three years. Within ten years, that number climbs to 82%.

Most "AI for social good" projects address this problem with a chatbot skin over an LLM that hallucinates eligibility rules and tells you to "contact your local office for more information." So we took a different approach; in a landscape of shiny mocked and stubbed UIs, we decided to put our heads down and just build actually useful integrations. We wanted to die on the hill of testing and iterating during this hackathon.

This project is simple and "unsexy" for a reason. Every feature you see prioritizes reliability and function, and is a tool or subagent that is tested to perform actual tasks for you.

VCs are pouring money into agents to perform similar workflows in enterprise settings. But that same level of investment hasn't reached reentry community centers or caseworkers, who face equally complex coordination challenges. At YHack, we wanted to explore what it would look like to apply this technology in that context. The leverage of agents is undeniable, and we wanted to direct it toward a problem that's too often overlooked.

What it does

Threshold is a local-first, trauma-informed AI assistant that helps people navigate re-entry after incarceration. It combines deep domain knowledge with a multi-agent architecture to provide real, actionable guidance across five critical areas:

Housing — Searches real HUD housing counselor databases, SAMHSA recovery housing listings, and 211 community resource APIs. Includes a hand-researched database of Connecticut reentry housing programs that actually accept people with records — with real phone numbers, real addresses, real intake requirements. Tracks applications through 14 stages (from discovery to move-in). Looks up fair-chance housing laws by state. Accesses HUD fair market rent data for voucher holders.

Employment — Searches live job listings via the Adzuna API with awareness of ban-the-box laws across 16+ states. Maintains a database of 150+ verified second-chance employers (Walmart, Amazon, FedEx, Goodwill, and more) with proven fair-chance hiring practices. Generates tailored resumes with forward-looking conviction disclosure strategies. Tracks applications from submission through offer with an 11-stage pipeline.

Benefits — Calculates Connecticut-specific eligibility for SNAP, Medicaid (HUSKY A/B/C/D), and Medicare Savings Programs using real 2026 federal poverty guidelines and 7 CFR 273 rules. This is deterministic Python logic, not an LLM guessing: gross income tests, net income tests, earned income deductions, utility allowances, and dependent care deductions are all encoded as code. Accounts for drug felony opt-out states and estimates monthly benefit amounts.
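As a rough illustration of what "eligibility as code" can look like, here is a minimal sketch of the federal SNAP gross and net income tests (130% and 100% of the poverty line, with the 20% earned income deduction). It is not Threshold's actual engine. The `Household` class, the function name, and the parameters are hypothetical, the poverty-line and standard-deduction figures are passed in rather than hard-coded, and shelter and utility deductions and Connecticut-specific rules are left out.

```python
from dataclasses import dataclass


@dataclass
class Household:
    size: int
    earned_monthly: float        # wages before taxes
    unearned_monthly: float      # e.g. unemployment or disability payments
    dependent_care: float = 0.0  # monthly dependent care costs


def snap_income_tests(hh: Household, monthly_fpl: float, standard_deduction: float) -> dict:
    """Apply the federal gross (130% FPL) and net (100% FPL) income tests.

    monthly_fpl and standard_deduction come from the caller; this sketch
    deliberately does not hard-code any year's published figures.
    """
    gross = hh.earned_monthly + hh.unearned_monthly
    gross_limit = 1.30 * monthly_fpl

    # 20% earned income deduction, then the standard and dependent care deductions.
    net = gross - 0.20 * hh.earned_monthly - standard_deduction - hh.dependent_care
    net = max(net, 0.0)
    net_limit = 1.00 * monthly_fpl

    passes_gross = gross <= gross_limit
    passes_net = net <= net_limit
    return {
        "gross_income": gross,
        "passes_gross_test": passes_gross,
        "net_income": net,
        "passes_net_test": passes_net,
        "eligible": passes_gross and passes_net,
    }
```

Because every step is plain arithmetic, the same inputs always produce the same answer, and each rule can be covered by a unit test.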

Legal Navigation — Tracks parole and probation conditions with proactive reminders ("Your Friday check-in is tomorrow"). Provides state-specific ID restoration guides (birth certificate, Social Security card, state ID). Checks expungement eligibility by state and offense category.

Government Form Auto-Fill — Using Claude's computer use capability with a real Browserbase remote browser, Threshold pre-fills .gov forms with your profile data. Not a screenshot. Not a simulation. A real browser session you can watch live as it types into actual government websites. You review everything and click submit yourself — the AI never submits for you. Only government domains are allowed.

Voice Intake Interview — A real-time voice conversation powered by Pipecat, Deepgram speech-to-text, and ElevenLabs text-to-speech. Grounded in motivational interviewing (MI) techniques: open questions, affirmations, reflections, and summaries. An engagement tracker monitors response latency and length in real time to adapt the conversation's emotional tone. After the interview, the system generates a person-centered summary, highlight reel, and care plan seed.
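One way an engagement tracker like this could work is sketched below: keep a short rolling window of response latency and answer length, and return a coarse tone hint for the interviewer. The class name, window size, thresholds, and tone labels are all assumptions made for illustration, not the project's actual values.

```python
from collections import deque
from statistics import mean


class EngagementTracker:
    """Rolling view of how quickly and how fully a person is responding."""

    def __init__(self, window: int = 5, slow_seconds: float = 6.0, short_words: int = 4):
        # All three thresholds are illustrative guesses, not tuned values.
        self.latencies = deque(maxlen=window)
        self.lengths = deque(maxlen=window)
        self.slow_seconds = slow_seconds
        self.short_words = short_words

    def record(self, latency_seconds: float, transcript: str) -> None:
        """Store one turn: seconds before the person answered, and what they said."""
        self.latencies.append(latency_seconds)
        self.lengths.append(len(transcript.split()))

    def tone(self) -> str:
        """Return a coarse hint the interviewer prompt can use."""
        if not self.latencies:
            return "neutral"
        slow = mean(self.latencies) > self.slow_seconds
        short = mean(self.lengths) < self.short_words
        if slow and short:
            return "slow_down"  # long pauses and very brief answers
        if slow or short:
            return "gentle"
        return "neutral"
```

A caller would invoke `record()` after each transcribed turn and feed `tone()` into the next prompt.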

Document Intelligence — Upload a photo of a court order, SNAP approval letter, or parole document, and Gemini 2.5 Flash extracts structured data via OCR, then maps it to your profile fields.

Always-On Protections:

  • Crisis protocol — If a user expresses suicidal ideation or acute emotional crisis, the system immediately surfaces 988, Crisis Text Line, and SAMHSA resources. No delegation, no delay, no exceptions.
  • Privacy by design — The system never references your conviction or offense unless you bring it up first. When disclosure is necessary (like for a ban-the-box application), it's framed as forward-looking, never apologetic.

The frontend is a responsive web app with a real-time streaming chat interface over WebSocket, document vault with OCR upload, and dedicated dashboards for housing, employment, and benefits. Designed for budget smartphones: 48px touch targets, 18px base font size, works on a $50 Android phone with intermittent connectivity.

And none of this is mocked!

Every API call in Threshold hits a real endpoint. Every tool the agent invokes works.

| Feature | What's actually happening |
| --- | --- |
| Job search | Live Adzuna API queries returning real listings, filtered through a 150+ employer fair-chance database |
| Housing search | Real HUD Housing Counselor API + SAMHSA treatment locator + 211 community resources + hand-researched reentry programs |
| Benefits eligibility | Deterministic Python engine with 2026 FPL thresholds, all SNAP deduction categories, Medicaid pathway logic |
| Form auto-fill | Real Browserbase browser session with Claude computer use; you can watch it type into .gov forms live |
| Voice interview | Real Deepgram STT → LLM → ElevenLabs TTS pipeline over WebRTC, with MI-grounded conversation flow |
| Document OCR | Real Gemini 2.5 Flash extraction → schema mapping pipeline |
| Chat | Real FastAPI backend with WebSocket streaming, not a static demo |
| Multi-agent routing | Real LangGraph orchestrator dispatching to 7 specialized subagents, each backed by a real LLM |

How we built it

We designed a 4-tier capability model to avoid the trap of throwing an LLM at every problem:

  1. System prompt (free) — Trauma-informed principles, routing rules, crisis protocol
  2. @tool functions (deterministic) — Eligibility calculations, database lookups, supervision tracking. Fast, reliable, no hallucination risk.
  3. Markdown workflows (guided generation) — Cover letters, resumes, housing application letters. The agent reads step-by-step instructions and applies its own reasoning.
  4. Full subagents (autonomous) — Multi-step tasks like housing search (search, filter, match, draft, track) that require their own planning. The model completes entire workflows independently.
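The four tiers above can be thought of as a lookup from task to the cheapest capability that handles it. The sketch below is only an illustration of that idea: the `Tier` enum, the task names, and the choice to fall back to a full subagent for unknown tasks are assumptions, not the project's actual routing code.

```python
from enum import IntEnum


class Tier(IntEnum):
    SYSTEM_PROMPT = 1  # always-on principles, nothing to dispatch
    TOOL = 2           # deterministic function
    WORKFLOW = 3       # markdown instructions the agent follows
    SUBAGENT = 4       # autonomous multi-step planner


# Hypothetical registry: task name -> lowest tier that can handle it.
CAPABILITIES = {
    "snap_eligibility": Tier.TOOL,
    "supervision_lookup": Tier.TOOL,
    "cover_letter": Tier.WORKFLOW,
    "housing_search": Tier.SUBAGENT,
}


def pick_tier(task: str) -> Tier:
    """Return the registered tier, defaulting to a full subagent for unknown tasks."""
    return CAPABILITIES.get(task, Tier.SUBAGENT)
```

Using an `IntEnum` keeps the tiers ordered, so a caller can compare them directly, for example to log whenever a request escalates above `Tier.TOOL`.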

Multi-model orchestration. Different tasks need different models, so we split the system by capability instead of forcing one model to do everything. The main orchestrator runs on xAI Grok 4.1 Fast, our default model for routing, delegation, and most domain subagents because it is fast and cost-efficient for high-volume agent turns. Nearly all of the specialist subagents also use Grok 4.1 Fast, including benefits, housing, employment, legal, community, and supervision. For browser-based government form completion, we switched to Anthropic Claude Sonnet 4.6 because it is well suited to computer use workflows that require screenshot-based reasoning, UI navigation, and careful multi-step interaction. LangGraph and deepagents coordinate this agent architecture, manage state, preserve checkpoints, and let these subagents work together inside one system without collapsing everything into a single monolithic agent. For document OCR and schema mapping, we use Google Gemini 2.5 Flash to extract structured information from uploaded files before writing it into the user profile. We also built a separate voice intake pipeline that combines Deepgram for speech-to-text, OpenAI GPT-4o mini for live interview reasoning and tool use, and ElevenLabs for text-to-speech.

Backend: Python 3.13, FastAPI, LangGraph + deepagents for multi-agent orchestration, SQLite with Fernet encryption at rest.

Frontend: React 19 + TypeScript + Vite, TailwindCSS v4, Zustand for state management, WebSocket streaming with exponential backoff reconnection. Material Symbols icons. Framer Motion for subtle animations.
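The reconnection logic itself lives in the TypeScript frontend. To keep examples in one language, here is the general shape of an exponential backoff schedule sketched in Python. The base delay, cap, and jitter parameters are assumptions chosen for illustration, not the values the app uses.

```python
import random
from typing import List, Optional


def backoff_delays(
    attempts: int,
    base: float = 0.5,
    cap: float = 30.0,
    jitter: float = 0.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return the wait (in seconds) before each reconnect attempt.

    The delay doubles on every attempt and is clamped to `cap`. A non-zero
    `jitter` adds up to that fraction of the delay at random, which keeps many
    clients from reconnecting at the same instant.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * 2 ** attempt)
        delay += rng.uniform(0, jitter * delay)
        delays.append(delay)
    return delays
```

With the defaults and no jitter, the first five waits are 0.5, 1, 2, 4, and 8 seconds, and later attempts settle at the 30-second cap, which matters on a prepaid data plan where tight retry loops waste data.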

Trauma-informed design is core to every decision, from using amber instead of red for alerts (red means "violation" to someone on parole) to explaining why before asking sensitive questions. The UI feels homey and warm, not like logging into another government portal.

Challenges we ran into

Encoding legal complexity as data, not prompts. SNAP eligibility alone involves federal drug felony bars with 23 state opt-outs, gross and net income tests with multiple deduction categories, and household size calculations. We had to encode all of this as deterministic Python logic — one hallucinated eligibility answer could cost someone months of food assistance.

Making crisis response un-delegable. In a multi-agent system, the natural pattern is to route everything through subagents. We had to ensure the orchestrator catches crisis signals before any delegation happens and short-circuits the entire agent pipeline.
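A minimal sketch of that short-circuit pattern follows. The check runs before the router is ever called, so no subagent can swallow the message. The phrase list, function names, and response text are illustrative assumptions; a real detector would need far broader coverage and clinical review, and the source does not say how Threshold detects crisis signals.

```python
import re
from typing import Callable

# Illustrative phrases only; not a clinically reviewed list.
CRISIS_PATTERNS = [
    r"\bkill myself\b",
    r"\bend my life\b",
    r"\bsuicid(e|al)\b",
    r"\bdon'?t want to (live|be here)\b",
]

CRISIS_RESPONSE = (
    "You are not alone. You can call or text 988 (Suicide & Crisis Lifeline) "
    "right now. Crisis Text Line and SAMHSA resources are also available."
)


def is_crisis(message: str) -> bool:
    """Return True if the message matches any crisis phrase."""
    text = message.lower()
    return any(re.search(pattern, text) for pattern in CRISIS_PATTERNS)


def handle_turn(message: str, delegate: Callable[[str], str]) -> str:
    """Check for crisis signals first; only non-crisis turns reach the subagents."""
    if is_crisis(message):
        return CRISIS_RESPONSE  # delegate is never called on this path
    return delegate(message)
```

Keeping the check outside the agent graph means a routing bug or a slow subagent cannot delay the crisis response.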

Multi-provider model orchestration. We use models from four different LLM providers (xAI, Anthropic, Google, and OpenAI), plus Browserbase for the remote browser behind computer use. Each has different APIs, rate limits, and failure modes. Getting them to work together seamlessly through LangGraph required careful error handling and fallback logic.
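The fallback idea can be sketched independently of any particular SDK: try each provider in order, retry a couple of times, and move on when it keeps failing. Everything below is hypothetical (the function name, the retry count, and the callables standing in for provider clients), and a real version would catch provider-specific errors rather than a bare `Exception`.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has been exhausted."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
    retries_per_provider: int = 2,
) -> Tuple[str, str]:
    """Return (provider_name, response) from the first provider that succeeds."""
    errors: List[str] = []
    for name, call in providers:
        for attempt in range(1, retries_per_provider + 1):
            try:
                return name, call(prompt)
            except Exception as exc:  # sketch only; narrow this in real code
                errors.append(f"{name} attempt {attempt}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the response makes it easy to log which model actually answered a given turn.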

Designing for the actual user. Our target user might be on a bus with a cracked-screen budget Android phone and a prepaid data plan. Every design decision, from the 48px touch targets to the 18px base font to the offline-capable architecture, is made for that user.

Government form safety. Browser automation is inherently nondeterministic due to dynamic UIs, latency, and race conditions, especially on government sites. In a high-stakes context, small errors can delay access to critical services. We mitigate this with a strict .gov allowlist, no auto-submit, and full user-in-the-loop review.
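A domain allowlist of this kind can be a few lines of code. The sketch below checks the parsed hostname rather than searching the raw URL string, so look-alike addresses such as `ssa.gov.evil.com` are rejected. The function name is hypothetical, and requiring `https` is an extra assumption on top of the .gov rule described above.

```python
from urllib.parse import urlparse


def is_allowed_gov_url(url: str) -> bool:
    """Allow only https URLs whose hostname ends in .gov."""
    parsed = urlparse(url)
    # hostname is already lowercased by urlparse; strip a trailing dot if present.
    host = (parsed.hostname or "").rstrip(".")
    return parsed.scheme == "https" and host.endswith(".gov")
```

The browser agent would call this before every navigation and refuse to proceed on `False`.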

Accomplishments that we're proud of

  • 17+ domain-specific tools with real eligibility rules, real program databases, and real legal frameworks — not a chatbot wrapper that says "consult a professional"
  • Everything actually works. Real API calls, real eligibility math, real browser automation, real voice pipeline. No mocks, no stubs, no placeholder data masquerading as features
  • Crisis protocol that interrupts everything. No edge case, no delegation path, no prompt injection can prevent a person in crisis from seeing 988 and SAMHSA resources immediately
  • Concrete, actionable guidance. "Call Hartford Housing Authority at (860) 723-8400 and ask about the Project-Based Voucher waitlist", not "look into housing programs in your area."
  • Action-oriented. Instead of just telling you what to do, Threshold interfaces with APIs and the browser to fill out applications for benefits, jobs, and housing, then hands the final review and submit back to you.

  • We built a full-stack, multi-agent, multi-model AI application with encrypted storage, real-time streaming, multiple specialized subagents, and a responsive frontend in under 24 hours!

What we learned

  • Re-entry support is shockingly fragmented. There is no single source of truth for what a returning citizen is eligible for. Rules vary by state, county, offense category, time since release, and dozens of other factors.
  • Trauma-informed design changes everything. Small choices can fundamentally change how a system feels.
  • Not every task needs an LLM. Our 4-tier model taught us that the cheapest, fastest, most reliable answer is often a Python function, not a language model. Eligibility rules should be code. Writing tasks should be guided. Only genuinely complex, multi-step reasoning needs a full agent.

What's next

  • Expand beyond Connecticut — The architecture is state-aware by design, so adding new states is a matter of adding eligibility rules and program databases
  • Push notifications for supervision reminders, application deadlines, and follow-up alerts
  • End-to-end encrypted cloud sync for multi-device access without compromising the local-first privacy model
  • Partnerships with Yale Prison Initiative for real-world testing and feedback

Built With

  • anthropic-claude
  • browserbase
  • deepagents
  • fastapi
  • framer-motion
  • google-gemini
  • langgraph
  • python
  • react
  • sqlite
  • tailwindcss
  • typescript
  • vite
  • websocket
  • xai-grok