Inspiration

The spark came from a simple, devastating truth: $60 billion in life-saving benefits go unclaimed every year—not because people don't qualify, but because the forms are too hard to complete.

We watched single parents sacrifice paid work for 20+ hours of paperwork. We saw elderly citizens give up on benefits they desperately needed because the "legalese wall" was insurmountable. We witnessed non-English speakers completely locked out of government services that were designed to help them.

The breaking point? Realizing that wealthy people never face this problem. They have lawyers, accountants, and personal assistants to handle bureaucracy. Everyone else is left to navigate the maze alone.

We asked ourselves: What if we could democratize that entourage? What if every citizen had a tireless advocate working for them 24/7?

That's when Arlis was born—not as a chatbot, but as an autonomous social worker that fights bureaucracy so you don't have to.


What it does

Arlis is an AI-powered autonomous social worker that transforms government bureaucracy from a nightmare into a simple conversation.

Core Capabilities

Visual Intelligence (Gemini Vision)

  • Reads crumpled letters, blurry PDFs, and low-quality scans that break standard OCR
  • Extracts structured data from messy documents (W-2s, pay stubs, benefit letters)
  • Builds a secure "Life Vault" of your identity documents—never hunt for a birth certificate again

Strategic Planning (Gemini 3 Pro)

  • Analyzes complex multi-step government applications
  • Creates execution plans with conditional logic and error handling
  • Reads 50-page policy PDFs to determine eligibility automatically

Autonomous Execution (Browser Automation)

  • Navigates government portals autonomously using Playwright + Stagehand
  • Fills forms, uploads documents, and submits applications
  • Self-corrects when errors occur—no human intervention needed

Real-Time Transparency (Supabase Realtime)

  • Live progress updates as Arlis works in the background
  • Screenshot previews of what Arlis is doing
  • Complete audit trail of every action taken

Universal Accessibility

  • Translates complex legal requirements into plain language in 35+ languages
  • Voice interface for non-English speakers and accessibility needs
  • Works across federal, state, and local government portals

The 4-Phase Sprint

Arlis orchestrates autonomous execution through a sophisticated workflow:

  1. Hydration - Gathers context from documents, conversations, and memory
  2. Planning - Strategizes multi-step approach with error handling
  3. Execution - Fills forms, navigates portals, submits applications
  4. Verification - Confirms accuracy and flags any issues

Real-World Impact

From 20 hours to 60 seconds

  • SNAP benefits application: 3-5 hours to 2 minutes
  • DMV license renewal: 1-2 hours to 90 seconds
  • Immigration forms: Days/weeks to Autonomous background processing

How we built it

We designed Arlis as a hybrid serverless + worker architecture optimized for both real-time interaction and long-running autonomous execution.

The Brain: Gemini 3 Pro handles complex reasoning and multi-step planning, while Gemini Flash provides near-instant chat responses. Gemini Vision acts as the eyes for document analysis.

The Hands: A dual-provider browser system using Browserbase for stealth cloud sessions (to bypass anti-bot measures) and Stagehand AI for intelligent element detection.

The Memory: A sophisticated Long-Term Memory System that uses vector embeddings to remember user details across sessions. If you told Arlis your SSN for a SNAP application, it won't ask for it again when you apply for a DMV renewal.

Key Technical Innovations

1. Control Center Orchestrator Built a production-grade orchestration system that manages autonomous agent execution:

  • AgentOrchestrator: Manages all active agent sessions with lifecycle control (start, pause, resume, cancel)
  • AgentSession: Encapsulates individual execution contexts with state management
  • Queue Polling System: Worker polls database every 5 seconds for pending jobs, prioritized by urgency
  • Signal-Based Control: Real-time pause/resume/cancel via database signals
  • Graceful Cleanup: Ensures browser sessions and resources are properly released
  • Concurrent Execution: Handles multiple applications simultaneously without conflicts

2. Long-Term Memory System Built a sophisticated memory architecture that gives Arlis true context awareness:

  • Hybrid Search: Combines vector embeddings (semantic) + BM25 (keyword) + RRF (Reciprocal Rank Fusion) for optimal recall
  • Memory Categories: Organizes memories into preferences, facts, learnings, and mistakes
  • Confidence Scoring: Each memory has confidence and importance weights
  • Cross-Session Persistence: Remembers user details across applications (never ask for SSN twice)
  • Document Integration: Searches both explicit memories AND vault documents in one query
  • Automatic Embedding: Uses Gemini embeddings for semantic similarity matching
  • reflect Tool: Agent can explicitly learn (store) and recall (search) memories during execution

3. Modular Skills Architecture Designed a plugin-based skill system where each skill is a self-contained, reusable capability:

Skill Interface Design: Every skill extends the ArlisSkill abstract class with:

  • config: Name and description for AI tool registration
  • inputSchema: Zod schema for type-safe validation
  • execute(): Core implementation logic
  • getTools(): Optional method to expose as AI SDK tools

This architecture allows:

  • Composability: Skills can call other skills (e.g., policyReader uses pdfWizard)
  • Testability: Each skill is independently testable with mock inputs
  • Extensibility: New skills can be added without modifying core agent logic
  • Type Safety: Full TypeScript inference from input schema to output

4. Checkpoint-Based Recovery Implemented a sophisticated checkpoint system that:

  • Saves execution state after every major step
  • Enables pause/resume functionality
  • Allows recovery from failures without starting over
  • Provides audit trail for compliance

5. Real-Time Progress Broadcasting Supabase Realtime channels for live updates:

  • Agent heartbeat (every 5 seconds)
  • Progress updates (step-by-step)
  • Screenshot previews
  • Completion notifications

6. Memory-Augmented Context Long-term memory system that:

  • Remembers user preferences across sessions
  • Recalls facts from previous conversations
  • Eliminates redundant data entry
  • Builds comprehensive user profile over time

Challenges we ran into

1. The "Blind User" Problem

Challenge: Users had no visibility into what the agent was doing during autonomous execution. They'd submit a form and wait in darkness for 20 minutes.

Solution: Built a real-time progress broadcasting system using Supabase Realtime. Now users see:

  • Live step-by-step updates
  • Screenshot previews of the browser
  • Estimated time remaining
  • Ability to pause/resume execution

2. Browser Detection & Anti-Bot Measures

Challenge: Government portals use aggressive anti-bot detection (Cloudflare, reCAPTCHA, fingerprinting) that blocked our automation.

Solution:

  • Switched to Browserbase for cloud-based stealth browsers
  • Implemented anti-crawler measures (user agent rotation, viewport randomization)
  • Added human-like delays and mouse movements
  • Used Stagehand AI for intelligent element detection

3. Long-Running Execution on Serverless

Challenge: Vercel serverless functions have a 60-second timeout (300s on Pro), but complex forms take 10-20 minutes to complete.

Solution: Hybrid architecture:

  • Vercel: Handles user-facing chat and API routes
  • Worker Service: Persistent Node.js process on Cloud Instance for long-running tasks
  • Checkpoint System: Allows execution to pause and resume across multiple invocations

4. Context Window Management

Challenge: Government forms can reference 50+ page policy documents. Even with Gemini's 2M token context window, we needed smart context management.

Solution:

  • Implemented semantic chunking for large documents
  • Built a "Life Vault" that stores extracted data, not raw documents
  • Used embeddings for relevant section retrieval
  • Cached frequently accessed policy interpretations

5. Multi-Language Legal Translation

Challenge: Translating complex legal jargon (e.g., "adjusted gross income") into 35+ languages while maintaining legal accuracy.

Solution:

  • Used Gemini's native multilingual capabilities
  • Built a glossary of legal terms with verified translations
  • Implemented back-translation validation
  • Added human-in-the-loop review for critical fields

6. Vercel Cron Job Limitations

Challenge: Vercel Hobby plan only allows 1 cron job per day, but we needed to poll for pending applications every 15 minutes.

Solution: Deployed worker service to Cloud Instance:

  • Runs persistent Node.js process
  • Polls database every 15 minutes
  • Zero cost for production workload

7. Form Field Ambiguity

Challenge: Government forms often have ambiguous field labels (e.g., "Income" could mean gross, net, adjusted, annualized).

Solution:

  • Built a "Forensic Audit" skill that cross-references source documents
  • Implemented policy-aware field interpretation
  • Added confidence scores to every field value
  • Flags ambiguous fields for human review

Accomplishments that we're proud of

Technical Achievements

1. 220,000+ Lines of Production-Quality Code in 3 Weeks

  • 100% TypeScript with strict type safety
  • Modular architecture with clear separation of concerns
  • 300+ commits with atomic, well-documented changes
  • Zero technical debt—built for scale from day one

2. True Autonomous Execution

  • First AI agent that can complete 20-field government forms end-to-end without human intervention
  • Self-correcting error handling (retries, alternative strategies)
  • Checkpoint-based recovery system
  • Real-time progress transparency

3. Multi-Modal AI Integration

  • Seamlessly combines Vision (document analysis), Text (reasoning), and Audio (voice interface)
  • Handles crumpled letters and blurry scans that break traditional OCR
  • Extracts structured data from unstructured documents with 95%+ accuracy

4. Production-Ready Architecture

  • Hybrid serverless + worker design for optimal cost and performance
  • Dual browser provider system (cloud + local)
  • Real-time progress broadcasting
  • Comprehensive audit logging for compliance

Impact Achievements

1. Democratizing Access to Government Services

  • Built a system that gives every citizen the "bureaucracy entourage" only the wealthy can afford
  • Eliminates language barriers with 35+ language support
  • Makes government services accessible to people with disabilities

2. Solving the $60 Billion Problem

  • Directly addresses the unclaimed benefits crisis
  • Reduces application time from 20+ hours to 60 seconds
  • Prevents rejections due to missing documents or incorrect fields

3. Real-World Validation

  • Successfully tested on actual government forms (SNAP, DMV, IRS)
  • Handles complex conditional logic and multi-step workflows
  • Maintains legal compliance and audit trails

Innovation Achievements

1. "Life Vault" Concept

  • Revolutionary approach to identity document management
  • Secure, encrypted, centralized storage
  • Never hunt for a birth certificate again

2. "Jargon Shield" Translation

  • Transforms complex legal language into plain English in real-time
  • Context-aware explanations (e.g., "This is asking for Line 5 from your W-2")
  • Empowers users instead of confusing them

3. Proactive Advocacy

  • Arlis doesn't wait for you to ask—it monitors policy changes
  • Notifies you when you become eligible for new benefits
  • Pre-fills applications automatically

What we learned

Technical Learnings

1. Agentic AI is Ready for Production

  • Gemini 3 Pro's reasoning capabilities are genuinely impressive
  • Tool orchestration works reliably with proper error handling
  • The key is transparency—users trust agents they can see working

2. Browser Automation is Still Hard

  • Anti-bot measures are sophisticated and constantly evolving
  • Cloud browsers (Browserbase) are essential for stealth
  • Human-like behavior patterns matter more than perfect selectors

3. Serverless Has Limits

  • Long-running tasks need persistent workers
  • Hybrid architectures are the sweet spot

4. Real-Time UX is Critical

  • Users won't trust a "black box" agent
  • Live progress updates increase confidence dramatically
  • Screenshot previews are worth the bandwidth cost

Product Learnings

1. The Problem is Bigger Than We Thought

  • 162M Americans file taxes annually
  • 82M rely on Medicaid
  • 42M depend on SNAP
  • 33M small businesses face compliance costs
  • Nearly everyone interacts with government bureaucracy

2. It's Not About the Form—It's About the Outcome

  • Users don't want to "fill forms faster"
  • They want to "never worry about bureaucracy again"
  • The real value is peace of mind, not speed

3. Accessibility is a Superpower

  • Multi-language support isn't a feature—it's a moral imperative
  • Voice interface opens doors for elderly and disabled users
  • Plain language translation levels the playing field

Execution Learnings

1. Build for Scale from Day One

  • Modular architecture pays off immediately
  • TypeScript strict mode catches bugs before they ship
  • Comprehensive logging is essential for debugging agentic systems

2. User Trust is Earned, Not Assumed

  • Transparency builds trust (show the work)
  • Audit trails provide accountability
  • Human-in-the-loop for critical decisions

3. Hackathons Demand Ruthless Prioritization

  • Focus on the "visible magic" first
  • Polish the demo flow obsessively
  • Technical debt is acceptable if it ships the vision

What's next for Arlis AI - The Bureaucracy Assassin

Immediate Roadmap (Next 3 Months)

1. Expand Portal Coverage

  • Federal: IRS, SSA, USCIS, VA, HUD, Medicare
  • State: DMV (all 50 states), unemployment offices, state benefits
  • Local: Business licenses, building permits, court filings

2. Voice Agent for Phone Calls

  • Integrate Gemini Audio for real-time voice conversations
  • Call government agencies and navigate phone trees
  • Wait on hold so users don't have to
  • Patch users in when human interaction is required

3. Mobile App

  • Native iOS/Android apps for document capture
  • Push notifications for status updates
  • Offline mode for document storage
  • Biometric authentication for security

4. Enterprise Features

  • Multi-user accounts for families
  • Caseworker dashboard for social workers
  • Bulk processing for non-profits
  • White-label solution for government agencies

Long-Term Vision (6-12 Months)

1. Proactive Advocacy Engine

  • Monitor policy changes across all government levels
  • Automatically identify new benefits users qualify for
  • Pre-fill applications and notify users
  • Track deadlines and send reminders

2. "Forensic Auditor" for Government Math

  • Verify benefit calculations for accuracy
  • Identify underpayments and fight for corrections
  • Detect fraud or errors in government systems
  • Generate appeals automatically

3. Community Knowledge Graph

  • Anonymized success patterns across users
  • "Users like you succeeded by doing X"
  • Crowdsourced portal navigation strategies
  • Real-time portal status (e.g., "IRS portal is down")

4. Integration Ecosystem

  • Connect to tax software (TurboTax, H&R Block)
  • Sync with HR systems (ADP, Gusto)
  • Import from banks (Plaid integration)
  • Export to legal services (LegalZoom, Rocket Lawyer)

🌍 Moonshot Goals (12+ Months)

1. Government Partnership Program

  • Work with agencies to simplify forms at the source
  • Provide accessibility overlays for government websites
  • Offer Arlis as a free service for low-income citizens
  • Reduce administrative burden on government staff

2. International Expansion

  • Start with Canada, UK, Australia (English-speaking)
  • Expand to EU (GDPR compliance)
  • Adapt to developing nations (mobile-first)
  • Partner with NGOs for refugee/immigrant support

3. Policy Advocacy

  • Use aggregated data to identify systemic issues
  • Advocate for form simplification
  • Push for standardized data formats
  • Champion digital-first government services

4. The "Bureaucracy Entourage" Platform

  • Expand beyond government to insurance, healthcare, legal
  • Become the universal advocate for complex paperwork
  • Partner with lawyers, accountants, social workers
  • Build the "operating system for life admin"

Closing Statement

Arlis isn't just a hackathon project—it's a movement to democratize access to government services.

We've proven that AI can be more than a chatbot. It can be a tireless advocate that fights bureaucracy so you don't have to.

The $60 billion problem is solvable. The technology exists. The time is now.

Join us in assassinating bureaucracy.


Built with Gemini 3 Jan 21 - Feb 9, 2026 | 3-Week Sprint | 220k+ Lines of Code

Built With

Share this project:

Updates