LedgerLive: Real-Time Financial Close with Gemini Agents

The Problem: Month-End Close is Broken

Almost every company with more than 10 employees runs a month-end financial close. Finance teams spend 10 days every month on manual reconciliation work: matching invoices to GL entries, triaging exceptions, chasing approvals, and posting journal entries. During this window, errors cost companies an average of $300,000 per incident. The work is repetitive, high-stakes, and exactly the kind of task that should be automated, but existing finance tools only surface data. They don't act on it.

The Inspiration

We wanted to build an agent that doesn't just show a CFO their exceptions — it resolves them. The insight was that most month-end exceptions fall into two categories: low-risk items (timing differences, rounding) that can be resolved automatically with a journal entry, and high-risk items (duplicate payments, fraud flags, missing documents) that genuinely need a human decision. The split is predictable. The routing is automatable. The human should only touch what actually requires judgment.

Airia's platform made this vision buildable in a day. The combination of multi-step agent orchestration, native Human-in-the-Loop approval with email notification, and the ability to call external APIs as first-class agent steps gave us everything we needed.

What We Built

LedgerLive is an Active Agent on Airia that autonomously orchestrates the month-end financial close process across two live systems.

The agent runs a continuous Perceive → Decide → Act loop:

PERCEIVE — Two parallel HTTP steps fetch live exception data and KPI metrics from a FastAPI backend deployed on Railway. Real data, every run.
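
On Airia the two HTTP steps are configured in the platform rather than written by hand. The plain-Python sketch below only illustrates the same fan-out idea. The endpoint paths come from the architecture; BASE_URL is a placeholder, and fetch_json and perceive are hypothetical helper names, not part of LedgerLive.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

BASE_URL = "https://example.invalid"  # placeholder, not the real Railway URL


def fetch_json(path, base_url=BASE_URL, timeout=10):
    """GET base_url + path and decode the JSON response body."""
    with urlopen(base_url + path, timeout=timeout) as resp:
        return json.load(resp)


def perceive(fetch=fetch_json):
    """Issue both GET requests concurrently and return the results together."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        exceptions = pool.submit(fetch, "/api/exceptions")
        kpis = pool.submit(fetch, "/api/kpi")
        return {"exceptions": exceptions.result(), "kpis": kpis.result()}
```

Passing the fetch function in as a parameter keeps the sketch testable without a live backend.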

DECIDE — GPT-4.1 (via Airia) analyzes every exception and classifies it:

LOW_RISK: variance < $10,000 AND type is timing_difference or rounding → auto-resolve
HIGH_RISK: variance ≥ $10,000 OR type is fraud_flag, missing_document, duplicate_payment → escalate
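
In LedgerLive the model performs this classification. A deterministic version of the same rules can serve as a cross-check, and a minimal sketch follows. The field names "variance" and "type" are assumptions about the exception payload. Treating the variance as an absolute value, and escalating anything that matches neither rule (for example an unlisted type under $10,000), are also assumptions, chosen so the sketch fails safe.

```python
LOW_RISK_TYPES = {"timing_difference", "rounding"}
THRESHOLD = 10_000  # dollars


def classify(exception):
    """Return 'LOW_RISK' only when both low-risk conditions hold."""
    variance = abs(exception["variance"])  # assumption: the sign is irrelevant
    kind = exception["type"]
    if variance < THRESHOLD and kind in LOW_RISK_TYPES:
        return "LOW_RISK"
    # Covers variance >= $10,000, fraud_flag, missing_document and
    # duplicate_payment, plus (by assumption) any unrecognized type.
    return "HIGH_RISK"
```

For example, classify({"variance": 250.0, "type": "rounding"}) returns "LOW_RISK", while the same item with a variance of exactly 10,000 is escalated.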

A Python-based Conditional Branch routes the flow based on the classification GPT-4.1 returns.

ACT — Low-risk exceptions get journal entries generated and posted automatically back to the system via HTTP POST. No human needed.

HITL — High-risk exceptions trigger Airia's Human Approval workflow. The CFO receives an email with full exception context, risk classification, and the proposed resolution — and approves or denies. The agent waits, then continues based on the decision.

The Architecture

Input
  ↓
[FetchExceptions (GET /api/exceptions)] + [FetchKPIs (GET /api/kpi)]  ← parallel
  ↓
CloseOrchestrator (GPT-4.1 via Airia)
  ↓
RiskRouter (Python Conditional Branch)
  ↓
[HIGH RISK → CFOReviewGate (Human Approval) → PostJournalEntry]
[LOW RISK  → PostJournalEntry directly]
  ↓
Output
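
The flow above can be approximated in plain Python as follows. This is a sketch, not the Airia configuration: every step is passed in as a callable, the item shape (id, risk, journal_entry) is hypothetical, and skipping the post when the CFO denies is an assumption about what "continues based on the decision" means.

```python
def run_close(fetch_exceptions, fetch_kpis, orchestrate, request_approval, post_entry):
    """Run one pass of the close: fetch, classify, gate high-risk items, post."""
    exceptions = fetch_exceptions()
    kpis = fetch_kpis()
    results = []
    for item in orchestrate(exceptions, kpis):
        # High-risk items wait on the human gate; low-risk items skip it.
        if item["risk"] == "HIGH_RISK" and not request_approval(item):
            results.append({"id": item["id"], "status": "denied"})
            continue
        post_entry(item["journal_entry"])
        results.append({"id": item["id"], "status": "posted"})
    return results
```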

The frontend is React 18 + Vite + Tailwind with a dark F1 racing theme — because a month-end close is like a pit stop: every second counts, every person has a role, and real-time telemetry keeps everyone aligned.

What We Learned

Building on Airia taught us that the hardest part of a multi-step agent system isn't the AI; it's the routing logic between steps. Conditional branching based on structured JSON output from a model requires careful contract design between the prompt and the downstream Python. We iterated through several versions of the system prompt before GPT-4.1 consistently returned the exact JSON shape the conditional branch expected.
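
One way to make such a contract explicit is to validate the model output before any routing happens. The sketch below assumes a decision object with exception_id, risk and reason keys; those key names and the validate_decision helper are illustrative, not the actual LedgerLive schema.

```python
import json

ALLOWED_RISKS = {"LOW_RISK", "HIGH_RISK"}
REQUIRED_KEYS = {"exception_id", "risk", "reason"}  # hypothetical contract


def validate_decision(raw):
    """Parse model output and raise ValueError if it breaks the contract."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if data["risk"] not in ALLOWED_RISKS:
        raise ValueError(f"unknown risk label: {data['risk']!r}")
    return data
```

Failing loudly at this boundary makes prompt regressions visible immediately instead of surfacing later as a misrouted exception.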

We also learned that Human-in-the-Loop is not a fallback — it's the feature. The most impressive moment in the demo isn't the auto-resolve; it's the pause. The agent stops, sends an email, and waits for a human decision before proceeding. That moment is when the system feels genuinely enterprise-grade rather than a toy.

Challenges

Airia's step output format. The Python conditional branch receives step outputs in a specific internal format, not a plain dictionary. We spent significant time debugging the key paths (steps.get("CloseOrchestrator") versus lookup by step ID) before landing on a string-parsing approach that is robust to every output shape we encountered.
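
A string-parsing router along these lines might look like the sketch below. It is not the code we shipped, and it makes no claim about Airia's internal step format: it flattens whatever it receives to text and searches for the risk label. Defaulting to HIGH_RISK when the label is missing or ambiguous is an assumption, chosen so that parsing failures escalate to a human.

```python
import json
import re


def extract_risk(step_output):
    """Return 'LOW_RISK' or 'HIGH_RISK' from a dict, JSON string, or free text."""
    if isinstance(step_output, str):
        text = step_output
    else:
        # Flatten nested containers; default=str covers non-JSON objects.
        text = json.dumps(step_output, default=str)
    labels = set(re.findall(r"\b(HIGH_RISK|LOW_RISK)\b", text))
    if labels == {"LOW_RISK"}:
        return "LOW_RISK"
    # Missing label, HIGH_RISK, or both labels present: escalate.
    return "HIGH_RISK"
```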

Structured Output schema validation. GPT-4.1 rejected the JSON schema auto-generated by Airia's Structured Output toggle because of an incompatible "name" keyword. Disabling Structured Output and enforcing the schema directly through the prompt solved this cleanly.
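
Enforcing a schema through the prompt can be as simple as embedding an example object in the system prompt. The sketch below illustrates the idea; the wording, the example fields and the build_system_prompt helper are assumptions, not our production prompt.

```python
import json

# Hypothetical example of the shape the downstream branch expects.
DECISION_EXAMPLE = {
    "exception_id": "EX-001",
    "risk": "LOW_RISK",
    "reason": "rounding difference",
}


def build_system_prompt(example=DECISION_EXAMPLE):
    """Build a system prompt that pins the model to one JSON shape."""
    return (
        "Classify each exception as LOW_RISK or HIGH_RISK.\n"
        "Respond with only a JSON object in exactly this shape, no prose and no code fences:\n"
        + json.dumps(example, indent=2)
    )
```

Generating the prompt from the same example object that the tests use keeps the prompt and the downstream parser from drifting apart.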

Connecting two live systems. The Airia agent calls the LedgerLive FastAPI backend to fetch data, and the FastAPI backend calls the Airia agent to process it. Getting the bidirectional integration working on Railway — with correct CORS, env vars, and timeout handling — required careful deployment sequencing.
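
The backend-to-agent half of that integration can be sketched with the standard library as below. The environment variable names, the X-API-Key header and both helper names are hypothetical; check Airia's API documentation for the real authentication scheme. The point is only that the URL, key and timeout come from the environment and that the timeout is always passed explicitly.

```python
import json
import os
from urllib.request import Request, urlopen


def load_config(env=os.environ):
    """Read agent settings from the environment (variable names are hypothetical)."""
    return {
        "agent_url": env["AIRIA_AGENT_URL"],
        "api_key": env["AIRIA_API_KEY"],
        "timeout": float(env.get("AIRIA_TIMEOUT_SECONDS", "30")),
    }


def call_agent(payload, config, opener=urlopen):
    """POST a JSON payload to the agent with an explicit timeout."""
    request = Request(
        config["agent_url"],
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": config["api_key"],  # assumed header name
        },
        method="POST",
    )
    with opener(request, timeout=config["timeout"]) as resp:
        return json.load(resp)
```

A missing AIRIA_AGENT_URL or AIRIA_API_KEY raises KeyError at startup, which surfaces a misconfigured deployment before the first request is made.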