Inspiration
Agent payment frameworks now hand AI agents real wallets, but they cap spending per session only. A fleet of agents can each stay inside its own session limit and still drain a company budget overnight, and afterward no one can answer the two questions finance actually asks: how much did our agents spend, and on exactly what? The per-session limit is the easy part; the thing nobody built is the org-wide, cross-region, audit-grade ledger that makes the limit unbreakable and every dollar traceable. That ledger is Stub.
What it does
Stub is a strongly-consistent, double-entry spend ledger that sits between a company's AI agents and the money they spend (x402 micropayments, paid APIs, LLM tokens). It enforces one company-wide budget that cannot be overspent and produces an immutable, queryable audit trail.
- One budget across the whole fleet: caps cascade org → team → agent; each spend is bound by the tightest cap up the hierarchy, enforced inside a single database transaction.
- A spend that would breach the budget fails the database transaction. Under concurrent cross-region writes, Aurora DSQL's optimistic concurrency control returns SQLSTATE 40001; Stub retries against the fresh balance and either commits or records a denial. The balance never goes negative. There is no overspend window.
- Audit-grade books: every spend is an immutable, hash-chained double-entry line with the full payment receipt captured as JSON and attributed to a user, intent, agent, and session.
- Policy + safety: per-transaction and rolling-window caps, vendor allow/blocklists, approval thresholds, a velocity circuit-breaker that auto-freezes runaway agents, and a fleet-wide kill-switch.
- Answers in plain English: ask "how much did marketing's agents spend on data APIs?" and the model fills a constrained, parameterized query over the ledger (never raw SQL).
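The org → team → agent cascade described above can be sketched as a walk from the agent up to the root, keeping the smallest remaining headroom along the way. This is a minimal illustration, not Stub's actual code: the BudgetNode shape, the field names, and the cents-based amounts are all assumptions made for the example.

```typescript
// Hypothetical node shape; names and units are illustrative assumptions.
interface BudgetNode {
  id: string;
  parent: string | null; // null marks the org root
  capCents: number;      // cap configured at this level
  spentCents: number;    // rolled-up spend of this node and its descendants
}

// Walk agent -> team -> org and return the tightest remaining headroom.
function headroomCents(nodes: Map<string, BudgetNode>, agentId: string): number {
  let tightest = Infinity;
  let cursor: string | null = agentId;
  while (cursor !== null) {
    const node: BudgetNode | undefined = nodes.get(cursor);
    if (!node) throw new Error("unknown budget node: " + cursor);
    tightest = Math.min(tightest, node.capCents - node.spentCents);
    cursor = node.parent;
  }
  return tightest;
}

// A spend is allowed only if it fits under every cap on the path to the root.
function canSpend(nodes: Map<string, BudgetNode>, agentId: string, amountCents: number): boolean {
  return amountCents > 0 && amountCents <= headroomCents(nodes, agentId);
}

const nodes = new Map<string, BudgetNode>([
  ["org", { id: "org", parent: null, capCents: 100000, spentCents: 99500 }],
  ["team", { id: "team", parent: "org", capCents: 50000, spentCents: 10000 }],
  ["agent", { id: "agent", parent: "team", capCents: 5000, spentCents: 1000 }],
]);

// The agent and team both have room, but the org has only 500 cents left.
const agentHeadroom = headroomCents(nodes, "agent");
const fits = canSpend(nodes, "agent", 500);
const tooMuch = canSpend(nodes, "agent", 501);
```

In this example the org-level cap is the binding one even though the agent's own cap has plenty of room, which is the situation per-session limits alone cannot catch. The check here is a pure function; in Stub the same decision is made inside the database transaction so that it cannot race.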
How we built it
A pure, dependency-free domain core (core/) implements the double-entry ledger, the hierarchy
walk, policy evaluation, and the hash chain. A thin Aurora DSQL adapter (db/) implements a generic
Store interface over the cluster using the pg driver and the Aurora DSQL Node connector (which
generates IAM auth tokens automatically). The same spend() logic runs against an in-memory OCC
model for offline tests and against live DSQL in production. Swap the adapter and the core is
untouched. The frontend is a Next.js + Tailwind mission-control dashboard deployed on Vercel; the
natural-language query uses OpenAI function-calling with a deterministic offline parser as fallback.
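To make the hash-chained double-entry idea concrete, here is a small sketch of how such a chain could be built and verified. It is an illustration under stated assumptions rather than the code in core/: the LedgerLine shape, the account names, the sign convention (debit positive, credit negative), and the choice of SHA-256 over a JSON-serialized payload are all assumed for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical line shape; Stub's real schema is not shown in this writeup.
interface LedgerLine {
  seq: number;
  account: string;
  amountCents: number; // assumed convention: debit positive, credit negative
  receipt: string;     // payment receipt, already serialized as JSON text
  prevHash: string;
  hash: string;
}

const GENESIS = "0".repeat(64);

// Each hash covers the previous hash, so editing any line breaks every later link.
function hashLine(prevHash: string, seq: number, account: string,
                  amountCents: number, receipt: string): string {
  const payload = JSON.stringify([prevHash, seq, account, amountCents, receipt]);
  return createHash("sha256").update(payload).digest("hex");
}

function appendLine(chain: LedgerLine[], account: string,
                    amountCents: number, receipt: string): LedgerLine[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const seq = chain.length;
  const hash = hashLine(prevHash, seq, account, amountCents, receipt);
  return chain.concat([{ seq, account, amountCents, receipt, prevHash, hash }]);
}

// Double entry: one debit and one matching credit, so the lines sum to zero.
function recordSpend(chain: LedgerLine[], expenseAccount: string, budgetAccount: string,
                     amountCents: number, receipt: string): LedgerLine[] {
  const withDebit = appendLine(chain, expenseAccount, amountCents, receipt);
  return appendLine(withDebit, budgetAccount, -amountCents, receipt);
}

function verifyChain(chain: LedgerLine[]): boolean {
  let prev = GENESIS;
  let seq = 0;
  for (const line of chain) {
    const expected = hashLine(prev, seq, line.account, line.amountCents, line.receipt);
    if (line.seq !== seq || line.prevHash !== prev || line.hash !== expected) return false;
    prev = line.hash;
    seq += 1;
  }
  return true;
}

const receipt = JSON.stringify({ vendor: "example-data-api", rail: "x402" });
const chain = recordSpend([], "expense:data-apis", "budget:org", 250, receipt);
const doubleEntrySum = chain.reduce((sum, line) => sum + line.amountCents, 0);
// Changing an amount after the fact makes verification fail.
const tampered = chain.map((line, i) => (i === 0 ? { ...line, amountCents: 1 } : line));
```

Storing the receipt as already-serialized text sidesteps JSON key-ordering differences when the hash is recomputed; a real implementation would need some canonical serialization either way.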
Why Aurora DSQL (the technical heart)
Budget enforcement is only as correct as the database's consistency model, so the production database is deliberately not swappable.
- The load-bearing property is active-active, multi-region strong consistency. A writer in us-east-1 and a writer in us-east-2 hitting the same balance resolve to one consistent outcome. No other AWS database offers this: Aurora PostgreSQL Global is single-writer, and DynamoDB global tables are eventually consistent (last-writer-wins → silent overspend during the replication window).
- DSQL has no pessimistic row locks (SELECT … FOR UPDATE). The naive port of a spend gate (lock the balance row, check, write) simply cannot be written. The invariant instead rests on a contended mutable balance row: every spend rolls its ancestors' balances up in the same transaction, so concurrent writers touch a shared row and OCC forces the loser to conflict (40001). A purely append-only design would not be safe under snapshot isolation: two racing inserts touch different rows, neither conflicts, and the budget goes negative. Getting this right is the core engineering insight of the project.
- We use GA features only for the core (strong consistency, OCC/snapshot isolation, ACID transactions, JSON receipt column, the Node connector with automatic IAM tokens). Preview features (CDC → Kinesis for a streaming dashboard) are layered and degrade gracefully to application-level polling.
We prove the invariant on the real cluster: six agents across us-east-1 and us-east-2 race for one
near-empty budget; exactly the spends that fit commit, the rest are denied after real OCC 40001
conflicts, the balance is floored at zero, the double-entry sum is zero, and the hash chain verifies.
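The difference between the append-only design and the contended balance row can be reproduced offline with a toy model of snapshot isolation plus write-write conflict detection. The sketch below is not Stub's in-memory OCC model and says nothing about DSQL internals; the Tx class, the row keys, and the "conflict" result standing in for SQLSTATE 40001 are all hypothetical.

```typescript
// Toy model: snapshot reads, conflicts detected only on rows a transaction writes.
interface Row { value: number; version: number }
type Table = Map<string, Row>;
type Result = "committed" | "denied" | "conflict";

class Tx {
  private table: Table;
  private snapshot: Table = new Map();
  private writes: Map<string, number> = new Map();

  constructor(table: Table) {
    this.table = table;
    // Reads see the table as it was when the transaction began.
    table.forEach((row, key) => this.snapshot.set(key, { ...row }));
  }

  read(key: string): number {
    const own = this.writes.get(key);
    if (own !== undefined) return own;
    const row = this.snapshot.get(key);
    return row ? row.value : 0;
  }

  // Sum every visible row whose key starts with `prefix`.
  sumPrefix(prefix: string): number {
    const keys = new Set<string>(Array.from(this.snapshot.keys()));
    this.writes.forEach((_value, key) => keys.add(key));
    let total = 0;
    keys.forEach((key) => {
      if (key.indexOf(prefix) === 0) total += this.read(key);
    });
    return total;
  }

  write(key: string, value: number): void {
    this.writes.set(key, value);
  }

  // Returns false on a write-write conflict, the analogue of SQLSTATE 40001.
  commit(): boolean {
    const keys = Array.from(this.writes.keys());
    for (const key of keys) {
      const seen = this.snapshot.get(key);
      const now = this.table.get(key);
      if ((seen ? seen.version : 0) !== (now ? now.version : 0)) return false;
    }
    for (const key of keys) {
      const now = this.table.get(key);
      const value = this.writes.get(key) as number;
      this.table.set(key, { value, version: (now ? now.version : 0) + 1 });
    }
    return true;
  }
}

// Design 1: append-only. Check the sum, then insert a brand-new row.
function spendAppendOnly(tx: Tx, id: string, amount: number, cap: number): Result {
  if (tx.sumPrefix("spend:") + amount > cap) return "denied";
  tx.write("spend:" + id, amount);
  return tx.commit() ? "committed" : "conflict";
}

// Design 2: also update a shared balance row, which becomes the collision point.
function spendWithBalance(tx: Tx, id: string, amount: number): Result {
  const balance = tx.read("balance");
  if (amount > balance) return "denied";
  tx.write("spend:" + id, amount);
  tx.write("balance", balance - amount);
  return tx.commit() ? "committed" : "conflict";
}

// Two racing spends of 80 against a cap of 100, append-only: both commit.
const ledgerOnly: Table = new Map();
const a1 = new Tx(ledgerOnly);
const b1 = new Tx(ledgerOnly);
const appendA = spendAppendOnly(a1, "a", 80, 100);
const appendB = spendAppendOnly(b1, "b", 80, 100);
const appendOnlyTotal = new Tx(ledgerOnly).sumPrefix("spend:");

// Same race with a balance row: the second writer conflicts, retries, and is denied.
const withBalance: Table = new Map<string, Row>([["balance", { value: 100, version: 1 }]]);
const a2 = new Tx(withBalance);
const b2 = new Tx(withBalance);
const balanceA = spendWithBalance(a2, "a", 80);
const balanceB = spendWithBalance(b2, "b", 80);
const retryB = spendWithBalance(new Tx(withBalance), "b", 80);
const finalBalance = new Tx(withBalance).read("balance");
```

In the append-only run the two transactions write different keys, so the model sees nothing to conflict on and the total reaches 160 against a cap of 100. With the shared balance row the loser gets a conflict, re-reads a balance of 20 on retry, and is denied.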
What we learned
The whole project turned on one fact about Aurora DSQL: it has no pessimistic row locks. The
textbook spend gate (SELECT … FOR UPDATE, check, write) cannot even be written. Worse, the "clean"
append-only ledger is unsafe under snapshot isolation: two concurrent spends insert different rows,
OCC sees no conflict, both commit, and the budget goes negative. The fix is counterintuitive: keep a
mutable balance row that every spend updates in the same transaction, so racing writers collide on
one row and OCC forces the loser to a 40001 serialization failure. We learned to treat 40001 as
normal control flow rather than an error: retry against the fresh balance, then deny if there is
genuinely no budget left. We also learned to keep the core on GA features only, with preview
features (CDC to Kinesis) as a degradable layer so the guarantee never depends on preview.
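Treating 40001 as control flow amounts to a small retry loop around the whole transaction. The sketch below is one way to write it, not Stub's actual spend() implementation: spendWithRetry, the attempt callback, and the attempt limit are hypothetical, and the loop is synchronous for brevity, whereas against the pg driver the callback would return a Promise and the loop would await it. The one detail taken from pg itself is that it exposes the SQLSTATE on the thrown error's code property.

```typescript
// pg exposes the SQLSTATE of a failed statement on the error's `code` property.
function isSerializationFailure(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { code?: unknown }).code === "40001";
}

type Outcome =
  | { status: "committed"; attempts: number }
  | { status: "denied"; attempts: number; reason: string };

// `attempt` (hypothetical) runs one full transaction against a fresh snapshot:
// returns true on commit, false if the balance cannot cover the spend,
// and throws an error carrying code "40001" when it loses an OCC race.
function spendWithRetry(attempt: () => boolean, maxAttempts: number = 5): Outcome {
  for (let n = 1; n <= maxAttempts; n++) {
    try {
      return attempt()
        ? { status: "committed", attempts: n }
        : { status: "denied", attempts: n, reason: "insufficient budget" };
    } catch (err) {
      // Anything other than a serialization failure is a real error.
      if (!isSerializationFailure(err)) throw err;
      // 40001: another writer won; loop and re-read the fresh balance.
    }
  }
  return { status: "denied", attempts: maxAttempts, reason: "too much contention" };
}

// Simulated race: on the first attempt a competitor commits first and we conflict.
function makeAttempt(state: { balance: number }, amount: number, competitor: number): () => boolean {
  let first = true;
  return () => {
    if (first) {
      first = false;
      state.balance -= competitor;
      throw Object.assign(new Error("serialization failure"), { code: "40001" });
    }
    if (amount > state.balance) return false;
    state.balance -= amount;
    return true;
  };
}

const tight = { balance: 100 };
const deniedOutcome = spendWithRetry(makeAttempt(tight, 80, 80));

const roomy = { balance: 100 };
const committedOutcome = spendWithRetry(makeAttempt(roomy, 15, 80));
```

In the first run the retry sees only 20 left and records a denial; in the second the retry still fits and commits, leaving 5. Capping the number of attempts is a design choice made for this sketch so that heavy contention ends in a recorded denial rather than an unbounded loop.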
Challenges we ran into
Designing the ledger so the overspend guarantee survives DSQL's lock-free OCC model (reconciling an "immutable, append-only" audit log with the need for a contended serialization point) was the hard part, and the easiest to get subtly wrong.
Accomplishments we're proud of
A budget that the database itself refuses to let you break, proven live across two regions, on top of an audit trail that can answer a CFO's question in plain English.
What's next for Stub
Native CDC → Kinesis streaming for the live dashboard and anomaly breaker, accounting exports (QuickBooks/NetSuite), RBAC and multi-tenancy, and adapters for additional payment rails (Stripe, AP2) so one budget spans every rail.
Built With
- agentcore
- amazon-aurora-dsql
- amazon-web-services
- aurora
- dsql
- next.js
- node.js
- npm
- openai
- pg
- postgresql
- react
- tailwind-css
- typescript
- vercel
- x402