Inspiration
By 2026, agentic commerce had shipped: ChatGPT Instant Checkout, Mastercard Agent Pay, Visa Intelligent Commerce, Stripe's machine-payments work. Everyone is racing to solve agent identity and payment credentials. Almost nobody is solving the load-bearing problem underneath: merchant backends were never built for software that buys at machine speed, with concurrent, retry-heavy, multi-region traffic. At that speed they oversell scarce stock, double-charge when a timeout triggers a retry, and silently blow past spend limits.
We wanted to build the correctness layer the agentic-commerce wave actually runs on — and to build it on the one database whose model is genuinely made for it: Amazon Aurora DSQL.
What it does
ZeroRace is a checkout kernel: one idempotent ACID transaction that, atomically, claims inventory, debits a spend mandate, writes a balanced double-entry ledger, and emits a transactional outbox event. Under a 1,000,000-agent swarm against 100 units it sells exactly 100, and four invariants hold — each provable with a single SQL query, not a dashboard counter:
$$\text{oversells} = \text{duplicate settlements} = \text{ledger drift} = \text{mandate breaches} = 0$$
The no-oversell guarantee is structural. Stock is sharded across random-keyed buckets and claimed with a conditional decrement; the column carries CHECK (available_qty >= 0), so the database refuses to take it negative and unit 101 of 100 is impossible. Ledger drift is provably zero because every commit writes two signed legs that sum to zero:
$$\sum_{i} \text{signed amount}_i = 0$$
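To make the conditional decrement concrete, here is a minimal sketch of a bucket claim. It is an illustration under assumptions, not our production code: the table `inventory_buckets`, the columns `sku` and `bucket_id`, the probing order, and the `Queryable` interface are hypothetical (only `available_qty` and its CHECK come from the description above). `Queryable` is a structural stand-in for the part of a node-postgres client the sketch needs.

```typescript
// Structural stand-in for the slice of a node-postgres client used here.
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rowCount: number | null }>;
}

// Hypothetical schema:
//   inventory_buckets(sku, bucket_id, available_qty CHECK (available_qty >= 0))
// The WHERE guard makes the decrement conditional: if the bucket cannot cover
// the quantity, zero rows are updated and nothing is taken.
const CLAIM_SQL = `
  UPDATE inventory_buckets
     SET available_qty = available_qty - $3
   WHERE sku = $1 AND bucket_id = $2 AND available_qty >= $3`;

// Start at a random bucket and probe each bucket once (an assumed strategy).
// Returns the bucket that satisfied the claim, or null if every bucket refused.
async function claimInventory(
  db: Queryable,
  sku: string,
  qty: number,
  bucketCount = 64,
): Promise<number | null> {
  const start = Math.floor(Math.random() * bucketCount);
  for (let k = 0; k < bucketCount; k++) {
    const bucketId = (start + k) % bucketCount;
    const res = await db.query(CLAIM_SQL, [sku, bucketId, qty]);
    if (res.rowCount === 1) return bucketId; // the row count is the decision
  }
  return null; // sold out everywhere
}
```

The row count returned by the UPDATE is the whole decision: there is no separate read of the stock level that could go stale between check and write.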
And it isn't only a demo — it's a real, self-serve product: sign up, get an API key, POST /v1/purchases/commit, with a tenant-scoped console to create your own merchants, mandates, and inventory. You can drive the kernel yourself in a live playground and watch which constraint binds while the invariants stay at zero.
How we built it
- Truth core: Aurora DSQL. PostgreSQL-16-compatible, optimistic concurrency control (OCC) with snapshot isolation. The kernel uses raw node-postgres for explicit BEGIN/COMMIT/ROLLBACK and conditional UPDATE … WHERE statements with rowCount checks; the row count is the no-oversell and mandate logic.
- Sharded everything. Inventory pre-split across 64 random buckets; the spend mandate split across 20 budget buckets. No hot row anywhere, which is exactly what DSQL asks for.
- Idempotent + double-entry. Keyed on (merchant_id, idempotency_key) under a UNIQUE index; every commit writes a MANDATE_RESERVED -amount and MERCHANT_PAYABLE +amount ledger pair.
- Read plane — DynamoDB. A transactional outbox row is written in the same DSQL txn; a separate projector ships it to DynamoDB exactly-once, feeding the live Mission Control dashboard over SSE.
- Multi-region. A real peered cluster — Tokyo ap-northeast-1 + Seoul ap-northeast-2, witness Osaka — two strongly-consistent write endpoints with app-layer failover.
- Deployed on Vercel via the AWS Marketplace integration using OIDC federation — no static credentials; DSQL IAM auth tokens are minted from a Vercel-issued OIDC token at runtime.
- A real agent. An MCP server + the Anthropic SDK (claude-opus-4-8) shops a limited drop and is declined at the 9th buy by the same kernel.
- The product layer. tenants + api_keys, Bearer auth, tenant-scoped /v1 resource APIs, a Next.js console, and a self-resetting public sandbox — all additive on the same schema.
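The kernel transaction described in the list above can be sketched roughly as follows. This is a simplified illustration, not the shipped kernel: every table and column name, the `Purchase` shape, the `Outcome` labels other than RETRY_EXHAUSTED, and the retry count are assumptions. It relies on two standard PostgreSQL SQLSTATE codes surfaced on the error's `code` field: 23505 (unique_violation) for an idempotency replay and 40001 (serialization_failure) for an OCC conflict. `TxClient` stands in for a single checked-out node-postgres client, so every statement runs on the same connection.

```typescript
// Structural stand-in for one checked-out node-postgres client (one session).
interface TxClient {
  query(text: string, values?: unknown[]): Promise<{ rowCount: number | null }>;
}

type Outcome =
  | "COMMITTED" | "DUPLICATE" | "SOLD_OUT" | "MANDATE_EXCEEDED" | "RETRY_EXHAUSTED";

interface Purchase {
  merchantId: string; idempotencyKey: string; sku: string; inventoryBucket: number;
  mandateId: string; budgetBucket: number; amount: number;
}

const sqlState = (e: unknown): string | undefined =>
  typeof e === "object" && e !== null ? (e as { code?: string }).code : undefined;

// All table and column names below are hypothetical.
async function runKernel(db: TxClient, p: Purchase): Promise<Outcome> {
  const key = [p.merchantId, p.idempotencyKey];
  // 1. Idempotency registry: a replay violates the UNIQUE index (SQLSTATE 23505).
  await db.query(
    `INSERT INTO purchases (merchant_id, idempotency_key, sku, mandate_id, amount)
     VALUES ($1, $2, $3, $4, $5)`,
    [...key, p.sku, p.mandateId, p.amount]);
  // 2. Claim one unit; zero rows updated means this bucket is empty.
  const inv = await db.query(
    `UPDATE inventory_buckets SET available_qty = available_qty - 1
      WHERE sku = $1 AND bucket_id = $2 AND available_qty >= 1`,
    [p.sku, p.inventoryBucket]);
  if (inv.rowCount !== 1) return "SOLD_OUT";
  // 3. Debit the mandate; zero rows updated means the limit would be breached.
  const bud = await db.query(
    `UPDATE mandate_buckets SET remaining = remaining - $3
      WHERE mandate_id = $1 AND bucket_id = $2 AND remaining >= $3`,
    [p.mandateId, p.budgetBucket, p.amount]);
  if (bud.rowCount !== 1) return "MANDATE_EXCEEDED";
  // 4. Two signed ledger legs that sum to zero.
  await db.query(
    `INSERT INTO ledger_entries (merchant_id, idempotency_key, account, signed_amount)
     VALUES ($1, $2, 'MANDATE_RESERVED', $3), ($1, $2, 'MERCHANT_PAYABLE', $4)`,
    [...key, -p.amount, p.amount]);
  // 5. Outbox event, written in the same transaction.
  await db.query(
    `INSERT INTO outbox (merchant_id, idempotency_key, event_type)
     VALUES ($1, $2, 'PURCHASE_COMMITTED')`,
    key);
  return "COMMITTED";
}

async function commitPurchase(db: TxClient, p: Purchase, maxAttempts = 5): Promise<Outcome> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await db.query("BEGIN");
      const outcome = await runKernel(db, p);
      // Declined outcomes roll back, so a decline leaves no partial writes.
      await db.query(outcome === "COMMITTED" ? "COMMIT" : "ROLLBACK");
      return outcome;
    } catch (e) {
      await db.query("ROLLBACK").catch(() => undefined);
      if (sqlState(e) === "23505") return "DUPLICATE"; // already settled
      if (sqlState(e) !== "40001") throw e;            // not an OCC conflict: fatal
      // 40001: another transaction won the conflict, so retry from BEGIN.
    }
  }
  return "RETRY_EXHAUSTED";
}
```

Because the whole attempt is replayed from BEGIN on a conflict, the retry is safe only as long as every step is inside the transaction, which is the reason the outbox row is written here rather than published directly.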
Challenges we ran into
- The budget hot row. First honest 10k run: 100 sold, but only 2 commits/sec with 303 RETRY_EXHAUSTED. We'd sharded inventory but left every winning commit decrementing one mandate row — under OCC those serialize and conflict-storm. Fix: shard the budget too. Result: 0 exhausted, 0 errors, ~390 OCC retries (surfaced, not hidden).
- DSQL has no TRUNCATE and a per-transaction row limit. Re-seeding between runs blew the cap on a single DELETE. Fix: batched deletes under the limit.
- A pool-exhaustion deadlock our own adversarial review caught — the kernel re-read the idempotency registry on a second pooled connection while holding the first; under same-key contention every slot waited on a connection that never freed. Fixed by reusing the held client.
- An OIDC STS DNS storm (getaddrinfo EBUSY) when a burst opened many connections that each resolved credentials at once. Fixed by memoizing the OIDC credentials process-wide.
- Async indexes are a correctness gate. The UNIQUE idempotency indexes are what make dupes provably zero; DSQL builds indexes asynchronously, so they must be ACTIVE before traffic.
- No FOREIGN KEYs. Referential integrity and tenant isolation (403 cross-tenant) live in the service layer plus periodic audit queries.
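The credential-memoization fix for the DNS storm follows a common single-flight pattern, sketched below. This is an illustration under assumptions rather than our exact code: the `Credentials` shape, the `memoizeCredentials` name, and the 60-second refresh margin are hypothetical, and `resolve` stands in for whatever call actually exchanges the OIDC token for credentials.

```typescript
// Hypothetical credential shape; expiresAt is epoch milliseconds.
interface Credentials {
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken: string;
  expiresAt: number;
}
type Resolver = () => Promise<Credentials>;

// Wraps a resolver so that, process-wide, at most one resolution is in flight
// and fresh credentials are reused until shortly before they expire.
function memoizeCredentials(
  resolve: Resolver,
  refreshSkewMs = 60_000,
  now: () => number = Date.now,
): Resolver {
  let cached: Credentials | null = null;
  let inflight: Promise<Credentials> | null = null;

  return () => {
    // Fast path: cached credentials with enough lifetime left.
    if (cached !== null && cached.expiresAt - now() > refreshSkewMs) {
      return Promise.resolve(cached);
    }
    // Single flight: concurrent callers share the one pending resolution.
    if (inflight === null) {
      inflight = resolve()
        .then((c) => {
          cached = c;
          return c;
        })
        .finally(() => {
          inflight = null; // a failed attempt is not cached, so the next call retries
        });
    }
    return inflight;
  };
}
```

With this wrapper, a burst of new connections triggers one credential resolution (and one DNS lookup) instead of one per connection.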
Accomplishments that we're proud of
- pnpm prove — every invariant reduces to one read-only query over the live cluster, returning oversells / duplicate_settlements / ledger_drift / mandate_breaches = 0. Correctness proven, not claimed — a judge can clone and run it.
- Exactly 100 sold under 1,000,000 agents (0 errors, 585 bounded OCC retries), and a scaling sweep that gets faster under higher concurrency as the sold-out attempts drain in parallel.
- Real multi-region failover mid-swarm — Tokyo to Seoul split 10/90, total exactly 100, 0 oversold, both regions reconciled into one consistent ledger.
- A real Claude agent transacting safely and being declined atomically at its budget edge.
- We turned the kernel into a real, self-serve, multi-tenant product — signup, API keys, console, playground — entirely within the submission window.
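For readers who want a feel for what "every invariant reduces to one read-only query" means, here is a hedged sketch of such a check. It is not the actual `pnpm prove` script: the schema (tables `inventory_buckets`, `purchases`, `ledger_entries`, `mandates` and their columns), the `prove` and `allHold` names, and the `Queryable` interface are all assumptions chosen for illustration. Only the four invariant names come from the write-up.

```typescript
// Structural stand-in for a node-postgres client; only result rows are used.
interface Queryable {
  query(text: string): Promise<{ rows: Array<{ n: string | number }> }>;
}

// Hypothetical schema. Each query returns a single row with one column `n`
// that must equal 0 for the invariant to hold.
const INVARIANT_SQL = {
  oversells: `
    SELECT count(*) AS n
      FROM (SELECT sku, sum(initial_qty) AS stock
              FROM inventory_buckets GROUP BY sku) i
      JOIN (SELECT sku, count(*) AS sold
              FROM purchases GROUP BY sku) p USING (sku)
     WHERE p.sold > i.stock`,
  duplicate_settlements: `
    SELECT count(*) AS n
      FROM (SELECT 1 AS dup FROM purchases
             GROUP BY merchant_id, idempotency_key
            HAVING count(*) > 1) d`,
  ledger_drift: `
    SELECT coalesce(sum(signed_amount), 0) AS n FROM ledger_entries`,
  mandate_breaches: `
    SELECT count(*) AS n
      FROM (SELECT m.mandate_id
              FROM mandates m JOIN purchases p USING (mandate_id)
             GROUP BY m.mandate_id, m.spend_limit
            HAVING sum(p.amount) > m.spend_limit) b`,
} as const;

type Invariant = keyof typeof INVARIANT_SQL;
type Report = Record<Invariant, number>;

// Runs each read-only query and collects the four numbers.
async function prove(db: Queryable): Promise<Report> {
  const report = {} as Report;
  for (const name of Object.keys(INVARIANT_SQL) as Invariant[]) {
    const { rows } = await db.query(INVARIANT_SQL[name]);
    const first = rows[0];
    // Drivers often return count/sum as strings, so normalize to a number.
    report[name] = first === undefined ? NaN : Number(first.n);
  }
  return report;
}

const allHold = (r: Report): boolean => Object.values(r).every((v) => v === 0);
```

A caller would run `prove` against a live connection and fail the build unless `allHold` returns true; a missing row yields NaN, which deliberately fails the check as well.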
What we learned
DSQL doesn't let you paper over contention with locks; it makes you design the contention out. "Don't concentrate writes on a single key" turned out to be the entire architecture — sharded inventory and sharded budget, idempotent retry, double-entry ledger, async-built UNIQUE indexes. We learned to treat correctness as a SELECT, and that strong consistency across active-active regions is a capability you cannot fake with read replicas — the honest answer to "why not single-writer Postgres?"
What's next for ZeroRace
DynamoDB Streams to Lambda for true push real-time; AWS FIS region-impairment chaos in the failover demo; adapters for real payment rails (Stripe / Visa / Mastercard agent APIs); a published SDK; and production multi-tenant hardening — API-key rotation, per-tenant rate limits, and metered billing.
Built With
- amazon
- amazon-web-services
- anthropic
- aurora
- claude
- drizzle
- dsql
- dynamodb
- events
- next.js
- node-postgres
- node.js
- oidc
- orm
- react
- server-sent
- typescript
- vercel