Inspiration

Working with Aurora DSQL's Optimistic Concurrency Control (OCC) presents a unique challenge: you can't test conflict handling in local development. Conflicts only occur when multiple transactions modify the same row simultaneously - something that rarely happens on a developer's laptop but is inevitable in production.

We kept asking ourselves: How do you validate retry logic you can't trigger? How do you test backoff strategies you can't observe?

The documentation explains OCC concepts well, but reading about SQLSTATE 40001 (serialization failure) is very different from watching your code handle 50 concurrent transactions fighting over the same row.

ZeroLock-Studio was born from this gap - the need for a real testing environment where developers can stress-test their DSQL transaction code before it hits production.

What it does

ZeroLock-Studio is an interactive testing environment for Aurora DSQL that lets developers:

  • Stress Test with Chaos Engineering - Inject conflicts, add latency, simulate 50+ concurrent threads hitting the same rows
  • Watch Real-Time Telemetry - See conflicts, retries, and latency percentiles (P50/P95/P99) update live via Server-Sent Events
  • Validate Retry Logic - Backoff heatmaps show if your exponential backoff + jitter is actually spreading retries correctly
  • Get AI-Powered Analysis - Ask the AI assistant to analyze test results, explain conflicts, or design safe transaction patterns
  • Auto-Discover Schema - Connect to DSQL and automatically detect tables, columns, and potential hotspots (like sequential primary keys)

All running against a real Aurora DSQL cluster - not a simulation.
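For readers who have not written this kind of retry logic before, here is a minimal sketch of the pattern the tool is meant to exercise: retry on SQLSTATE 40001 with exponential backoff and full jitter. The function names, the 20 ms base, the 2000 ms cap and the 5-attempt limit are illustrative assumptions, not ZeroLock-Studio's actual code. The only driver behavior relied on is that node-postgres errors carry the SQLSTATE in their `code` field.

```typescript
// Hypothetical helper names and defaults; a sketch, not the project's implementation.
interface SqlStateError {
  code?: string; // node-postgres puts the SQLSTATE here
}

const SERIALIZATION_FAILURE = "40001";

// True when the error looks like an OCC conflict (SQLSTATE 40001).
function isOccConflict(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as SqlStateError).code === SERIALIZATION_FAILURE
  );
}

// Full jitter: pick a delay uniformly in [0, min(capMs, baseMs * 2^attempt)).
// `attempt` is 0 for the first retry. `rand` is injectable so tests are deterministic.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 20,
  capMs: number = 2000,
  rand: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(rand() * ceiling);
}

// Runs `run` (which should execute the whole transaction, BEGIN through COMMIT)
// and re-runs it after a jittered delay whenever it fails with an OCC conflict.
async function withOccRetry<T>(
  run: () => Promise<T>,
  maxAttempts: number = 5
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await run();
    } catch (err) {
      // Give up on non-conflict errors or once the attempt budget is spent.
      if (!isOccConflict(err) || attempt + 1 >= maxAttempts) throw err;
      const delay = backoffDelayMs(attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The callback passed to `withOccRetry` has to redo the entire transaction, reads included, because a rejected commit means the data it read may be stale.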

How we built it

Frontend:

  • Next.js 14 with React 19 and TypeScript
  • Monaco Editor for code editing
  • Recharts for real-time metrics visualization
  • React Flow for the visual transaction builder
  • Zustand for state management
  • Tailwind CSS for styling

Backend:

  • Next.js API routes with Server-Sent Events (SSE) for real-time streaming
  • AWS SDK for DSQL authentication (IAM-based token signing)
  • PostgreSQL driver (pg) for DSQL connections
  • OpenAI GPT-4 for the AI assistant

Database:

  • Amazon Aurora DSQL (eu-north-1 region)
  • Optimistic Concurrency Control with full ACID compliance

Key Technical Decisions:

  • SSE over WebSockets for simpler deployment on Vercel's serverless infrastructure
  • Zustand over Redux for lightweight, focused state management
  • Real DSQL execution (not mocked) to ensure authentic conflict behavior
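Part of what makes SSE simple to deploy is that the wire format is plain text over a normal HTTP response. The sketch below formats one telemetry update as an SSE message. The `TelemetrySnapshot` fields and the `telemetry` event name are assumptions made for illustration, not the project's real schema.

```typescript
// Hypothetical snapshot shape; field names are illustrative only.
interface TelemetrySnapshot {
  conflicts: number;
  retries: number;
  p50Ms: number;
  p95Ms: number;
  p99Ms: number;
}

// One SSE message: an optional "event:" line, a "data:" line, then a blank line.
// JSON.stringify emits no raw newlines, so a single "data:" line is enough.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

const snapshot: TelemetrySnapshot = {
  conflicts: 12,
  retries: 9,
  p50Ms: 30,
  p95Ms: 180,
  p99Ms: 500,
};
const frame: string = formatSseEvent("telemetry", snapshot);
```

On the server, frames like this are written to a streamed response with the `Content-Type: text/event-stream` header. In the browser, an `EventSource` with a listener for the `telemetry` event receives each one and parses `event.data` as JSON.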

Challenges we ran into

  1. SSE on Serverless - Vercel's serverless functions have timeout limits. We optimized execution batching to complete within limits while still providing meaningful stress tests.

  2. Triggering Real Conflicts - OCC conflicts require precise timing. We built a conflict injection system that intentionally overlaps transaction windows to guarantee conflicts occur predictably.

  3. Visualizing Backoff Distribution - Showing whether retry delays follow proper exponential backoff with jitter required building a custom heatmap that aggregates timing data across all retries.

  4. AI Context Management - The AI assistant needs both code context AND telemetry context. We built a system that passes current metrics alongside code for contextual recommendations.
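To make the conflict injection idea in challenge 2 concrete, one way to overlap transaction windows is a barrier: every worker runs its read phase, waits until all workers have read, and only then writes and commits. This is a sketch under our own assumptions (the `createBarrier` and `runOverlapping` names are hypothetical) and not necessarily how ZeroLock-Studio does it.

```typescript
// Returns an `arrive` function. Each caller waits until `parties` callers have arrived.
function createBarrier(parties: number): () => Promise<void> {
  let arrived = 0;
  let release: () => void = () => {};
  const gate = new Promise<void>((resolve) => {
    release = resolve; // the executor runs synchronously, so this is set immediately
  });
  return () => {
    arrived += 1;
    if (arrived >= parties) release();
    return gate;
  };
}

type Phase = (worker: number) => Promise<void>;

// Runs `workers` transactions so that no one writes before everyone has read.
// Resolves to one entry per worker: null on success, otherwise the caught error.
async function runOverlapping(
  workers: number,
  readPhase: Phase,
  writeAndCommit: Phase
): Promise<unknown[]> {
  const arrive = createBarrier(workers);
  const tasks: Promise<unknown>[] = [];
  for (let i = 0; i < workers; i++) {
    tasks.push(
      (async () => {
        let readError: unknown = null;
        try {
          await readPhase(i); // e.g. BEGIN + SELECT on the hot row
        } catch (err) {
          readError = err;
        }
        await arrive(); // always arrive so other workers are not left waiting
        if (readError !== null) return readError;
        try {
          await writeAndCommit(i); // e.g. UPDATE + COMMIT; losers should see 40001
          return null;
        } catch (err) {
          return err;
        }
      })()
    );
  }
  return Promise.all(tasks);
}
```

With real DSQL connections behind the two phases, every transaction's window overlaps the others, so we would expect one commit to win and the rest to come back as entries with `code` set to `40001`.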

Accomplishments that we're proud of

  • Real-time streaming that actually works - Watching conflicts tick up live as DSQL rejects transactions is incredibly satisfying
  • The "aha moment" - When you see P95 latency spike while P50 stays flat, you immediately understand tail latency from retries
  • Backoff validation - The heatmap instantly reveals if your retry logic is broken (clustered) or correct (distributed)
  • Zero simulation - Every conflict, every retry, every latency measurement comes from real Aurora DSQL execution
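The backoff heatmap boils down to a small aggregation: count retry delays per attempt number and per delay bucket. Here is a minimal sketch, assuming a hypothetical `RetrySample` shape with a 1-based `attempt` and fixed-width buckets. The real heatmap may bucket differently.

```typescript
// Hypothetical sample shape: which retry this was (1-based) and how long it waited.
interface RetrySample {
  attempt: number;
  delayMs: number;
}

// Returns a matrix where rows[attempt - 1][bucket] is the number of retries
// whose delay fell into that bucket. The last bucket absorbs all larger delays.
function buildBackoffHeatmap(
  samples: RetrySample[],
  bucketMs: number,
  bucketCount: number
): number[][] {
  const rows: number[][] = [];
  for (const s of samples) {
    if (s.attempt < 1 || s.delayMs < 0) continue; // ignore malformed samples
    while (rows.length < s.attempt) {
      rows.push(new Array<number>(bucketCount).fill(0));
    }
    const bucket = Math.min(bucketCount - 1, Math.floor(s.delayMs / bucketMs));
    rows[s.attempt - 1][bucket] += 1;
  }
  return rows;
}

const heatmap = buildBackoffHeatmap(
  [
    { attempt: 1, delayMs: 5 },
    { attempt: 1, delayMs: 7 },
    { attempt: 1, delayMs: 15 },
    { attempt: 2, delayMs: 35 },
    { attempt: 2, delayMs: 1000 },
  ],
  10, // bucket width in ms
  4 // number of buckets per row
);
```

Read row by row: if nearly all of a row's count sits in one cell, the retries for that attempt are clustered (no jitter). If the counts spread across the row, the jitter is doing its job.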

What we learned

  • Aurora DSQL's OCC is remarkably efficient - even under extreme contention, transactions that don't conflict succeed immediately
  • Exponential backoff without jitter causes a "thundering herd" - seeing this visualized made the concept click
  • P99 latency is where the pain hides - it's easy to ignore until you see a 500ms P99 next to a 30ms P50
  • Real testing beats documentation - 5 minutes with ZeroLock-Studio taught us more than hours of reading
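The P50-versus-P99 point is easy to reproduce with a few lines. Below is a small sketch using the nearest-rank percentile definition. The helper names are ours, and the sample numbers are made up to mirror the 30 ms / 500 ms example above rather than taken from a real run.

```typescript
// Nearest-rank percentile on an ascending-sorted array: the value at rank ceil(p/100 * n).
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return NaN;
  // (p * n) / 100 avoids the rounding error that (p / 100) * n can introduce.
  const rank = Math.max(1, Math.ceil((p * sortedAsc.length) / 100));
  return sortedAsc[rank - 1];
}

function summarizeLatencies(latenciesMs: number[]): { p50: number; p95: number; p99: number } {
  const sorted = latenciesMs.slice().sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}

// Made-up data: 97 transactions commit in 30 ms, 3 needed retries and took 500 ms.
const latencies: number[] = [];
for (let i = 0; i < 97; i++) latencies.push(30);
for (let i = 0; i < 3; i++) latencies.push(500);
const summary = summarizeLatencies(latencies);
```

Here the median and even P95 stay at 30 ms while P99 lands on 500 ms, which is exactly the kind of tail that averages and medians hide.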

What's next for ZeroLock-Studio

  • Pattern Library - Pre-built transaction patterns (transfers, counters, batch inserts) with proven retry strategies
  • Regression Testing - Save test configurations and run them automatically on code changes
  • Team Sharing - Share test results and code snippets with teammates
  • More Databases - Extend the chaos engineering approach to DynamoDB and other AWS databases
