Inspiration

Modern backend systems rely heavily on internal APIs that evolve frequently. Every change—like a date format shift or field rename—can silently break downstream services. We wanted to build a system that not only catches these failures in real time but can resync itself automatically without manual intervention.

What it does

Resync is a self-healing infrastructure tool that monitors backend API calls, detects breaking changes, and uses an AI agent to automatically rewrite and redeploy the affected code. When an internal API shifts behavior, Resync identifies the failure, updates the request logic, and restores full functionality—no human needed.

How we built it

  • MCP Server: A Node.js/Express service making outbound API calls to a mock internal API.
  • Mock Internal API: Simulates changing requirements (e.g., date format) to induce breakages.
  • Phoenix: Used for real-time observability and tracing failed calls.
  • Toolhouse: Hosts an AI agent that monitors logs, analyzes errors, edits the code, and pushes fixes to GitHub.
  • CI/CD Pipeline: Automatically deploys changes to Fly.io when new commits are made.
  • Fly.io: Hosts the MCP server, providing instant redeploys.

Challenges we ran into

  • Designing a clear interface for the AI agent to understand and modify code safely.
  • Managing error parsing and determining which failures were due to code issues vs. transient network errors.
  • Making the feedback loop short and robust enough to trigger a deploy without false positives.

Accomplishments that we're proud of

  • Demonstrated a real, working loop where an API failure is detected, fixed, and redeployed—all without human intervention.
  • Created a flexible framework that could be applied to many backend environments beyond just the demo.
  • Made the AI agent smart enough to interpret error messages and find the correct patch without explicit instructions.

What we learned

  • API failures are rarely isolated—good observability is key to identifying root causes.
  • AI can be a powerful tool for infrastructure resilience if it's guided by real production signals.
  • Automating deployment is as critical as detecting the issue—half the value is in removing manual bottlenecks.

What's next for Resync

  • Integrate support for more observability platforms like Datadog and Sentry.
  • Expand the AI agent's capabilities to handle more complex code changes and multi-file dependencies.
  • Build a visual dashboard to show the history of errors and automated fixes.
  • Extend to frontend/backend integrations (e.g., auto-fixing React apps when APIs change).

Built With

Share this project:

Updates