Building your own agent is hard, especially if you don't have the computing power or experience to fine-tune a model. Even with vibe coding, AI tools often struggle to identify logical errors and pinpoint their root cause. We wanted to create a tool that allows an agent to fix itself using real-time data, ensuring it does what it's supposed to do.

Retrace AI detects logical errors in your agent, retraces its steps, pulls real-time data, verifies the information it collects, and automatically optimizes its code and workflows to ensure reliable performance.

We implemented an orchestrator that monitors the agent. When it detects logical errors, it triggers a multi-step repair pipeline: it generates scenario-based test data, feeds it to the agent, and calculates a baseline accuracy score from the agent's outputs.
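The baseline step can be sketched as follows. This is a minimal illustration, not our actual orchestrator: the `Scenario` fields, the exact-match scoring rule, and the function names are all placeholders.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Scenario:
    prompt: str    # input fed to the agent
    expected: str  # output we expect for that input

def baseline_accuracy(agent: Callable[[str], str], scenarios: list[Scenario]) -> float:
    """Run every scenario through the agent and score exact-match accuracy."""
    if not scenarios:
        return 0.0
    hits = sum(1 for s in scenarios if agent(s.prompt) == s.expected)
    return hits / len(scenarios)

# Trivial stand-in agent that upper-cases its input.
demo_agent = lambda prompt: prompt.upper()
scenarios = [Scenario("ok", "OK"), Scenario("fail", "nope")]
print(baseline_accuracy(demo_agent, scenarios))  # 0.5
```

In practice the scoring would be fuzzier than exact match, but the idea is the same: the score taken before any fix becomes the baseline that later changes are measured against.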

This is followed by a self-healing process with two triggers: code optimization and workflow optimization.
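Routing between the two triggers can be sketched like this. The error categories and the routing rule are illustrative assumptions, not the real classifier:

```python
def choose_trigger(error: dict) -> str:
    """Route a detected logical error to one of the two self-healing paths.
    The category names here are hypothetical examples."""
    code_level = {"api_change", "deprecated_call", "type_error"}
    if error.get("category") in code_level:
        return "code_optimization"      # needs deep research into docs/codebase
    return "workflow_optimization"      # adjust decision pathways via search

print(choose_trigger({"category": "deprecated_call"}))  # code_optimization
print(choose_trigger({"category": "wrong_branch"}))     # workflow_optimization
```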

We use a Yutori deep research agent to pull real-time documentation on the model and codebase, which informs code optimization. For workflow optimization, we use Tavily search to adjust operational logic and decision pathways, resolving simpler issues quickly without requiring deep research.
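The escalation pattern (cheap real-time search first, deep research only when search comes up empty) can be sketched with stub callables standing in for the Tavily and Yutori integrations. The function names and stub behavior are ours; neither service's real API is shown here:

```python
from typing import Callable, Optional

def resolve_issue(issue: str,
                  quick_search: Callable[[str], Optional[str]],
                  deep_research: Callable[[str], str]) -> str:
    """Try the cheap real-time search first; escalate to deep research
    only when search returns nothing usable."""
    finding = quick_search(issue)
    return finding if finding else deep_research(issue)

# Stubs standing in for Tavily search and the Yutori deep research agent.
search = lambda q: "cached doc snippet" if "simple" in q else None
research = lambda q: "full documentation report"

print(resolve_issue("simple retry bug", search, research))     # cached doc snippet
print(resolve_issue("breaking API change", search, research))  # full documentation report
```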

All collected information is saved locally and run through Senso when the agent makes changes. This minimizes hallucinations and ensures fixes are aligned with verified research.
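The verification gate amounts to refusing any proposed change that isn't backed by saved research. A toy stand-in for that check (the `claims`/`notes` structure and substring matching are simplifications, not Senso's actual mechanism):

```python
def verified_fix(change: dict, research_notes: list[str]) -> bool:
    """Accept a proposed change only if every claim it makes appears in
    a locally saved research note. A stand-in for the verification pass."""
    return all(any(claim in note for note in research_notes)
               for claim in change["claims"])

notes = ["retry_with_backoff is the recommended pattern as of v2"]
good = {"claims": ["retry_with_backoff"]}
bad = {"claims": ["undocumented_flag"]}
print(verified_fix(good, notes))  # True
print(verified_fix(bad, notes))   # False
```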

Each change creates a node, and the entire process is mapped using Neo4j to track workflow evolution — showing how the agent's accuracy improves over time and whether improvements came from code fixes or decision optimization.
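A change node and its link to the previous change can be expressed in Cypher roughly as below. The `Change` label, property names, and `NEXT` relationship are a guess at a plausible schema, not the one we actually deployed:

```python
def change_node_cypher(prev_id: str, change_id: str, kind: str, accuracy: float) -> str:
    """Build the Cypher that records one self-healing change as a node and
    chains it to the previous change, so accuracy can be charted over time."""
    return (
        f"MATCH (p:Change {{id: '{prev_id}'}}) "
        f"CREATE (c:Change {{id: '{change_id}', kind: '{kind}', accuracy: {accuracy}}}) "
        f"CREATE (p)-[:NEXT]->(c)"
    )

query = change_node_cypher("c1", "c2", "code_fix", 0.82)
print(query)
```

Walking the `NEXT` chain then shows how accuracy evolved and whether each gain came from a `code_fix` or a workflow change.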

Integrating Yutori and Tavily into the agent’s logic was difficult at first, as the system tended to hallucinate and ignore research findings. Once we better understood how both services worked, the integration became much more effective.

Handling edge cases was also challenging, but we addressed this by implementing model fallbacks.
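The fallback idea is simple to sketch: try each model in order and move on when one fails. The stub models and function name below are illustrative:

```python
from typing import Callable

def with_fallbacks(models: list[Callable[[str], str]], prompt: str) -> str:
    """Try each model in order; fall through to the next on any exception."""
    last_error = None
    for model in models:
        try:
            return model(prompt)
        except Exception as exc:
            last_error = exc
    raise RuntimeError("all models failed") from last_error

def flaky(prompt: str) -> str:
    raise TimeoutError("primary model timed out")

stable = lambda prompt: f"answer to: {prompt}"
print(with_fallbacks([flaky, stable], "edge case"))  # answer to: edge case
```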

We were especially proud when we first saw measurable improvement in the agent’s performance. The addition of deep research and real-time search made a significant impact, as live data proved far more effective at resolving logical errors than relying solely on an LLM.

Successfully integrating Senso was also a major milestone, as it was a new tool for us.

This project taught us that real-time data is critical when debugging logical errors, especially because documentation constantly evolves and new models are released frequently.

For our future vision, we want Retrace to move from reactive debugging to proactive optimization.

Instead of waiting for failures, Retrace will continuously simulate edge cases and stress-test agents before deployment, identifying logical weaknesses before they appear in production.

We’re also expanding from single-agent monitoring to full multi-agent system alignment, allowing Retrace to detect coordination breakdowns between agents, not just individual mistakes.

Long term, our goal is to make self-healing a native layer of AI systems — an always-on reliability engine that keeps agents aligned with real-world constraints, evolving documentation, and changing objectives over time.
