Overview
Redundant is a trace-level profiler and cost firewall for multi-agent AI workflows.
Inspiration
Multi-agent runs are becoming increasingly popular. We noticed they get expensive fast, with no easy-to-digest insight into why. A single research assistant can make dozens of tool and LLM calls, many of which are, well, redundant! We were inspired to build a tool that goes beyond prompt caching: one that catches issues like runaway loops, tool repetition, and multi-agent redundancy, and provides fixes that save users real money.
What it does
Redundant sees the entire execution trace, finds waste that call-level tools can't see, pins a cost to each finding, and routes it to a fix or an alert. There are two main detectors: one that catches repetitive tool calls and one that finds cycles, which flags agent loops. Findings are priced with a per-model token cost and shown in the UI as $X wasted of $Y total, along with a percentage to account for model pricing differences. Read-only redundancy is routed to LangCache, which is gated so we never serve an unsafe or stale result. High-count or side-effect repetition fires a Sentry alert on the dashboard, since this suggests a reliability incident. The dashboard UI renders the run as a flamegraph with issues in red, and prompts fixes such as a re-run that serves duplicated calls from the cache, so you can see the cost drop in real time.
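To make the "$X wasted of $Y total" figure concrete, here is a minimal sketch of how a run could be priced from per-model token costs. The `Span` shape, the model names, and the prices are all placeholders made up for illustration; they are not Redundant's actual schema or any vendor's real rates.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens; placeholders, not real rates.
PRICE_PER_1K_TOKENS = {"model-small": 0.0005, "model-large": 0.01}


@dataclass
class Span:
    span_id: str
    model: str
    tokens: int
    redundant: bool = False  # assumed to be set by a detector upstream


def span_cost(span: Span) -> float:
    """Cost of one span in USD, using the per-model token price."""
    return span.tokens / 1000 * PRICE_PER_1K_TOKENS[span.model]


def price_run(spans: list[Span]) -> dict:
    """Summarize a run as wasted dollars, total dollars, and wasted percentage."""
    total = sum(span_cost(s) for s in spans)
    wasted = sum(span_cost(s) for s in spans if s.redundant)
    pct = 100 * wasted / total if total else 0.0
    return {
        "wasted_usd": round(wasted, 6),
        "total_usd": round(total, 6),
        "wasted_pct": round(pct, 1),
    }


spans = [
    Span("a", "model-large", 2000),
    Span("b", "model-large", 2000, redundant=True),
    Span("c", "model-small", 4000, redundant=True),
    Span("d", "model-small", 8000),
]
report = price_run(spans)
print(f"${report['wasted_usd']} wasted of ${report['total_usd']} total "
      f"({report['wasted_pct']}%)")
```

Reporting the percentage next to the dollar amounts keeps runs comparable even when they use models with very different prices.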
How we built it
The backend is Python and FastAPI. Redis serves as both the trace bus (Streams) and the cache (LangCache). Detection uses simple frequency grouping plus networkx for cycle detection. The dashboard is React + Vite with a custom SVG flamegraph.
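As a rough illustration of the two detection techniques named above, the sketch below groups calls by tool and arguments with a `Counter`, and finds loops in an agent handoff graph with `networkx.simple_cycles`. The call tuples, the handoff edges, and the `min_count` threshold are assumptions for the example, not our real span format or settings.

```python
from collections import Counter

import networkx as nx

# Hypothetical call records: (agent, tool, canonicalized arguments).
calls = [
    ("researcher", "web_search", "q=redis streams"),
    ("writer", "web_search", "q=redis streams"),
    ("researcher", "web_search", "q=redis streams"),
    ("researcher", "fetch_url", "url=example.com"),
]

# Hypothetical agent-to-agent handoffs observed in one run.
handoffs = [
    ("planner", "researcher"),
    ("researcher", "critic"),
    ("critic", "researcher"),
    ("critic", "writer"),
]


def repeated_calls(calls, min_count=2):
    """Frequency grouping: keep (tool, args) groups seen at least min_count times."""
    counts = Counter((tool, args) for _agent, tool, args in calls)
    return {key: n for key, n in counts.items() if n >= min_count}


def agent_loops(handoffs):
    """Cycle detection: return every simple cycle in the directed handoff graph."""
    graph = nx.DiGraph()
    graph.add_edges_from(handoffs)
    return list(nx.simple_cycles(graph))


duplicates = repeated_calls(calls)
loops = agent_loops(handoffs)
print(duplicates)
print(loops)
```

Grouping on the tool and its arguments, rather than on the calling agent, is what lets the same search issued by two different agents count as one redundant group.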
Band generates the demo trace, emitting one span per call on Redis Streams. The ingestion layer reads the stream, normalizes spans into an in-memory tree and call graph, and feeds the detectors. Findings flow out of a single FastAPI contract that both the UI and the remediation router consume: cacheable findings round-trip through LangCache, while runaway findings go to Sentry via the SDK.
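The normalization step can be sketched roughly as follows. Stream entries arrive as flat string fields, so this example assumes each span carries a `span_id`, a `parent_id` (empty for the root), an `agent`, and a `name`; those field names are hypothetical, and the stream read itself is left out so the snippet runs without a Redis server.

```python
from collections import defaultdict

# Hypothetical flat span records, shaped like decoded stream entries.
raw_spans = [
    {"span_id": "1", "parent_id": "", "agent": "planner", "name": "plan"},
    {"span_id": "2", "parent_id": "1", "agent": "researcher", "name": "web_search"},
    {"span_id": "3", "parent_id": "1", "agent": "writer", "name": "draft"},
    {"span_id": "4", "parent_id": "2", "agent": "researcher", "name": "fetch_url"},
]


def build_tree(spans):
    """Return (root ids, parent id -> child ids) for a flat list of spans."""
    children = defaultdict(list)
    roots = []
    for span in spans:
        if span["parent_id"]:
            children[span["parent_id"]].append(span["span_id"])
        else:
            roots.append(span["span_id"])
    return roots, dict(children)


def build_call_graph(spans):
    """Return agent-to-agent edges wherever a child span runs under another agent."""
    by_id = {span["span_id"]: span for span in spans}
    edges = set()
    for span in spans:
        parent = by_id.get(span["parent_id"])
        if parent and parent["agent"] != span["agent"]:
            edges.add((parent["agent"], span["agent"]))
    return edges


roots, children = build_tree(raw_spans)
call_graph = build_call_graph(raw_spans)
print(roots, children, call_graph)
```

The tree is the natural input for a flamegraph, while the agent-level edge set is the kind of graph a cycle detector can work on.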
Challenges we ran into
Wiring the Band agents, Redis vector search, and the front end together was our biggest challenge. We had to coordinate closely as a team to get all of the pieces working together.
Accomplishments that we're proud of
We are proud of building a working end-to-end product with several integrations within the short hackathon timeframe. Redundant actually traces multi-agent flows from Band through Redis Streams into our detectors, prices findings in real dollars on real spans, and routes each one to the right place. The most rewarding part for us was getting the fixes working and watching one finding go to LangCache while another fired a Sentry alert in real time.
What we learned
We learned the value of stepping back from call-level AI cost tools and considering the issues that surface only when you trace the run as a whole. We also learned a lot about Redis tooling and about when caching is safe versus unsafe, which we enforce with a read-only allowlist.
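A read-only allowlist gate can be as small as the sketch below: only repeated calls to tools known to be free of side effects are eligible for caching, and everything else, or anything repeated suspiciously often, becomes an alert. The tool names and the threshold are invented for illustration, and the function only returns a label rather than calling LangCache or Sentry.

```python
# Hypothetical allowlist of tools with no side effects.
READ_ONLY_TOOLS = frozenset({"web_search", "fetch_url", "read_file"})

# Hypothetical repeat count above which repetition looks like a runaway loop.
ALERT_THRESHOLD = 5


def route_finding(tool: str, repeat_count: int) -> str:
    """Decide where a repeated-call finding goes: 'cache' or 'alert'."""
    if tool not in READ_ONLY_TOOLS:
        # Replaying a cached result for a side-effecting call is unsafe.
        return "alert"
    if repeat_count >= ALERT_THRESHOLD:
        # Very high counts suggest a reliability incident, not just waste.
        return "alert"
    return "cache"


print(route_finding("web_search", 3))   # read-only, modest repetition
print(route_finding("send_email", 2))   # side effect, never cached
print(route_finding("web_search", 12))  # read-only but runaway
```

Defaulting to "alert" for any tool missing from the allowlist means a newly added tool is treated as unsafe to cache until someone explicitly marks it read-only.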
What's next for Redundant
We had some reach goals that we didn't quite have time for:
- A compression path for calls that are unsafe to cache but have bloated prompts, plus prompt-cache layout hints for the underlying model.
- A labeled evaluation set (via Terac) to validate the verifier's reuse decisions on held-out call pairs, comparing raw similarity against verifier-gated reuse, with the numbers to back it up.
Built With
- fastapi
- javascript
- microsoft-band
- networkx
- openai-api
- pydantic
- python
- react
- redis
- redis-langcache
- redis-streams
- sentry
- sse-starlette
- typescript
- uvicorn
- vite