Return Loop Project Story

Inspiration

It started with a hoodie.
A friend ordered a hoodie online, wore it once, decided it didn't fit right, and shipped it back. Two weeks later, she ordered the exact same hoodie in a different size from the same brand. Meanwhile, the original hoodie had traveled 1,400 miles to a warehouse in Ohio, sat in a bin for three weeks, and eventually got relisted at a 40% discount.
The brand paid to ship it out. Paid to ship it back. Paid someone to inspect and restock it. Lost margin on the resale. And entirely missed the fact that the same customer still wanted the product just in a different size.
This isn't an edge case. This is ecommerce in 2025.
$$\text{US return volume} \approx \$890\text{B/year} \quad \Rightarrow \quad \approx 16.9\% \text{ of all retail sales}$$
For fashion, that number climbs above 30%. For electronics, it's close to 20%. And with the rise of "bracket buying," where customers order multiple sizes intending to return most of them, the problem is accelerating faster than any warehouse can absorb.
We built Return Loop because the entire returns industry is optimized around processing returns efficiently, when the real opportunity is preventing them intelligently.
What We Built

Return Loop is a multi-agent AI platform that intercepts the returns pipeline at every stage (before a return is requested, during the negotiation, and after it's accepted) to maximize value recovery for ecommerce merchants.
The system runs five specialized AI agents, each owning a distinct phase of the return lifecycle:
| Agent | Role |
|-------|------|
| Prophet | Predicts returns before they're initiated; sends proactive interventions |
| Whisperer | Calls customers via voice AI to negotiate better outcomes |
| Loop Matcher | Reroutes accepted returns directly to nearby pending orders |
| Recoverer | Maximizes value on non-routable returns (refurbish, donate, liquidate) |
| Learner | Aggregates patterns across all returns to surface actionable brand insights |

The platform integrates with Shopify via webhooks, syncs order and customer data through Airbyte, uses Bland AI for outbound voice calls, and exposes a real-time dashboard with live agent trace visualization over WebSockets.
How We Built It

Backend: FastAPI + Claude + Multi-Agent Orchestration

The backbone is a Python FastAPI service with an event-driven orchestration layer. When a return is initiated, an event fires on a central bus and agents subscribe to relevant events in sequence:
RETURN_INITIATED → Prophet (pre-return prevention) → Whisperer (voice negotiation) → NEGOTIATION_COMPLETE → Loop Matcher (geospatial rerouting) → Recoverer (value recovery if no match) → Learner (pattern aggregation)

Each agent is powered by Claude (claude-sonnet-4-5) with carefully constructed system prompts that encode domain logic (customer LTV weighting, risk scoring, environmental cost models) so the LLM reasons within real business constraints, not just raw instructions.
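The event-gated flow above can be sketched as a minimal synchronous bus. This is an illustrative stand-in, not the production orchestrator; handler names like `prophet` and the event names are taken from the pipeline description:

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Minimal synchronous event bus: agents subscribe to named events,
    and each stage fires only after the previous stage emits its event."""

    def __init__(self):
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event].append(handler)

    def emit(self, event: str, payload: dict) -> None:
        # Handlers run in subscription order; a handler may emit a
        # follow-up event, which gates the next pipeline stage.
        for handler in self._subscribers[event]:
            handler(payload)

bus = EventBus()
log: list[str] = []

def prophet(payload: dict) -> None:
    log.append("prophet")                      # pre-return prevention
    bus.emit("NEGOTIATION_COMPLETE", payload)  # gate the next stage

def loop_matcher(payload: dict) -> None:
    log.append("loop_matcher")                 # geospatial rerouting

bus.subscribe("RETURN_INITIATED", prophet)
bus.subscribe("NEGOTIATION_COMPLETE", loop_matcher)

bus.emit("RETURN_INITIATED", {"return_id": 1})
```

Because each agent only runs off an emitted event, the ordering is enforced by the bus rather than by ad-hoc sequencing in each agent.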
Every agent decision is written to an AgentTrace table with full reasoning, confidence scores, and timing, then broadcast to the frontend via WebSocket for live visualization.
Geospatial Matching: The Loop

The Loop Matcher is where the system earns its name. For every accepted return, it queries Aerospike (an in-memory database optimized for geospatial workloads) for pending orders of the same SKU and size within a configurable radius.
Distance is calculated using the Haversine formula:
$$d = 2r \arcsin\left(\sqrt{\sin^2\!\left(\frac{\phi_2 - \phi_1}{2}\right) + \cos\phi_1 \cos\phi_2 \sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right)$$
where $\phi$ is latitude, $\lambda$ is longitude, and $r = 6{,}371\text{ km}$.
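The formula translates directly into a few lines of Python; this is a standard haversine implementation, not necessarily the project's exact code:

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)    # Δφ
    dlam = radians(lon2 - lon1)    # Δλ
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# New York City to Philadelphia: roughly 130 km
d = haversine_km(40.7128, -74.0060, 39.9526, -75.1652)
```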
The savings are then computed against the counterfactual: what would it have cost to route through the central warehouse?
$$\text{Cost Saved} = C_{\text{warehouse route}} - C_{\text{direct route}}$$
$$C = C_{\text{base}} + r_{\text{per-mile}} \cdot d \quad \text{where } r_{\text{per-mile}} \approx \$0.035$$
$$\text{CO}_2\text{ Avoided (kg)} = 0.06 \cdot \Delta d_{\text{miles}}$$
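Worked through in code, the two formulas above look like this. The per-mile rate and CO₂ factor come from the equations; the flat base cost is a hypothetical value (it cancels out of the savings anyway):

```python
PER_MILE_RATE = 0.035   # $/mile, from the cost model above
BASE_COST = 4.00        # flat handling cost per shipment (assumed)
CO2_PER_MILE_KG = 0.06  # kg CO2 avoided per mile of shipping avoided

def route_cost(distance_miles: float) -> float:
    return BASE_COST + PER_MILE_RATE * distance_miles

def savings(warehouse_miles: float, direct_miles: float) -> tuple[float, float]:
    """Cost saved and CO2 avoided by shipping direct instead of via warehouse."""
    cost_saved = route_cost(warehouse_miles) - route_cost(direct_miles)
    co2_avoided = CO2_PER_MILE_KG * (warehouse_miles - direct_miles)
    return cost_saved, co2_avoided

# The hoodie scenario: 1,400 miles to the warehouse vs. 12 miles direct
cost, co2 = savings(warehouse_miles=1400.0, direct_miles=12.0)
# cost ≈ $48.58 saved, co2 ≈ 83.28 kg avoided
```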
The candidate list is then passed to Claude for final routing decisions, weighted by distance, cost delta, CO₂ impact, and recipient risk score, so we don't reroute to customers who themselves have a history of high return rates.
Voice Negotiation: The Whisperer

The Whisperer agent uses Bland AI to place real outbound calls. The call script adapts dynamically based on customer LTV: high-value customers get more aggressive retention offers (partial refunds, discounts to keep the item), while lower-LTV customers are guided toward exchanges or store credit.
Outcomes are classified into one of five categories:
- exchange: customer wants a different variant
- keep_with_partial_refund: customer keeps the item for a partial refund
- keep_with_discount: smaller incentive, full keep
- store_credit: customer accepts credit instead of a cash refund
- full_return: the return proceeds

The voice transcript is parsed and broadcast live to the dashboard, giving merchants a window into exactly how their customers are talking about their products.
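A closed set of outcomes like this maps naturally onto an enum, so downstream agents never act on a free-text label. A sketch, with a hypothetical class name and a conservative fallback:

```python
from enum import Enum

class NegotiationOutcome(str, Enum):
    EXCHANGE = "exchange"
    KEEP_WITH_PARTIAL_REFUND = "keep_with_partial_refund"
    KEEP_WITH_DISCOUNT = "keep_with_discount"
    STORE_CREDIT = "store_credit"
    FULL_RETURN = "full_return"

def classify(raw: str) -> NegotiationOutcome:
    """Map a raw label from call analysis onto the closed set; fall back
    to full_return so an unparseable result never blocks the customer."""
    try:
        return NegotiationOutcome(raw.strip().lower())
    except ValueError:
        return NegotiationOutcome.FULL_RETURN
```

Falling back to `full_return` is the safe default: a parsing failure degrades to the normal return flow instead of stranding the customer.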
Frontend: React + Mapbox + WebSocket

The dashboard is built in React 19 with Tailwind CSS and Mapbox GL for route visualization. A custom useWebSocket hook maintains a persistent connection to the backend and dispatches incoming agent events into component state in real time.
The route map renders two overlapping layers per rerouted return:
- A green arc for the actual direct reroute path
- A red dashed arc for the avoided warehouse route

Savings badges float over each route segment, making the impact immediately tangible.
Challenges We Faced
- Orchestrating agents without race conditions
Multi-agent pipelines are deceptively hard to sequence. Early versions had the Whisperer and Loop Matcher firing simultaneously, which meant we'd sometimes start rerouting a return that the Whisperer was still negotiating down. We rebuilt the orchestrator around a strict event-gated model: each agent can only fire after the preceding event is emitted, with explicit state transitions on the ReturnRequest model enforcing the pipeline order.
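The explicit state transitions can be sketched as a tiny state machine on the return request. This is illustrative (the state names are assumed, not the project's actual model fields), but it shows how an illegal jump, like matching before negotiation completes, is rejected:

```python
# Allowed transitions, in pipeline order: each state has exactly one successor.
ALLOWED = {
    "initiated": "negotiating",
    "negotiating": "negotiation_complete",
    "negotiation_complete": "matching",
    "matching": "resolved",
}

class ReturnRequest:
    def __init__(self):
        self.state = "initiated"

    def advance(self, to_state: str) -> None:
        """Move to the next pipeline state; reject out-of-order jumps."""
        if ALLOWED.get(self.state) != to_state:
            raise RuntimeError(f"illegal transition {self.state} -> {to_state}")
        self.state = to_state

req = ReturnRequest()
req.advance("negotiating")   # Whisperer may start
# req.advance("matching") would raise: matching is gated on negotiation_complete
```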
- Making LLM reasoning deterministic enough to act on
Claude is excellent at nuanced reasoning, but we needed structured, actionable outputs, not essays. We spent significant time iterating on system prompts to get consistent JSON-formatted decisions with confidence scores, while still preserving the richness of the reasoning that makes the Learner agent's insights genuinely useful. The trick was treating the LLM as a reasoner that populates a schema, not a generator that outputs freeform text.
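"Populating a schema" in practice means validating every LLM response before acting on it. A minimal sketch, assuming a hypothetical three-field decision schema; on drift the orchestrator can retry rather than act on malformed output:

```python
import json

# Hypothetical decision schema: field name -> required type
REQUIRED = {"decision": str, "confidence": float, "reasoning": str}

def parse_agent_decision(raw: str) -> dict:
    """Validate an LLM response against the decision schema.

    Raises ValueError on missing/mistyped fields or an out-of-range
    confidence, so callers can retry instead of acting on bad output.
    """
    data = json.loads(raw)
    for field, typ in REQUIRED.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"missing or mistyped field: {field}")
    if not 0.0 <= data["confidence"] <= 1.0:
        raise ValueError("confidence out of range")
    return data

decision = parse_agent_decision(
    '{"decision": "reroute", "confidence": 0.87, "reasoning": "nearby match"}'
)
```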
- Geospatial matching at query time
Performing radius-based SKU matching on PostgreSQL with naive queries was too slow for real-time use. We added Aerospike as an in-memory geospatial index layer: orders are written to both databases on creation, and the Loop Matcher queries Aerospike exclusively for neighbor lookups before handing off to Postgres for the full order details.
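The lookup pattern can be sketched in pure Python: index pending orders by (SKU, size), then filter by haversine radius. This linear-scan stand-in is just to show the shape of the query; the real system delegates the radius filter to the database's geospatial index:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    a = (sin(radians(lat2 - lat1) / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2))
         * sin(radians(lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))

def nearby_orders(index, sku, size, lat, lon, radius_km):
    """Pending orders of the same SKU/size within radius_km of the return."""
    return [o for o in index.get((sku, size), [])
            if haversine_km(lat, lon, o["lat"], o["lon"]) <= radius_km]

# Toy index: one order in Manhattan, one in Los Angeles (SKU name is made up)
index = {("HOODIE-01", "M"): [
    {"order_id": 7, "lat": 40.71, "lon": -74.00},
    {"order_id": 9, "lat": 34.05, "lon": -118.24},
]}
matches = nearby_orders(index, "HOODIE-01", "M", 40.73, -73.99, 50.0)
# only order 7 falls within 50 km of the return's location
```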
- Voice call state management
Bland AI operates asynchronously: you initiate a call, and results come back via webhook. Bridging that async gap into our synchronous agent pipeline required a polling fallback with timeout handling and a call context map to correlate completed calls back to their originating return requests. Getting this right without leaking state across concurrent returns took several iterations.
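The webhook-first, poll-as-fallback pattern looks roughly like this. A sketch with hypothetical names: the webhook handler is assumed to drop results into `context_map`, and `poll_fn` stands in for a direct status query to the provider:

```python
import time

def await_call_result(call_id, context_map, poll_fn,
                      timeout_s=120.0, interval_s=2.0):
    """Wait for an async voice-call result.

    The webhook normally fills context_map[call_id]; if it never arrives,
    poll the provider directly until the timeout. Popping the entry keeps
    state from leaking across concurrent returns.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if call_id in context_map:       # webhook landed first
            return context_map.pop(call_id)
        result = poll_fn(call_id)        # polling fallback
        if result is not None:
            return result
        time.sleep(interval_s)
    raise TimeoutError(f"call {call_id} did not complete in {timeout_s}s")

# Usage with a stub poller that succeeds on the second attempt:
attempts = {"n": 0}
def fake_poll(call_id):
    attempts["n"] += 1
    return {"outcome": "exchange"} if attempts["n"] >= 2 else None

result = await_call_result("call-123", {}, fake_poll,
                           timeout_s=10.0, interval_s=0.01)
```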
What We Learned

The most valuable AI is the AI that prevents work, not just processes it. The Prophet agent, which prevents returns from being initiated at all, has the highest ROI of any component, and it's also the simplest. Proactive is almost always cheaper than reactive.
Voice AI is ready for production use cases. We were skeptical that customers would engage meaningfully with an AI negotiation call. The transcripts surprised us: customers respond to the framing of the conversation more than the channel it comes through.
Multi-agent systems need explicit contracts between agents, not just loose coupling. The event bus pattern forced us to define exactly what each agent produces and consumes, which made the whole system dramatically easier to reason about and debug.
Environmental metrics are a competitive differentiator, not just a feel-good dashboard. Early user feedback consistently called out the CO₂ tracking as something they'd actively use in sustainability reporting. The numbers are real: direct rerouting vs. warehouse routing saves a measurable, attributable amount of emissions per return.
Return Loop was built to turn the most expensive moment in ecommerce, the moment a customer gives up on a product, into the most intelligent one.