EchoBrain AI Router

Inspiration

We kept seeing big models used for small problems, which wastes compute and money and adds avoidable carbon emissions. We wanted a straightforward way to make sustainability the default. EchoBrain predicts how hard a task really is and matches it to the smallest model that still gets the job done, then shows the impact clearly.

What it does

EchoBrain takes a prompt, asks Gemini to judge the task as easy, medium, or heavy, and then compares three lanes called Fast, Balanced, and Deep. For each lane, the app estimates energy in kilowatt-hours and emissions in grams of CO₂, shows a value-per-kilowatt-hour score (the ratio of quality to energy), highlights the savings from adaptive scaling, and suggests practical efficiency moves like quantization or distillation. It also recommends the lane that offers the best value for the predicted difficulty.
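
The recommendation step can be sketched roughly as follows. This is a minimal illustration, not EchoBrain's actual logic: the quality table, the per-lane energy figures, the `MIN_QUALITY` floor, and the function names are all hypothetical placeholders chosen to show how a quality-to-energy ratio can pick a lane.

```python
# Hypothetical quality scores (0 to 1) per predicted difficulty and lane.
QUALITY = {
    "easy":   {"fast": 0.90, "balanced": 0.93, "deep": 0.95},
    "medium": {"fast": 0.70, "balanced": 0.88, "deep": 0.93},
    "heavy":  {"fast": 0.45, "balanced": 0.72, "deep": 0.92},
}

# Hypothetical energy per request for each lane, in kilowatt-hours.
ENERGY_KWH = {"fast": 0.0003, "balanced": 0.001, "deep": 0.002}

# Assumed quality floor: a lane below this is not considered good enough.
MIN_QUALITY = 0.85


def value_per_kwh(quality: float, energy_kwh: float) -> float:
    """Quality-to-energy ratio: higher means more quality per unit of energy."""
    return quality / energy_kwh


def recommend_lane(difficulty: str) -> str:
    """Pick the best-value lane among those that meet the quality floor."""
    scores = QUALITY[difficulty]
    eligible = [lane for lane, q in scores.items() if q >= MIN_QUALITY]
    if not eligible:
        # No lane clears the floor, so compare all of them instead.
        eligible = list(scores)
    return max(eligible, key=lambda lane: value_per_kwh(scores[lane], ENERGY_KWH[lane]))


if __name__ == "__main__":
    for level in ("easy", "medium", "heavy"):
        print(level, "->", recommend_lane(level))
```

With these placeholder numbers, easy tasks resolve to the fast lane, medium tasks to balanced, and heavy tasks to deep. The quality floor is what stops the cheapest lane from winning every time on ratio alone.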

How we built it

The backend is a small FastAPI service that calls the Gemini API with a strict JSON schema and computes per-lane metrics using transparent constants. Tokens are estimated with a simple character-to-token rule that you can override. Emissions are computed as energy multiplied by a configurable carbon intensity. The frontend is a lightweight page built with Tailwind and Chart.js that shows a classification card, side-by-side lane summaries, two clear bar charts, and a short recommendation.
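
As a rough sketch of the per-lane math described above, assuming a characters-per-token rule, an energy cost per 1,000 tokens for each lane, and a single carbon intensity value: every constant and name here is an illustrative placeholder, not one of the project's real values.

```python
from dataclasses import dataclass

# Illustrative placeholders only; the project's real constants may differ.
CHARS_PER_TOKEN = 4.0                 # assumed character-to-token rule
CARBON_INTENSITY_G_PER_KWH = 400.0    # assumed grid intensity, gCO2 per kWh

# Hypothetical energy cost per 1,000 tokens for each lane, in kWh.
KWH_PER_1K_TOKENS = {"fast": 0.0003, "balanced": 0.001, "deep": 0.002}


@dataclass
class LaneMetrics:
    lane: str
    tokens: int
    energy_kwh: float
    emissions_g: float


def estimate_tokens(prompt: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Estimate tokens from character count; never returns less than 1."""
    return max(1, round(len(prompt) / chars_per_token))


def lane_metrics(
    prompt: str,
    carbon_intensity: float = CARBON_INTENSITY_G_PER_KWH,
    chars_per_token: float = CHARS_PER_TOKEN,
) -> list[LaneMetrics]:
    """Compute estimated energy and emissions for every lane."""
    tokens = estimate_tokens(prompt, chars_per_token)
    results = []
    for lane, kwh_per_1k in KWH_PER_1K_TOKENS.items():
        energy_kwh = tokens / 1000 * kwh_per_1k
        # Emissions are energy multiplied by carbon intensity.
        results.append(LaneMetrics(lane, tokens, energy_kwh, energy_kwh * carbon_intensity))
    return results


if __name__ == "__main__":
    for m in lane_metrics("Summarize this paragraph in one sentence."):
        print(f"{m.lane}: {m.energy_kwh:.8f} kWh, {m.emissions_g:.6f} g CO2")
```

Both the token rule and the carbon intensity are plain keyword arguments, which mirrors the idea that the constants stay transparent and overridable.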

Challenges we ran into

Getting Gemini to return clean JSON every time required careful prompts and defensive parsing. Balancing credibility with simplicity in the UI took iteration so that judges could see the trade-offs in seconds. We also had to make the math both honest and tunable since energy and quality vary by hardware and model choice.
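
One way to do defensive parsing of a model reply is sketched below using only the Python standard library. The `difficulty` field name and the `parse_classification` helper are assumptions made for illustration; the project's actual schema and parser are not shown in this write-up.

```python
import json
import re
from typing import Optional

ALLOWED_LABELS = {"easy", "medium", "heavy"}


def parse_classification(raw: str) -> Optional[str]:
    """Return a valid difficulty label from a model reply, or None.

    Tolerates markdown fences and extra chatter around the JSON object.
    The "difficulty" key is a hypothetical field name.
    """
    # Grab everything from the first "{" to the last "}".
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    label = str(data.get("difficulty", "")).strip().lower()
    return label if label in ALLOWED_LABELS else None


if __name__ == "__main__":
    print(parse_classification('```json\n{"difficulty": "Easy"}\n```'))
    print(parse_classification("no json here"))
```

Returning `None` instead of raising lets the caller decide what to do with a bad reply, for example retrying or using a local heuristic.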

Accomplishments that we're proud of

We built an end-to-end router that makes sustainability visible and actionable, not a footnote. For easy tasks, the Fast lane often cuts estimated energy by roughly 60 to 85 percent compared with always choosing Deep, while maintaining suitable quality. The project runs locally with a single command and keeps working even if the external API hiccups, thanks to a fallback classifier.
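
A fallback of this kind could look something like the sketch below. The heuristic (word count plus a few keyword hints), the thresholds, and the `classify` wrapper are all hypothetical; the write-up does not describe how the real fallback classifier decides.

```python
from typing import Callable

ALLOWED_LABELS = {"easy", "medium", "heavy"}

# Hypothetical keywords that suggest a demanding task.
HEAVY_HINTS = ("prove", "architecture", "refactor", "multi-step", "analyze")


def fallback_classify(prompt: str) -> str:
    """Local heuristic used when the remote classifier is unavailable."""
    words = len(prompt.split())
    lowered = prompt.lower()
    if words > 200 or any(hint in lowered for hint in HEAVY_HINTS):
        return "heavy"
    if words > 40:
        return "medium"
    return "easy"


def classify(prompt: str, remote: Callable[[str], str]) -> str:
    """Try the remote classifier first; fall back on errors or bad labels."""
    try:
        label = remote(prompt)
    except Exception:
        # Network errors, timeouts, and parse failures all land here.
        return fallback_classify(prompt)
    return label if label in ALLOWED_LABELS else fallback_classify(prompt)


if __name__ == "__main__":
    def broken_remote(_: str) -> str:
        raise TimeoutError("simulated API outage")

    print(classify("What is 2 + 2?", broken_remote))
```

Passing the remote call in as a function keeps the sketch independent of any particular API client and makes the outage path easy to exercise in a demo.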

What we learned

Right-sizing models gives the biggest wins. Value per kilowatt-hour tells a story that nontechnical stakeholders understand immediately. Carbon intensity can vary a lot by region and time, so scheduling matters. Clear contracts and fallbacks are essential for a smooth live demo.
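
To illustrate why scheduling matters, here is a small sketch that picks the lowest-intensity time window from a forecast and compares emissions for the same job. The forecast values, the job energy, and the function names are made up for the example and do not come from any real grid data.

```python
def best_window(forecast: dict[str, float]) -> str:
    """Return the time window with the lowest carbon intensity (gCO2/kWh)."""
    return min(forecast, key=forecast.get)


def emissions_g(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Emissions in grams of CO2: energy multiplied by carbon intensity."""
    return energy_kwh * intensity_g_per_kwh


if __name__ == "__main__":
    # Hypothetical intensity forecast for one region.
    forecast = {"09:00": 450.0, "13:00": 210.0, "21:00": 380.0}
    job_energy_kwh = 0.002  # assumed energy for one deferred job

    window = best_window(forecast)
    worst = max(forecast.values())
    print("run at", window)
    print("emissions now vs. worst window:",
          emissions_g(job_energy_kwh, forecast[window]),
          "vs.",
          emissions_g(job_energy_kwh, worst))
```

The same job produces fewer grams of CO₂ simply by running in a cleaner window, which is the effect a carbon-aware scheduler would exploit.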

What’s next for EchoBrain AI Router

We plan to plug in live carbon signals for carbon-aware scheduling, replace estimates with measured energy from telemetry where possible, add a simple policy layer for organizational rules like carbon or budget caps, automate common efficiency steps such as quantization or distillation, and integrate with popular LLM gateways so teams can drop this into real workflows.