Joule: Charge AI for its electricity bill

Architecture diagram

Inspiration

Every LLM API call costs money and emits carbon, but most of those calls don't need the largest available model. A one-sentence email summary doesn't need a frontier-grade model — whether that's GPT-5.5, Claude Opus 4.7, Gemini, or anything else at the top of the leaderboard. Yet developers rarely think about this trade-off per call, because the routing logic is annoying to write — and the savings only land if the routing is automatic and provider-agnostic.

We wanted the lowest-friction way for any existing OpenAI client to start getting smart routing — and to see the carbon savings as they accumulate.

Joule's slogan — "Charge AI for its own electricity bill." — captures the core idea: the model that runs should be billed (in CO₂ and dollars) for what it actually consumed, and the user should see that bill in human terms.

What it does

Joule is a Carbon-aware AI Gateway. It exposes an OpenAI-compatible /v1/chat/completions endpoint on localhost:3001. Any existing OpenAI client integrates by changing one line — the base_url.
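
As a rough illustration of that one-line change, here is a minimal client sketch using plain fetch (Node 18+). The function names, the "auto" model name, and the placeholder bearer token are assumptions made for this example and are not taken from Joule's code; only the endpoint path and port come from the description above.

```typescript
// Hypothetical client sketch: only the base URL differs from a direct OpenAI call.
const JOULE_BASE_URL = "http://localhost:3001/v1"; // instead of "https://api.openai.com/v1"

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the request separately so it can be inspected without a running gateway.
function buildChatRequest(baseUrl: string, messages: ChatMessage[]) {
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: a local gateway accepts any bearer token.
        Authorization: "Bearer placeholder-key",
      },
      // Assumption: "auto" is an illustrative model name; the gateway picks the real model.
      body: JSON.stringify({ model: "auto", messages }),
    },
  };
}

// Send the request; this part needs the gateway listening on localhost:3001.
async function chat(prompt: string): Promise<string> {
  const { url, init } = buildChatRequest(JOULE_BASE_URL, [{ role: "user", content: prompt }]);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Gateway returned HTTP ${res.status}`);
  const data = (await res.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0].message.content;
}
```

A client built on the official OpenAI SDK would make the equivalent change by pointing its base URL option at the same address.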

For every request, Joule:

Classifies the user's intent via Nemotron Nano (~10ms): summarize, code, reasoning, etc.
Routes via the DecisionLayer to the smallest sufficient model — summarize → Nemotron Nano, everything else → Nemotron Super.
Calls Crusoe Managed Inference.
Measures carbon and cost defensively: if Crusoe sends an X-Carbon-grams response header, that value is used; otherwise Joule falls back to a static per-model lookup table. The source label is always preserved.
Writes the call log to SQLite (WAL mode) and returns an OpenAI-compatible response.
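
The routing and measurement steps above can be sketched roughly as follows. This is a simplified illustration, not Joule's actual DecisionLayer or carbon meter: the function names, the "other" intent, the per-token unit of the lookup table, and the numbers in it are all placeholders chosen for the example.

```typescript
type Intent = "summarize" | "code" | "reasoning" | "other";
type ModelId = "nano-30b-a3b" | "super-120b-a12b";

// Routing rule from the text: summarize goes to Nano, everything else to Super.
function route(intent: Intent): ModelId {
  return intent === "summarize" ? "nano-30b-a3b" : "super-120b-a12b";
}

// Placeholder values for illustration only; not Joule's real conversion table.
const STATIC_GRAMS_PER_1K_TOKENS: Record<ModelId, number> = {
  "nano-30b-a3b": 0.1,
  "super-120b-a12b": 0.4,
};

interface CarbonReading {
  grams: number;
  source: "header" | "static"; // always recorded, never dropped
}

// Minimal view of response headers; the standard Headers class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

function measureCarbon(headers: HeaderReader, model: ModelId, totalTokens: number): CarbonReading {
  const raw = headers.get("X-Carbon-grams");
  // Treat a missing or blank header as "not reported" rather than zero grams.
  const parsed = raw === null || raw.trim() === "" ? NaN : Number(raw);
  if (Number.isFinite(parsed) && parsed >= 0) {
    return { grams: parsed, source: "header" };
  }
  // Fallback: labeled best-effort estimate from the static table.
  const grams = (totalTokens / 1000) * STATIC_GRAMS_PER_1K_TOKENS[model];
  return { grams, source: "static" };
}
```

Keeping the source field on every reading is what lets a dashboard distinguish provider-reported values from estimates.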

The Dashboard (Next.js, localhost:3000) shows cumulative carbon, cost, and the Super/Nano mix updated in real time.

The Hermes Agent is a natural-language interface over the same call log. The user types "How much did we save this week?" or "What are the top 3 most expensive calls?". A 3-step agent loop runs: Planner (Super) decides which of 5 tools to call → Executor runs it (SQLite read) → Responder (Super) summarizes the result in English.

Crucially, Hermes's own LLM calls go through Joule's gateway, so the agent measures its own carbon footprint as it works — a small self-reference loop.
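
A compact sketch of the three-step loop, under stated assumptions: the two tools, the row shape, and the prompts are hypothetical stand-ins (the real agent has 5 tools reading SQLite), and the llm parameter represents a call that would go through Joule's own gateway.

```typescript
// One row of the call log, reduced to the fields this sketch needs (hypothetical shape).
interface CallRow {
  model: string;
  costUsd: number;
  carbonGrams: number;
}

// Stand-in for an LLM call; in Joule this would be a request to the gateway itself.
type Llm = (prompt: string) => Promise<string>;
type Tool = (rows: CallRow[]) => unknown;

// Hypothetical read-only tools over the call log.
const tools = new Map<string, Tool>([
  ["total_carbon", (rows: CallRow[]) => rows.reduce((sum, r) => sum + r.carbonGrams, 0)],
  ["top_expensive", (rows: CallRow[]) => rows.slice().sort((a, b) => b.costUsd - a.costUsd).slice(0, 3)],
]);

async function runAgent(question: string, rows: CallRow[], llm: Llm): Promise<string> {
  // 1. Planner: ask the model to name exactly one tool.
  const toolNames = Array.from(tools.keys()).join(", ");
  const plan = await llm(
    `Pick one tool from [${toolNames}] for: ${question}. Reply with the tool name only.`
  );
  const tool = tools.get(plan.trim());
  if (!tool) throw new Error(`Planner chose unknown tool: ${plan}`);

  // 2. Executor: run the chosen tool against the call log.
  const result = tool(rows);

  // 3. Responder: turn the raw tool result into an English answer.
  return llm(
    `Question: ${question}\nTool result: ${JSON.stringify(result)}\nAnswer in one sentence.`
  );
}
```

Because llm is injected, the same loop can back both a CLI and a chat UI, and every planner or responder call shows up in the call log like any other request.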

How we built it

The project is a TypeScript monorepo with three independent processes sharing a SQLite database:

Joule core (Hono on Node 20) — Gateway, Routing, Inference adapter, Carbon meter, Storage.
Dashboard (Next.js 14 app router + recharts) — reads joule.db directly via better-sqlite3.
Hermes Agent — CLI binary and dashboard chat UI, both backed by the same agent loop.

We followed test-driven development throughout: 48 unit tests across 9 files, plus 6 live verify-shot bash scripts that exercise each demo cut end-to-end against real Crusoe Nemotron calls.

Specific technical decisions:

Model IDs (nano-30b-a3b, super-120b-a12b) are Joule-internal; the Real adapter translates them to Crusoe's catalog IDs only at the HTTP boundary.
The intent classifier combines a fast keyword pre-filter with an LLM fallback. The pre-filter catches obvious cases for demo reliability; the LLM handles ambiguous ones.
The Hermes Responder runs on Super (not Nano), because Nano occasionally returned empty content on larger tool JSON inputs during testing. The trade-off (an extra 3 to 5 s per chat) is worth it for demo correctness.
Carbon measurement is Defensive: we don't pretend to know the carbon if Crusoe doesn't tell us, but we always have a labeled best-effort estimate. The label (source: "static" vs "header") is visible in the dashboard.
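
Two of the decisions above, the keyword pre-filter with LLM fallback and the ID translation at the HTTP boundary, might look something like this. The keyword patterns, the "other" intent, the function names, and the catalog ID strings are all placeholders for illustration; the text does not give Joule's real patterns or Crusoe's real catalog IDs.

```typescript
type Intent = "summarize" | "code" | "reasoning" | "other";
const INTENTS: Intent[] = ["summarize", "code", "reasoning", "other"];

// Illustrative keyword rules; the real pre-filter's patterns are not given in the text.
const KEYWORD_RULES: [RegExp, Intent][] = [
  [/\b(summari[sz]e|summary|tl;dr)\b/i, "summarize"],
  [/\b(function|bug|refactor|stack trace)\b/i, "code"],
];

// Fast path: returns an intent only when a keyword rule matches.
function keywordPrefilter(prompt: string): Intent | null {
  for (const [pattern, intent] of KEYWORD_RULES) {
    if (pattern.test(prompt)) return intent;
  }
  return null;
}

// llmClassify is a stand-in for the Nemotron Nano classification call.
async function classifyIntent(
  prompt: string,
  llmClassify: (prompt: string) => Promise<string>
): Promise<Intent> {
  const fast = keywordPrefilter(prompt);
  if (fast !== null) return fast;
  const label = (await llmClassify(prompt)).trim().toLowerCase();
  // Guard against free-form model output by falling back to "other".
  return INTENTS.includes(label as Intent) ? (label as Intent) : "other";
}

// Translation applied only when building the outbound HTTP request.
// The right-hand values are placeholders, not real Crusoe catalog IDs.
const CRUSOE_CATALOG_IDS: Record<string, string> = {
  "nano-30b-a3b": "<crusoe-catalog-id-for-nano>",
  "super-120b-a12b": "<crusoe-catalog-id-for-super>",
};

function toCrusoeModelId(internalId: string): string {
  const catalogId = CRUSOE_CATALOG_IDS[internalId];
  if (catalogId === undefined) throw new Error(`No Crusoe mapping for model: ${internalId}`);
  return catalogId;
}
```

Failing loudly on an unmapped ID surfaces a mismatch at startup or in tests instead of as a 404 from the provider.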

Challenges we ran into

Crusoe model ID mismatch. Our internal IDs didn't match Crusoe's catalog. Initial live calls returned 404. Fixed at the HTTP boundary with a translation map in the Real adapter.
Hermes Responder instability on Nano. Larger JSON tool results sometimes produced empty content from the Nano model. Switched to Super for the Responder step.
Next.js bundling better-sqlite3. Native module + Next.js webpack didn't get along — __filename was rewritten to undefined and crashed on the bindings package. Resolved via next.config.mjs webpack externals + inlining the schema SQL.
Windows process management. pkill doesn't exist on Windows, so we replaced it with taskkill /F /IM node.exe. In PowerShell, bash resolved to a WSL bash that wasn't installed, so we added a profile alias pointing to Git Bash.
Demo video text legibility. The first version of the verify-shot scripts only printed JSON. We added an English narrative block (cut title + step-level OK lines + PASS summary) so anyone watching the recorded terminal can follow what each shot proves.

Accomplishments that we're proud of

A real OpenAI-compatible gateway integrated in one line of client code.
AutoModelSelection works against real Crusoe Nemotron in production — not a mock.
The Defensive carbon measurement label (source: static | header) is a small idea but, we think, worth standardizing across carbon-aware tooling.
Hermes routing its own LLM calls through Joule is the agent equivalent of dogfooding — the carbon meter applies to its own decisions.
Six days, solo, with TDD. 48 unit tests + 6 live verify-shot scripts, all green.

What we learned

Defensive carbon measurement (label the source, never guess silently) is a better contract than trying to estimate everything.
A small keyword pre-filter + LLM fallback is a more reliable production pattern than pure LLM intent classification.
The biggest demo risk in a 6-day hackathon is "the day-of recording" — pre-built verify scripts with self-explanatory output saved at least an hour.
Self-reference (agent measures its own carbon) is a useful framing for any carbon-aware infrastructure layer.

What's next for Joule

Time-of-day routing — queue non-urgent calls into low-carbon hours of the grid.
Multi-region carbon — pick the lowest-carbon Crusoe region per call.
Streaming responses — we currently proxy non-streaming completions only.
Hermes autonomous cron — weekly report sent Sunday 9am via Gmail SMTP (today: manual trigger).
Standardize X-Carbon-grams — publish a small RFC for the response header so other gateways can adopt it.
Open-source the conversion table — build a community-maintained per-model carbon estimate dataset.

Built With

ai-agent
better-sqlite3
crusoe
hackathon
hono
llm
nemotron
next.js
node.js
openai
python
react
recharts
sqlite
typescript
undici
vitest

Updates

Jamie Tyra started this project — May 26, 2026 03:26 AM EDT
