Inspiration

Developers using AI APIs keep running into the same problem: it is hard to tell how much you're actually spending. Cache hits vs. misses, key rotation inefficiencies, and model pricing differences all disappear into a black box.

We experienced this firsthand. Running Claude Opus 4.6 through an API proxy with 5,000+ keys, we discovered that a request that missed the cache cost us about 12x as much as one that hit it ($1.09 vs. $0.09). But we could not find a tool that visualized this or suggested fixes.

TokenWise was born from this pain.

What it does

TokenWise is an open-source CLI + Web Dashboard that gives developers full visibility into their AI API costs:

  • Cost Breakdown: See exactly where your money goes: input tokens, output tokens, cache creation vs. cache hits
  • Cache Intelligence: Monitor cache hit rates in real time and get alerts when they drop below a threshold you set
  • Model Comparison: Compare cost-per-task across Claude, GPT, and Gemini using your actual usage data
  • Optimization Engine: Actionable suggestions like "Switch to Sonnet for this workflow to save 60%" or "Enable sticky sessions to boost cache hits from 2% to 90%"
  • Budget Alerts: Set spending limits and get notified before you blow through them
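
As a rough illustration of the Cost Breakdown and Cache Intelligence features above, here is a minimal sketch rather than TokenWise's actual code. The `PRICES` table, the field names (`inputTokens`, `cacheReadTokens`, and so on), and the definition of hit rate as "cache-read tokens divided by all cacheable tokens" are assumptions made for this example.

```javascript
// Hypothetical prices in USD per million tokens; real prices vary by model and provider.
const PRICES = { input: 5.0, output: 25.0, cacheWrite: 6.25, cacheRead: 0.5 };

// Split one request's cost into the four categories: input, output,
// cache creation (write), and cache hits (read).
function costBreakdown(usage, prices = PRICES) {
  const cost = (tokens, pricePerMillion) => (tokens * pricePerMillion) / 1e6;
  const parts = {
    input: cost(usage.inputTokens, prices.input),
    output: cost(usage.outputTokens, prices.output),
    cacheWrite: cost(usage.cacheWriteTokens, prices.cacheWrite),
    cacheRead: cost(usage.cacheReadTokens, prices.cacheRead),
  };
  const total = parts.input + parts.output + parts.cacheWrite + parts.cacheRead;
  return { ...parts, total };
}

// Assumed definition: share of cacheable tokens that were served from cache.
function cacheHitRate(requests) {
  let read = 0;
  let written = 0;
  for (const r of requests) {
    read += r.cacheReadTokens;
    written += r.cacheWriteTokens;
  }
  const cacheable = read + written;
  return cacheable === 0 ? 0 : read / cacheable;
}

// Flag the batch when the hit rate falls below the configured threshold.
function checkCacheAlert(requests, threshold = 0.5) {
  const rate = cacheHitRate(requests);
  return { rate, alert: rate < threshold };
}
```

A budget alert could follow the same pattern: sum `total` over a period and compare it against a spending limit.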

How we built it

  • CLI Core: Node.js — parses API usage logs from OpenAI, Anthropic, and custom providers
  • Web Dashboard: Single-page HTML + Chart.js — clean, responsive cost visualization
  • Analysis Engine: Pattern detection for cache optimization and model routing suggestions
  • Export: JSON/CSV reports for team sharing
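
To make the export step concrete, the sketch below aggregates per-request cost records by model and renders them as CSV. It is an illustration under assumed names (`summarizeByModel`, `toCsv`, and the `model` / `costUsd` fields are hypothetical), not the project's real exporter.

```javascript
// Aggregate request records into one row per model.
function summarizeByModel(records) {
  const totals = new Map();
  for (const { model, costUsd } of records) {
    const row = totals.get(model) ?? { model, requests: 0, costUsd: 0 };
    row.requests += 1;
    row.costUsd += costUsd;
    totals.set(model, row);
  }
  return [...totals.values()];
}

// Quote a field only when it contains a comma, quote, or line break.
function escapeCsv(value) {
  const s = String(value);
  return /[",\n\r]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Render rows as CSV text with a header line built from the column names.
function toCsv(rows, columns) {
  const header = columns.map(escapeCsv).join(",");
  const lines = rows.map((row) => columns.map((c) => escapeCsv(row[c] ?? "")).join(","));
  return [header, ...lines].join("\n");
}
```

The JSON report needs no extra code: `JSON.stringify(summarizeByModel(records), null, 2)` produces a shareable file body.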

Challenges we ran into

  • Different API providers use completely different billing formats — normalizing them into a unified schema was the biggest engineering challenge
  • Cache behavior varies wildly between direct API calls and proxy services
  • Striking a balance between showing raw data and surfacing actionable insights
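
The normalization challenge in the first bullet can be sketched roughly like this. The unified shape is hypothetical, and the provider field names reflect our understanding of the OpenAI and Anthropic usage objects at the time of writing, so check them against the current API references before relying on them. One subtlety the sketch assumes: OpenAI's `prompt_tokens` includes cached tokens, while Anthropic's `input_tokens` excludes them.

```javascript
// Map an OpenAI-style usage object to the unified (hypothetical) schema.
function fromOpenAI(usage) {
  const cached = usage.prompt_tokens_details?.cached_tokens ?? 0;
  return {
    inputTokens: usage.prompt_tokens - cached, // uncached prompt tokens only
    outputTokens: usage.completion_tokens,
    cacheWriteTokens: 0, // assumed: cache writes are not reported separately
    cacheReadTokens: cached,
  };
}

// Map an Anthropic-style usage object to the same schema.
function fromAnthropic(usage) {
  return {
    inputTokens: usage.input_tokens,
    outputTokens: usage.output_tokens,
    cacheWriteTokens: usage.cache_creation_input_tokens ?? 0,
    cacheReadTokens: usage.cache_read_input_tokens ?? 0,
  };
}

// Dispatch on a provider label; unknown providers fail loudly instead of guessing.
function normalizeUsage(provider, usage) {
  switch (provider) {
    case "openai":
      return fromOpenAI(usage);
    case "anthropic":
      return fromAnthropic(usage);
    default:
      throw new Error(`Unsupported provider: ${provider}`);
  }
}
```

Once every record has the same four counters, the rest of the pipeline (pricing, hit rates, reports) no longer needs to know which provider produced it.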

Accomplishments that we're proud of

  • The cache optimization suggestion alone saved us $30+/day on our own setup
  • Clean single-page web dashboard whose only dependency is Chart.js, so it runs anywhere
  • Works with any OpenAI-compatible API proxy (NewAPI, OneAPI, etc.)

What we learned

  • AI API pricing is far more complex than "input + output tokens" — caching, batching, and key management create hidden cost multipliers
  • Poor cache utilization can quietly inflate costs; based on what we saw in our own setup, we suspect many developers are overpaying by 3-10x
  • Open-source cost tooling is a massive gap in the AI developer ecosystem
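
To show how cache utilization turns into a cost multiplier, here is a small worked sketch. It assumes a simplified model in which every request is either a full cache hit or a full miss, and it reuses the per-request figures quoted earlier ($0.09 for a hit, $1.09 for a miss); the function names are ours.

```javascript
// Expected cost of one request, given the fraction of requests that hit the cache.
function expectedCostPerRequest(hitRate, hitCost, missCost) {
  if (hitRate < 0 || hitRate > 1) {
    throw new RangeError("hitRate must be between 0 and 1");
  }
  return hitRate * hitCost + (1 - hitRate) * missCost;
}

// How many times more the current hit rate costs than a target hit rate.
function costMultiplier(currentRate, targetRate, hitCost, missCost) {
  return (
    expectedCostPerRequest(currentRate, hitCost, missCost) /
    expectedCostPerRequest(targetRate, hitCost, missCost)
  );
}

const multiplier = costMultiplier(0.02, 0.9, 0.09, 1.09);
console.log(multiplier.toFixed(1));
```

Under these assumptions a 2% hit rate averages about $1.07 per request and a 90% hit rate about $0.19, a multiplier of roughly 5.6x.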

What's next for TokenWise

  • Real-time streaming dashboard (WebSocket)
  • Team/org cost allocation
  • Auto-router: automatically select the cheapest model that meets quality thresholds
  • Plugin for popular AI frameworks (LangChain, OpenClaw, etc.)
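
The auto-router idea could start as simply as the sketch below. This is a planning illustration only: the `quality` scores and `costPerTask` values are hypothetical inputs that would have to come from real evaluations and usage data, and `pickCheapestModel` is a name invented for this example.

```javascript
// Return the cheapest candidate whose quality score meets the threshold,
// or null when no candidate qualifies.
function pickCheapestModel(candidates, minQuality) {
  const eligible = candidates.filter((m) => m.quality >= minQuality);
  if (eligible.length === 0) return null;
  return eligible.reduce((best, m) => (m.costPerTask < best.costPerTask ? m : best));
}

// Made-up candidates for illustration; not measurements of any real model.
const candidates = [
  { name: "large", quality: 0.95, costPerTask: 0.4 },
  { name: "medium", quality: 0.88, costPerTask: 0.12 },
  { name: "small", quality: 0.7, costPerTask: 0.02 },
];

const choice = pickCheapestModel(candidates, 0.85);
console.log(choice ? choice.name : "no model meets the threshold");
```

Returning `null` rather than silently falling back leaves the caller to decide what to do when nothing meets the bar.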
