Inspiration
Every developer using AI APIs faces the same problem: you have no idea how much you're actually spending. Cache hits vs. misses, key rotation inefficiencies, model pricing differences — it's a black box.
We experienced this firsthand. Running Claude Opus 4.6 through an API proxy with 5,000+ keys, we discovered that a request that missed the cache cost us about 12x more than one that hit it ($1.09 vs. $0.09). We could not find a tool that visualized this or suggested fixes.
TokenWise was born from this pain.
What it does
TokenWise is an open-source CLI + Web Dashboard that gives developers full visibility into their AI API costs:
- Cost Breakdown: See exactly where your money goes across input tokens, output tokens, cache creation, and cache hits
- Cache Intelligence: Monitor cache hit rates in real time and get alerts when they drop below a threshold
- Model Comparison: Compare cost per task across Claude, GPT, and Gemini using your actual usage data
- Optimization Engine: Actionable suggestions like "Switch to Sonnet for this workflow to save 60%" or "Enable sticky sessions to boost cache hits from 2% to 90%"
- Budget Alerts: Set spending limits and get notified before you blow through them
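To make the cost breakdown and cache hit rate concrete, here is a minimal sketch of how they could be computed from per-request usage data. It is not TokenWise's actual code. The field names follow the `usage` object returned by the Anthropic Messages API; the function names and the prices in `PRICES` are placeholders we made up for illustration, so substitute your provider's real rates.

```javascript
// Hypothetical per-million-token prices in USD. These are placeholders,
// not real rates for any model.
const PRICES = {
  input: 5.0,
  output: 25.0,
  cacheWrite: 6.25,
  cacheRead: 0.5,
};

// Split one request's usage into cost components (USD).
// Missing fields are treated as zero tokens.
function costBreakdown(usage, prices = PRICES) {
  const cost = (tokens, pricePerMillion) =>
    ((tokens || 0) * pricePerMillion) / 1_000_000;
  const parts = {
    input: cost(usage.input_tokens, prices.input),
    output: cost(usage.output_tokens, prices.output),
    cacheWrite: cost(usage.cache_creation_input_tokens, prices.cacheWrite),
    cacheRead: cost(usage.cache_read_input_tokens, prices.cacheRead),
  };
  parts.total = parts.input + parts.output + parts.cacheWrite + parts.cacheRead;
  return parts;
}

// Fraction of all prompt tokens that were served from cache
// across a list of usage objects. Returns 0 for an empty list.
function cacheHitRate(usages) {
  let read = 0;
  let total = 0;
  for (const u of usages) {
    const r = u.cache_read_input_tokens || 0;
    read += r;
    total += r + (u.cache_creation_input_tokens || 0) + (u.input_tokens || 0);
  }
  return total === 0 ? 0 : read / total;
}
```

Comparing `costBreakdown` for the same prompt with and without a cache hit shows how much of the bill comes from cache writes rather than from output tokens.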
How we built it
- CLI Core: Node.js — parses API usage logs from OpenAI, Anthropic, and custom providers
- Web Dashboard: Single-page HTML + Chart.js — clean, responsive cost visualization
- Analysis Engine: Pattern detection for cache optimization, model routing suggestions
- Export: JSON/CSV reports for team sharing
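As one illustration of the kind of pattern the analysis engine looks for, the sketch below flags prompt prefixes whose requests are spread across many API keys. It assumes the provider scopes its prompt cache to the account behind each key, so rotating keys between requests with the same prefix forces repeated cache writes. The record shape (`keyId`, `prefixHash`) and the function name are hypothetical and not taken from the TokenWise codebase.

```javascript
// Flag prompt prefixes whose requests were scattered across more than
// `maxKeys` distinct API keys. Each record is assumed (hypothetically) to
// carry the key used and a hash of the cacheable prompt prefix.
function findScatteredPrefixes(records, maxKeys = 1) {
  const keysByPrefix = new Map();
  for (const { keyId, prefixHash } of records) {
    if (!keysByPrefix.has(prefixHash)) {
      keysByPrefix.set(prefixHash, new Set());
    }
    keysByPrefix.get(prefixHash).add(keyId);
  }

  const findings = [];
  for (const [prefixHash, keys] of keysByPrefix) {
    if (keys.size > maxKeys) {
      findings.push({
        prefixHash,
        distinctKeys: keys.size,
        suggestion:
          "Pin requests that share this prefix to one key (sticky sessions)",
      });
    }
  }
  // Worst offenders first.
  return findings.sort((a, b) => b.distinctKeys - a.distinctKeys);
}
```

A real detector would probably also weigh request volume and prefix length, since scattering a short, rarely used prefix costs far less than scattering a long system prompt.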
Challenges we ran into
- Different API providers use completely different billing formats — normalizing them into a unified schema was the biggest engineering challenge
- Cache behavior varies wildly between direct API calls and proxy services
- Striking a balance between showing raw data and surfacing actionable insights
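The normalization challenge can be sketched as follows. The input field names match the usage objects of the Anthropic Messages API and the OpenAI Chat Completions API; the unified output schema (`uncachedInput`, `cacheWrite`, `cacheRead`, `output`) and the function name are our own illustrative assumptions, not necessarily what TokenWise uses internally. One subtlety worth showing: OpenAI's `prompt_tokens` includes cached tokens, while Anthropic's `input_tokens` does not.

```javascript
// Map a provider-specific usage object onto one hypothetical unified schema.
function normalizeUsage(provider, usage) {
  switch (provider) {
    case "anthropic":
      // Anthropic reports uncached input, cache writes, and cache reads
      // as separate counters.
      return {
        uncachedInput: usage.input_tokens ?? 0,
        cacheWrite: usage.cache_creation_input_tokens ?? 0,
        cacheRead: usage.cache_read_input_tokens ?? 0,
        output: usage.output_tokens ?? 0,
      };
    case "openai": {
      // OpenAI's prompt_tokens already includes cached tokens,
      // so subtract them to get the uncached portion.
      const cached = usage.prompt_tokens_details?.cached_tokens ?? 0;
      return {
        uncachedInput: (usage.prompt_tokens ?? 0) - cached,
        cacheWrite: 0, // this usage shape has no separate cache-write count
        cacheRead: cached,
        output: usage.completion_tokens ?? 0,
      };
    }
    default:
      throw new Error(`Unsupported provider: ${provider}`);
  }
}
```

Once every record is in this shape, the same cost and hit-rate calculations can run regardless of which provider or proxy produced the log.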
Accomplishments that we're proud of
- The cache optimization suggestion alone saved us $30+/day on our own setup
- Clean web dashboard with no build step: a single HTML page whose only dependency is Chart.js, so it runs anywhere
- Works with any OpenAI-compatible API proxy (NewAPI, OneAPI, etc.)
What we learned
- AI API pricing is far more complex than "input + output tokens" — caching, batching, and key management create hidden cost multipliers
- We suspect many developers are overpaying by 3-10x due to poor cache utilization; on our own setup, a cache miss cost about 12x more per request than a hit
- Open-source cost tooling is a massive gap in the AI developer ecosystem
What's next for TokenWise
- Real-time streaming dashboard (WebSocket)
- Team/org cost allocation
- Auto-router: automatically select the cheapest model that meets quality thresholds
- Plugin for popular AI frameworks (LangChain, OpenClaw, etc.)
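The auto-router is not built yet, but the selection rule it describes could look roughly like this sketch. The candidate shape (`name`, `costPerTask`, `quality`) and the function name are hypothetical, and how a quality score would actually be measured is left open.

```javascript
// Pick the cheapest candidate whose quality score meets the threshold.
// Returns null when no candidate qualifies, so the caller can fall back
// to a default model or surface an error.
function pickCheapestModel(candidates, minQuality) {
  const eligible = candidates.filter((m) => m.quality >= minQuality);
  if (eligible.length === 0) return null;
  return eligible.reduce((best, m) =>
    m.costPerTask < best.costPerTask ? m : best
  );
}

// Example with made-up models and scores:
const choice = pickCheapestModel(
  [
    { name: "large", costPerTask: 0.9, quality: 0.95 },
    { name: "medium", costPerTask: 0.3, quality: 0.88 },
    { name: "small", costPerTask: 0.05, quality: 0.7 },
  ],
  0.85
);
console.log(choice ? choice.name : "no model meets the threshold");
```

Returning `null` rather than silently picking the best available model keeps the quality threshold a hard constraint, which seems like the safer default for a router that acts automatically.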