Inspiration
Building with AI APIs has never been easier thanks to AI-powered coding tools like Claude Code, Cursor, and Windsurf. More developers than ever are spinning up agents and integrating LLMs into their projects. But with that ease of use comes a hidden problem: runaway costs. When your autonomous agent is making hundreds of API calls in the background, it's shockingly easy to burn through your budget without realizing it. We built TokenGauge to provide a simple, privacy-respecting way to keep tabs on what your AI integrations are actually costing before the invoice arrives.
What it does
TokenGauge tracks key API metrics across multiple providers (OpenAI, Anthropic, and Google Gemini) through a lightweight SDK that wraps your existing client. It gives small dev teams and individual developers a unified dashboard to monitor API key performance without juggling separate provider consoles. Beyond visibility, TokenGauge lets you set usage limits for your API keys and sends SMS notifications when your agents are burning through tokens at an unusual rate, so you can intervene before a rogue loop drains your credits. All of this works without ever sending your API keys or prompt content to our servers; only metadata like token counts, latency, and model name is logged.
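To illustrate the metadata-only design, here is a minimal sketch of what a wrapping SDK might look like. The class and method names (`GaugedClient`, `complete`, the `usage` response shape) are hypothetical stand-ins, not TokenGauge's actual API:

```python
import time

class GaugedClient:
    """Wraps an LLM client and records only metadata: model name,
    token counts, and latency. Prompt content and API keys are never logged."""

    def __init__(self, client, sink):
        self._client = client
        self._sink = sink  # a plain list stands in for the metrics backend

    def complete(self, model, prompt):
        start = time.monotonic()
        response = self._client.complete(model=model, prompt=prompt)
        self._sink.append({
            "model": model,
            "input_tokens": response["usage"]["input_tokens"],
            "output_tokens": response["usage"]["output_tokens"],
            "latency_ms": round((time.monotonic() - start) * 1000),
        })
        return response

# Stub provider client, purely for illustration.
class StubClient:
    def complete(self, model, prompt):
        return {"text": "ok", "usage": {"input_tokens": 12, "output_tokens": 5}}

log = []
client = GaugedClient(StubClient(), log)
client.complete("gpt-4o-mini", "hello")
```

The key design point is that the wrapper reads the provider's usage object off the response rather than inspecting the prompt, which is what keeps sensitive content out of the logging path.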
How we built it
We built TokenGauge with a Python backend using FastAPI, connected to MongoDB Atlas for storing API usage data. The frontend is a React dashboard for data visualization. Originally we planned to build a gateway proxy, but some providers lacked proper API endpoints for usage metrics, so we instead built our own SDK dependency from scratch to interface with them directly. We also attempted to add a Redis caching layer to reduce redundant polling and SMS notifications to alert users to unusual or extreme token usage.
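The caching idea above can be sketched as a simple cache-aside pattern. Here a dict with timestamps stands in for Redis, and `poll_usage` is a hypothetical placeholder for a provider usage call; the real layer would use Redis TTLs instead:

```python
import time

class TTLCache:
    """Cache-aside sketch: entries expire after `ttl` seconds, so repeated
    dashboard refreshes within the window skip the upstream provider call."""

    def __init__(self, ttl=60):
        self.ttl = ttl
        self._store = {}  # key -> (value, fetched_at)

    def get_or_fetch(self, key, fetch):
        entry = self._store.get(key)
        now = time.monotonic()
        if entry and now - entry[1] < self.ttl:
            return entry[0]  # fresh hit: no redundant poll
        value = fetch()
        self._store[key] = (value, now)
        return value

calls = 0
def poll_usage():  # stand-in for an expensive provider usage query
    global calls
    calls += 1
    return {"tokens": 1234}

cache = TTLCache(ttl=60)
first = cache.get_or_fetch("openai:key1", poll_usage)
second = cache.get_or_fetch("openai:key1", poll_usage)  # served from cache
```

With Redis the same shape falls out of `SETEX`/`GET`; the point is that per-key usage data changes slowly enough that a short TTL cuts most of the redundant polling.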
Challenges we ran into
Since several providers didn't expose proper API endpoints for usage metrics, we built our own SDK dependency to interface with them directly. We also faced challenges implementing SMS notifications for high token usage alerts and optimizing our Redis caching layer to minimize redundant API calls.
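One way to decide when a high-usage SMS alert should fire is a sliding-window burn-rate check. This is a simplified sketch under assumed thresholds, not our production alerting logic:

```python
from collections import deque

class BurnRateMonitor:
    """Sums tokens over a sliding time window; returns True when the
    burn rate exceeds the limit, i.e. when an alert should be sent."""

    def __init__(self, window_s=60, limit=10_000):
        self.window_s = window_s
        self.limit = limit
        self._events = deque()  # (timestamp_s, tokens)

    def record(self, ts, tokens):
        self._events.append((ts, tokens))
        # Drop events that have aged out of the window.
        while self._events and ts - self._events[0][0] > self.window_s:
            self._events.popleft()
        burned = sum(t for _, t in self._events)
        return burned > self.limit  # True -> trigger an SMS notification

mon = BurnRateMonitor(window_s=60, limit=10_000)
mon.record(0, 4_000)    # under the limit, no alert
mon.record(30, 7_000)   # 11,000 tokens inside one minute -> alert
```

A monitor like this sits naturally downstream of the SDK's metadata events, with the actual SMS delivery handed off to a provider such as Twilio.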
Accomplishments that we're proud of
We successfully built and integrated our first SDK dependency from scratch. We're also proud of the robustness of our test coverage and the data visualization dashboard, which makes the various API metrics intuitive and easy to digest at a glance.
What we learned
How to build and publish an SDK dependency from scratch, how to work around missing API endpoints creatively, the challenges of real-time data polling at scale, and how to structure a multi-provider monitoring architecture.
What's next for Token Gauge
We plan to get Twilio approved to enable SMS notifications for usage alerts. We're also exploring training ML models, including deep forest and neural network architectures, to classify requests and estimate costs. In the longer term, our goal is to build a cost optimization engine that can auto-route requests for maximum efficiency in cost and token usage.