Inspiration

When developers use AI agents like Claude Code, they are "blind" to the payload being sent to the API. Each new prompt quietly includes every prior interaction, tool output, intermediate step, and piece of repeated context. Over time, this accumulated context fills with stale data and noise, which drags down efficiency, degrades response quality, increases hallucinations, and steadily pushes requests toward hard context limits. Today, developers typically only notice the problem after spending hours chasing a bug or receiving a massive API bill.

What it does

Autonomy acts as a local proxy that intercepts API calls and gives the developer full control to manage their context.

Full Control: Edit or delete any prompt or agent response within a conversation. An "Ask-for-Permission" mode holds each request at the proxy until you approve or edit it, ensuring you never send more tokens than necessary.
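A minimal sketch of the hold-then-release shape behind "Ask-for-Permission". All class and method names here are illustrative, not Autonomy's actual API: the point is that the proxy parks a request instead of forwarding it, and only an explicit approval (possibly with edits) lets it continue upstream.

```python
import queue

class PermissionGate:
    """Holds intercepted requests until the developer approves or edits them.
    Illustrative sketch; names are hypothetical."""

    def __init__(self):
        self._pending = {}            # request_id -> payload awaiting review
        self._released = queue.Queue()

    def intercept(self, request_id, payload):
        # Park the request instead of forwarding it upstream.
        self._pending[request_id] = payload

    def approve(self, request_id, edited_payload=None):
        # Release the request, optionally with the developer's edits applied.
        payload = self._pending.pop(request_id)
        self._released.put(edited_payload if edited_payload is not None else payload)

    def next_outbound(self):
        # The forwarding loop pulls approved payloads from here.
        return self._released.get_nowait()

gate = PermissionGate()
gate.intercept("req-1", {"messages": ["long stale context", "real question"]})
gate.approve("req-1", {"messages": ["real question"]})  # trimmed before sending
out = gate.next_outbound()
```

Nothing reaches the upstream API until `approve` runs, which is what guarantees you never send more tokens than you meant to.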

Interactive Bar Graph: Every message, tool call, and tool output is represented as a bar: height encodes the token count, and color identifies the section type.
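The chart data reduces to something like the sketch below. The chars-divided-by-4 token estimate is a common rough heuristic standing in for a real tokenizer, and the field names are illustrative, not Autonomy's actual schema.

```python
def build_bars(conversation):
    """Turn a conversation payload into bar-chart data.
    Token counts use a rough chars/4 heuristic in place of a real tokenizer."""
    bars = []
    for item in conversation:
        bars.append({
            "type": item["role"],                          # colors the bar
            "tokens": max(1, len(item["content"]) // 4),   # bar height
        })
    return bars

convo = [
    {"role": "user", "content": "Fix the login bug in auth.py"},
    {"role": "tool", "content": "traceback ..." * 50},   # bulky tool output
    {"role": "assistant", "content": "The bug is a missing await."},
]
bars = build_bars(convo)
```

Even this toy payload makes the visual point: the tool-output bar towers over the actual question, which is exactly the waste the graph is meant to expose.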

Trim & Delete: Users can click or shift-click bars to mark them for deletion, instantly seeing the estimated dollars saved on the live cost meter.
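The live cost meter is simple arithmetic over the marked bars. The price constant below is illustrative (real per-token rates vary by model and provider), and the function name is hypothetical.

```python
PRICE_PER_MTOK = 3.00  # illustrative input price in $ per million tokens

def estimated_savings(bars, marked_indices):
    """Dollars saved if the marked bars are deleted before the next request."""
    trimmed_tokens = sum(bars[i]["tokens"] for i in marked_indices)
    return trimmed_tokens * PRICE_PER_MTOK / 1_000_000

bars = [{"tokens": 120_000}, {"tokens": 4_000}, {"tokens": 90_000}]
savings = estimated_savings(bars, {0, 2})  # user shift-clicked bars 0 and 2
```

Because agent loops resend the whole context on every turn, the real saving compounds: each trimmed token is saved on every subsequent request, not just once.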

Gemma-Powered Intelligence: A local Gemma 4 E4B model runs in the background to analyze the payload for redundancies, flagging wasteful sections with red badges and suggesting specific text removals within a built-in text editor.

How we built it

Frontend:

  • React 19 + TypeScript, built with Vite 8

VS Code Extension:

  • TypeScript
  • WebSockets

Backend:

  • FastAPI
  • Backboard.io
  • Ollama + Gemma 4

Challenges we ran into

Latency vs. Intelligence: We had to develop a Three-Layer Rendering Model to ensure the UI felt instant. The chart renders immediately from JSON data, while the heavier Gemma-powered analysis "lights up" the bars a few seconds later.
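The layered idea can be sketched as a generator that emits frames, collapsed here to two phases for brevity. Everything below is a simplified illustration (the chars/4 token estimate and the `analyze` callback stand in for the real tokenizer and the Gemma analysis):

```python
def render_pipeline(payload, analyze):
    """Yield successive UI frames: an instant chart first, then the same
    chart annotated once the slower model analysis arrives."""
    bars = [{"tokens": len(m["content"]) // 4, "flag": None} for m in payload]
    yield [dict(b) for b in bars]   # Layer 1: paints immediately from JSON
    for i in analyze(payload):      # Layer 2+: slow analysis "lights up" bars
        bars[i]["flag"] = "redundant"
    yield bars

payload = [{"content": "short"}, {"content": "x" * 400}]
frames = list(render_pipeline(payload, lambda p: [1]))  # analysis flags bar 1
```

The key design choice is that the first frame never waits on the model: the user always sees the chart instantly, and the red badges arrive whenever they arrive.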

Streaming Stability: Ensuring that Server-Sent Events (SSE) passed through our proxy without buffering was critical to keeping Claude Code's UI responsive.
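The pass-through shape is essentially a generator that yields each upstream chunk the moment it arrives; in a FastAPI backend such a generator would typically back a `StreamingResponse`. This is a minimal sketch with fake chunks, not the production relay:

```python
def relay_sse(upstream_chunks):
    """Forward SSE chunks downstream as they arrive, without accumulating them.
    `upstream_chunks` stands in for the streamed upstream response body."""
    for chunk in upstream_chunks:
        # Yield immediately: collecting chunks into a buffer here would
        # freeze the client's UI until the whole response finished.
        yield chunk

fake_upstream = iter([b"event: message_start\n\n", b"data: {}\n\n"])
received = list(relay_sse(fake_upstream))
```

The subtle failure mode is invisible buffering in middleware or response handling, which silently turns a stream into one big delayed payload.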

Bottlenecked by Gemma: Gemma simply ran too slowly for fine-grained, per-section analysis, so we had to get creative in how we used it.

Testing our work: This was the first time any of us had built a VS Code extension, so getting everything running and working together was a real learning curve.

Accomplishments that we're proud of

We are proud that we built an MVP of a tool we can actually use. All of us started using Autonomy during the building process itself, because it is genuinely useful and offers a unique solution to the problem.

What we learned

We learned a great deal about coding agents and how they work, including the limitations of a proxy service like this and how far we could push its boundaries to do something really cool. We also learned to look at problems from a new angle and to take a much more creative approach to solving them, given the theme.

What's next for Autonomy

Working with all coding agents / LLMs: Becoming compatible with every type of coding agent.

Word Suggestions: Using an NLP model to suggest ideal prompt phrasing, plus a better RAG system to determine which information is relevant to the project.

Orchestrator Agents: Giving this tool to an overarching agent would let it edit subagent contexts with greater accuracy, enabling better communication and more optimized context use.
