About the project

We wanted to give today’s AI arms and legs so it can interact with the world. As a first step, we set out to build a framework that lets it interact with the internet, and to make that ability easy to use, we planned a browser with a built-in chat interface.

Cobalt is an Electron-based desktop browser. In its AI sidebar, you describe a task in natural language and the agent browses pages, types, clicks, and extracts the information you need. You can also turn on macro recording, perform a routine once, and replay it with a single click. Depending on your needs for speed, cost, and reasoning complexity, you can choose among LLMs from multiple providers (OpenAI, Google, Anthropic).

How we built it

We used Electron with Chromium. The agent is written in TypeScript, uses Playwright for control and CDP for deep inspection, and talks to multiple LLM providers through LangChain. We send the model a compact, index-addressed view of the page and validate its JSON actions before execution. Our DOM serialization approach, inspired by browser-use, keeps that state small, stable, and easy to target.
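To make the "validate its JSON actions before execution" step concrete, here is a minimal sketch of what such a check could look like. The action shapes, the `validateAction` name, and the error messages are assumptions made for illustration; the project's actual schema is not shown in this write-up.

```typescript
// Hypothetical action shapes (assumed for this sketch, not the project's real schema).
type AgentAction =
  | { type: "click"; index: number }
  | { type: "type"; index: number; text: string }
  | { type: "navigate"; url: string };

type ValidationResult =
  | { ok: true; action: AgentAction }
  | { ok: false; error: string };

// Parse the model's raw reply and accept it only if it names a known action
// and, where relevant, an element index that exists in the current page view.
function validateAction(raw: string, elementCount: number): ValidationResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, error: "reply is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, error: "reply is not a JSON object" };
  }

  const fields = parsed as Record<string, unknown>;
  const type = fields["type"];
  const index = fields["index"];
  const text = fields["text"];
  const url = fields["url"];

  // An index is valid only if it is a whole number inside the serialized view.
  const indexOk =
    typeof index === "number" &&
    Math.floor(index) === index &&
    index >= 0 &&
    index < elementCount;

  if (type === "click") {
    if (!indexOk) return { ok: false, error: "index out of range" };
    return { ok: true, action: { type: "click", index: index as number } };
  }
  if (type === "type") {
    if (!indexOk) return { ok: false, error: "index out of range" };
    if (typeof text !== "string") return { ok: false, error: "text must be a string" };
    return { ok: true, action: { type: "type", index: index as number, text } };
  }
  if (type === "navigate") {
    if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
      return { ok: false, error: "url must start with http:// or https://" };
    }
    return { ok: true, action: { type: "navigate", url } };
  }
  return { ok: false, error: "unknown action type" };
}
```

With a page view of three elements, `validateAction('{"type":"click","index":1}', 3)` is accepted, while an index of 7 or a reply that is not JSON is rejected before anything reaches the browser.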

What we learned

Constrained actions and summarized page state work better than raw HTML and free-form outputs. Provider differences matter, so adapters and schema validation are essential. We also tried supporting “unlimited context,” but cost and LLM input limits made it impractical. That taught us the importance of context optimization.
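One simple form of context optimization is to cap the conversation at a token budget and drop the oldest turns first. The sketch below illustrates that idea only; `ChatMessage`, `estimateTokens`, `trimToBudget`, and the four-characters-per-token estimate are assumptions for the example, not the project's actual strategy.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Rough heuristic (an assumption): about four characters per token.
// A provider-specific tokenizer would give more accurate counts.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep every system message, then add the most recent other messages
// until the estimated budget is used up.
function trimToBudget(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  let used = 0;

  // Mark system messages as kept and count their cost first.
  const keep: boolean[] = messages.map((message) => {
    if (message.role !== "system") return false;
    used += estimateTokens(message.content);
    return true;
  });

  // Walk backwards so the newest turns are considered first.
  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i];
    if (!message || keep[i]) continue;
    const cost = estimateTokens(message.content);
    if (used + cost > maxTokens) break; // this turn and all older ones are dropped
    keep[i] = true;
    used += cost;
  }

  return messages.filter((_, i) => keep[i] === true);
}
```

Dropping whole turns is the bluntest option; summarizing older turns or re-serializing only the current page state are alternatives with different cost and accuracy trade-offs.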

Challenges we faced

The core question was how an AI could reliably act on real pages. Early on, we explored controlling the browser with screenshot pipelines and coordinate-based clicking. Those approaches proved brittle (layout shifts, DPI/zoom, scroll offsets), so we pivoted to a DOM-based controller inspired by browser-use: we serialize interactive elements, the LLM returns JSON actions that we validate, and Playwright/CDP executes them. This gave us stable targeting, safer execution, and replayable automation.
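As a rough illustration of serializing interactive elements into an index-addressed view, the sketch below turns a list of element snapshots into one line per element. The `InteractiveElement` shape, the `serializeElements` name, and the output format are assumptions for the example; in the real browser the snapshots would come from the live page rather than a hand-written array.

```typescript
// Hypothetical snapshot of one interactive element (assumed shape).
interface InteractiveElement {
  tag: string;
  label: string; // visible text, aria-label, or placeholder
  attributes?: Record<string, string>;
}

// Render one line per element, addressed by its position in the list.
function serializeElements(elements: InteractiveElement[], maxLabel = 60): string {
  return elements
    .map((el, index) => {
      // Collapse whitespace and cap label length to keep the view compact.
      const label = el.label.replace(/\s+/g, " ").trim().slice(0, maxLabel);
      const attributes = el.attributes ?? {};
      const attrs = Object.keys(attributes)
        .map((name) => ` ${name}="${attributes[name]}"`)
        .join("");
      return `[${index}] <${el.tag}${attrs}> ${label}`;
    })
    .join("\n");
}
```

Because the model refers to an element by its index in this view, the target does not depend on pixel coordinates, which is why layout shifts, zoom level, and scroll position stop mattering.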

What’s next

We see near-limitless potential for this browser. Our roadmap is open-ended: anything we can imagine is a candidate, from simple bookmarks and history all the way to custom extensions and site-specific skills.
