Mobius: the first AI Agent to build a billion dollar company

Mobius architecture diagram
Mobius log viewer
Mobius Slack integration
Mobius's first company!
Mobius social media

Inspiration

There has been a growing wave of talk online about a single person building a billion-dollar company. Sam Altman reinforced that vision in his opening ceremony keynote, when he suggested that a single AI agent could build a billion-dollar company.

We set out to build that agent.

If one agent can build a billion-dollar company, it can’t be a chatbot. It needs memory. It needs persistence. It needs tools. It needs to execute continuously. It needs to behave like a founder. Most AI systems today are stateless assistants. They respond once and forget. Those that are multi-turn survive a few cycles before they suffer serious performance issues. Building a company requires continuity: remembering product decisions, adapting to changing requirements, debugging failures, deploying updates, and iterating over months or years without losing context.

We set out to build an AI agent that doesn’t just respond. It operates and founds a real, enduring company. That became Mobius.

What it does

Mobius is a multi-turn autonomous founder agent that runs continuously, remembers context, executes real-world actions, and allows real-time human steering. In one continuous autonomous run, Mobius:

Ran for 11+ hours and completed 3,000+ turns
Executed hundreds of browser actions, deployments, refactors, and product iterations

During that session, Mobius autonomously built Paypilot, an AI-first reimagination of Rippling and Gusto that automates payroll, HR, and finance operations. Paypilot was not a mockup. Mobius:

Did market research to identify which domains could support a billion-dollar company
Designed the system architecture, then wrote and shipped production-ready code
Integrated AI assistants into workflows
Designed payroll and payment flows
Implemented monetization logic
Drafted legal documents (Terms, Privacy, Incorporation) and emailed them to us to sign
Generated marketing content and posted it across all social media platforms for an official launch
Fixed type errors and deployment issues mid-run
Iteratively improved product constraints and dashboard features

Mobius operates through an observe → plan → act → report loop (a “Ralph Wiggum” loop) with interruption support.
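
A minimal sketch of what an observe → plan → act → report loop with interruption support might look like. Every name here (`TurnState`, `Steps`, `SteeringQueue`, `runLoop`, and the `"stop"` message convention) is hypothetical and chosen for illustration; none of it is taken from the Mobius codebase. The sketch is synchronous for brevity, whereas a real agent loop would likely await model and tool calls.

```typescript
// Hypothetical types for illustration only.
interface TurnState {
  turn: number;
  goal: string;
  notes: string[]; // structured memory carried from turn to turn
}

interface Steps {
  observe: (state: TurnState) => string;
  plan: (state: TurnState, observation: string) => string;
  act: (plan: string) => string;
  report: (state: TurnState, result: string) => void;
}

// Operator messages (for example from Slack or a CLI) are queued and drained
// at the start of each turn, so steering never cuts a phase off midway.
class SteeringQueue {
  private messages: string[] = [];

  push(message: string): void {
    this.messages.push(message);
  }

  drain(): string[] {
    const drained = this.messages;
    this.messages = [];
    return drained;
  }
}

function runLoop(
  initial: TurnState,
  steps: Steps,
  steering: SteeringQueue,
  maxTurns: number,
): TurnState {
  let state = initial;
  for (let i = 0; i < maxTurns; i++) {
    // Apply pending steering before observing. "stop" is an assumed convention.
    for (const message of steering.drain()) {
      if (message === "stop") return state;
      state = {
        ...state,
        goal: message,
        notes: [...state.notes, `reprioritized: ${message}`],
      };
    }
    const observation = steps.observe(state);
    const plan = steps.plan(state, observation);
    const result = steps.act(plan);
    steps.report(state, result);
    state = { ...state, turn: state.turn + 1, notes: [...state.notes, result] };
  }
  return state;
}

// Usage: steer the run once, then let it complete three turns.
const steering = new SteeringQueue();
const steps: Steps = {
  observe: (s) => `turn ${s.turn}, goal: ${s.goal}`,
  plan: (_s, observation) => `next step for (${observation})`,
  act: (plan) => `done: ${plan}`,
  report: (s, result) => console.log(`[${s.turn}] ${result}`),
};
steering.push("fix the failing deployment");
const finalState = runLoop({ turn: 0, goal: "build the product", notes: [] }, steps, steering, 3);
console.log(finalState.goal);
```

Draining steering messages only at turn boundaries is one simple way to accept interruptions without breaking a phase in progress; it trades immediacy for predictable state.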

Core capabilities:

Runs in a Modal sandbox for privacy and asynchronous execution
Continuous execution loop (not one-shot prompting) with multi-turn collaboration
Structured state persistence across thousands of turns, with efficient context retrieval through:
  Elastic vector DB (Jina AI embeddings) for long-term founder knowledge and an understanding of how to build a company
  Convex for real-time trajectory storage and full observability into the agent
  Local filesystem for structured context on subagents and procedural steps
Real-time human steering via Slack or CLI
Automatic reprioritization mid-execution
Self-unblocking, such as creating its own accounts to obtain keys for Google, Stripe, etc.
Durable logging of decisions and tool calls
Specialist subagents (researcher, coder) for parallel work
Real browser automation powered by Browserbase for real-world workflows

Every turn inherits structured context from previous turns. Memory is explicitly persisted so long runs do not drift or forget earlier decisions.
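
To illustrate how a turn could inherit context from three separate stores, here is a rough sketch. The interfaces and in-memory classes are hypothetical stand-ins, not the Elastic, Convex, or filesystem clients Mobius uses, and keyword overlap is swapped in for embedding similarity so the example stays self-contained.

```typescript
// Hypothetical interfaces: in-memory stand-ins for the three stores.
interface KnowledgeStore {
  search(query: string, limit: number): string[];
}
interface TrajectoryStore {
  append(turn: number, event: string): void;
  recent(limit: number): string[];
}
interface FileContext {
  read(path: string): string | undefined;
}

// Keyword-overlap ranking stands in for embedding similarity search.
class InMemoryKnowledge implements KnowledgeStore {
  private docs: string[];
  constructor(docs: string[]) {
    this.docs = docs;
  }
  search(query: string, limit: number): string[] {
    const terms = new Set(query.toLowerCase().split(/\s+/).filter(Boolean));
    return this.docs
      .map((doc) => ({
        doc,
        score: doc.toLowerCase().split(/\s+/).filter((word) => terms.has(word)).length,
      }))
      .filter((entry) => entry.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, limit)
      .map((entry) => entry.doc);
  }
}

class InMemoryTrajectory implements TrajectoryStore {
  private events: string[] = [];
  append(turn: number, event: string): void {
    this.events.push(`turn ${turn}: ${event}`);
  }
  recent(limit: number): string[] {
    return limit > 0 ? this.events.slice(-limit) : [];
  }
}

class InMemoryFiles implements FileContext {
  private files: Map<string, string>;
  constructor(files: Map<string, string>) {
    this.files = files;
  }
  read(path: string): string | undefined {
    return this.files.get(path);
  }
}

// Assemble one turn's context from all three stores instead of replaying
// the full conversation history.
function buildTurnContext(
  task: string,
  knowledge: KnowledgeStore,
  trajectory: TrajectoryStore,
  files: FileContext,
  planPath: string,
): string {
  const sections = [
    `TASK\n${task}`,
    `KNOWLEDGE\n${knowledge.search(task, 2).join("\n")}`,
    `RECENT TURNS\n${trajectory.recent(3).join("\n")}`,
    `PLAN\n${files.read(planPath) ?? "(no plan file yet)"}`,
  ];
  return sections.join("\n\n");
}
```

Each store answers a different question: what is generally known, what just happened, and what the current plan is. Keeping them separate lets each turn pull a small, relevant slice from each rather than one growing transcript.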

The difference between a chatbot and a founder is 24/7 continuity and persistence. Mobius has these qualities.

How we built it

Our architecture is designed around durable, multi-turn autonomy:

Orchestration Layer

Agent calls are run through Google Cloud’s Vertex AI
Claude Agent SDK for structured tool calling and agent control loops
TypeScript + Bun runtime for fast execution

Control Plane

Slack Socket Mode for real-time human steering
CLI interface for local operation

Execution Layer

Modal for isolated cloud sandbox execution
Safe reproducible environments for code shipping
Parallel subagent spawning

Browser Layer

Browserbase MCP for real browser automation (navigation, interaction, extraction)

State & Telemetry

Convex for durable state, structured event logs, and run inspection
Retrieval layer (Elastic + Jina) for long-term founder knowledge and an understanding of how to build a company
Local filesystem for tracking sub-states and overall high-level context
Execution traces recorded turn by turn

The system explicitly separates planning, acting, and reporting phases, and persists structured memory between each turn to prevent context loss across long runs.
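
One way to picture persisting structured memory between phases is a checkpoint that is saved after every phase and reloaded on restart. The sketch below is an assumption-laden illustration: `Checkpoint`, `CheckpointStore`, `advance`, and `resumeOrStart` are hypothetical names, and the in-memory store merely stands in for a durable backend such as the Convex layer described above.

```typescript
// Hypothetical checkpoint shape for illustration only.
interface Checkpoint {
  runId: string;
  turn: number;
  phase: "plan" | "act" | "report";
  memory: Record<string, string>;
}

interface CheckpointStore {
  save(runId: string, serialized: string): void;
  load(runId: string): string | undefined;
}

// Stand-in for a durable store; a real one would survive process restarts.
class InMemoryCheckpointStore implements CheckpointStore {
  private rows = new Map<string, string>();
  save(runId: string, serialized: string): void {
    this.rows.set(runId, serialized);
  }
  load(runId: string): string | undefined {
    return this.rows.get(runId);
  }
}

const NEXT_PHASE = { plan: "act", act: "report", report: "plan" } as const;

// Move to the next phase, optionally recording one memory entry.
// The turn counter only advances when a report phase completes.
function advance(
  checkpoint: Checkpoint,
  note?: { key: string; value: string },
): Checkpoint {
  return {
    ...checkpoint,
    phase: NEXT_PHASE[checkpoint.phase],
    turn: checkpoint.phase === "report" ? checkpoint.turn + 1 : checkpoint.turn,
    memory: note ? { ...checkpoint.memory, [note.key]: note.value } : checkpoint.memory,
  };
}

function saveCheckpoint(store: CheckpointStore, checkpoint: Checkpoint): void {
  store.save(checkpoint.runId, JSON.stringify(checkpoint));
}

// Resume from the last saved phase, or start fresh if nothing was saved.
function resumeOrStart(store: CheckpointStore, runId: string): Checkpoint {
  const serialized = store.load(runId);
  if (serialized === undefined) {
    return { runId, turn: 0, phase: "plan", memory: {} };
  }
  return JSON.parse(serialized) as Checkpoint;
}
```

Saving after each phase rather than each turn means a restart repeats at most one phase, which is the property that lets code be modified mid-run without losing progress.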

Challenges we ran into

Maintaining context consistency across thousands of turns
Dealing with exploding context that quickly blew past token limits and led to context rot
Preventing state drift in long autonomous sessions
Having the agent pick up where it left off, so we could modify the code without losing progress
Balancing autonomy with human control
Managing type-check and monorepo conflicts during iterative builds
Designing interruption logic without breaking execution flow

Long-running agents expose architectural weaknesses quickly. We learned that autonomy requires robust control systems, not just good prompts.
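
The exploding-context challenge above can be illustrated with a small sliding-window sketch: keep the most recent turns verbatim within a token budget and replace everything older with a pointer to the durable log. This is an assumed approach for illustration, not a description of how Mobius handles it; `TurnRecord`, `estimateTokens`, and `compactContext` are hypothetical names, and the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer.

```typescript
// Hypothetical record shape for illustration only.
interface TurnRecord {
  turn: number;
  text: string;
}

// Rough heuristic (about four characters per token); not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the newest turns verbatim while they fit the budget. The budget covers
// the verbatim lines only; the short marker line for omitted turns is extra.
function compactContext(history: TurnRecord[], budgetTokens: number): string[] {
  const kept: string[] = [];
  let used = 0;
  // Walk from newest to oldest so recent turns win when space runs out.
  for (const record of [...history].reverse()) {
    const line = `turn ${record.turn}: ${record.text}`;
    const cost = estimateTokens(line);
    if (used + cost > budgetTokens) break;
    kept.unshift(line);
    used += cost;
  }
  const omitted = history.length - kept.length;
  if (omitted === 0) return kept;
  return [`(${omitted} earlier turns omitted; see the durable log)`, ...kept];
}
```

The marker line keeps the agent aware that older history exists and can be retrieved, which is the point of routing context rather than simply enlarging the window.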

Accomplishments that we're proud of

Sustained an 11+ hour autonomous run with 3,000+ turns
Built and shipped a full-stack AI-first payroll platform autonomously
Implemented true interruptible autonomy (not simulated turn chaining)
Integrated real browser automation and sandboxed code execution
Created durable, inspectable execution logs for debugging
Maintained high autonomy without losing operator oversight

Mobius didn’t just generate plans. It shipped.

What we learned

The biggest unlock is smarter control loops in conjunction with context engineering, not bigger context windows
Multi-turn structured memory is foundational to real autonomy; we stored memory in three different ways, each for a different reason
Interruption handling is critical for trust and usability
Sandbox + browser + messaging is the minimum viable stack for founder-like agents
Execution telemetry is essential for debugging autonomous systems
Context routing matters more than raw token depth

The difference between a demo and a system is persistence.

What's next for Mobius: the first AI Agent to build a Unicorn

Stronger Autonomy Controls

Budget and approval gates for risky actions
Task leasing and parallel subagent scheduling
Structured escalation paths

Business Outcome Tracking

KPI dashboards for autonomous performance
Economic benchmarking of agent-driven startups
Long-duration autonomous company builds

Our vision is to make Mobius the execution engine behind the next generation of AI-native companies. If one AI agent can build a billion-dollar company, Mobius is what that agent should look like.

Built With

ai
anthropic
chatgpt
claude
elastic
jinai
modal
python
vercel

Submitted to

TreeHacks 2026
- Winner [Y Combinator] Build an Iconic YC Company with AI (1st Place: Guaranteed YC interview; 2nd Place: Guaranteed YC Office Hours; 3rd Place: Guaranteed YC Office Hours)
- Winner [Modal] Sandbox Challenge