Inspiration

Most AI agents look impressive until the first real failure: a missing API, a broken scraper, an invalid config, or a tool loop that never recovers. We built octopOS around a simple idea: an agent should not collapse when the environment gets messy. It should detect failure, adapt, repair its path, and keep moving toward a useful result.

What it does

octopOS is a self-healing AI operator that can answer user requests across multiple tools and channels. It orchestrates APIs, web search and scraping, MCP servers, and messaging interfaces like Telegram and Slack. When a tool fails or a workflow breaks, octopOS can retry, switch strategy, and continue execution instead of stopping at the first error.
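The retry-then-switch behavior described above can be illustrated with a minimal sketch. This is not octopOS's actual code: the names `run_with_recovery`, `AllStrategiesFailed`, `call_api`, and `scrape_page` are hypothetical, and the retry counts and backoff are assumptions chosen for illustration.

```python
import time
from typing import Callable, Sequence


class AllStrategiesFailed(Exception):
    """Raised when every strategy has exhausted its retries."""


def run_with_recovery(
    strategies: Sequence[Callable[[], str]],
    retries_per_strategy: int = 2,
    base_delay: float = 0.0,
) -> str:
    """Try each strategy in order, retrying it before moving to the next."""
    errors = []
    for strategy in strategies:
        for attempt in range(retries_per_strategy):
            try:
                return strategy()
            except Exception as exc:
                # Record the failure so the final error explains what was tried.
                errors.append(f"{strategy.__name__} attempt {attempt + 1}: {exc}")
                # Exponential backoff; a base_delay of 0 disables waiting.
                time.sleep(base_delay * (2 ** attempt))
    raise AllStrategiesFailed("; ".join(errors))


def call_api() -> str:
    # Hypothetical primary tool that is currently down.
    raise ConnectionError("API unavailable")


def scrape_page() -> str:
    # Hypothetical fallback tool that succeeds.
    return "result from scraper"
```

Calling `run_with_recovery([call_api, scrape_page])` retries the failing API call twice, then falls back to the scraper and returns its result. If every strategy fails, the raised error lists each attempt, so the failure is reported rather than hidden.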

How we built it

We built octopOS as a modular orchestration runtime with tool routing, failure handling, and recovery-aware query execution. We added a curated public API catalog, MCP integration, and multi-channel interfaces for Telegram and Slack. We also introduced structured tool outputs, query state tracking, systemd-based Telegram persistence, and self-healing hooks to make the agent more reliable in real-world conditions.
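As a rough illustration of what structured tool outputs and query state tracking might look like, here is a small sketch. The class names `ToolResult` and `QueryState`, their fields, and the tool names are all assumptions made for this example. It uses standard-library dataclasses to stay dependency-free, and the project's real models may differ.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class ToolResult:
    """Uniform envelope returned by every tool call (hypothetical shape)."""
    tool: str
    ok: bool
    data: Any = None
    error: Optional[str] = None


@dataclass
class QueryState:
    """Tracks what has been tried so far for a single user query."""
    query: str
    history: List[ToolResult] = field(default_factory=list)

    def record(self, result: ToolResult) -> None:
        self.history.append(result)

    def failed_tools(self) -> List[str]:
        # Tools that already failed, so a router can avoid repeating them.
        return [r.tool for r in self.history if not r.ok]

    def answer(self) -> Any:
        # The most recent successful result, or None if nothing has worked yet.
        for result in reversed(self.history):
            if result.ok:
                return result.data
        return None


state = QueryState("weather in Berlin")
state.record(ToolResult("weather_api", ok=False, error="timeout"))
state.record(ToolResult("web_search", ok=True, data="12 C, cloudy"))
```

Because every tool reports success or failure in the same shape, recovery logic can inspect `state.failed_tools()` to choose a different tool instead of parsing free-form error text.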

Accomplishments that we're proud of

We turned octopOS from a fragile tool-calling assistant into a more resilient agent runtime. It now supports persistent Telegram access, runnable Slack integration, improved MCP onboarding, a structured public API layer, and more reliable query handling. Most importantly, we built a system that keeps trying to reach an answer even when parts of the stack fail.

What we learned

We learned that resilience is a product feature, not just an engineering detail. Self-healing requires good state management, structured tool outputs, explicit recovery logic, and honest handling of uncertainty. We also learned that multi-channel access matters, because users want to reach an AI system from the tools they already use.

What's next for octopOS

Next, we want to make octopOS more autonomous and more robust in production. That means deeper self-healing workflows, better handling for time-sensitive and news-like queries, stronger MCP discovery and validation, richer Slack and Telegram operations, and a more general query engine that can reason across unknown tasks without relying on narrow patches.

Built With

  • aiohttp
  • apscheduler
  • aws-bedrock
  • boto3
  • ddgs
  • fastapi
  • httpx
  • lancedb
  • mcp
  • playwright
  • pydantic
  • python
  • rich
  • slack-api
  • sqlite
  • telegram-bot-api