Inspiration

Every founder has the same problem: keeping up with competitors is a full-time job nobody has time for. Tools like ChatGPT's Deep Research can generate a one-shot report, but that report is static: a snapshot that's stale by tomorrow. We wanted to build something that keeps watching while you sleep: an agent that doesn't just research, but builds a living knowledge graph, deploys autonomous monitors, and reacts to breaking news in real time.

How we built it

Orbit runs an autonomous loop: discover → extract → reason → evolve.

  1. Discovery: Tavily web search with diversified seed queries (competitor scans, startup discovery via Product Hunt/YC, funding rounds).
  2. Extraction: OpenAI gpt-4.1-mini extracts structured entities (companies, people, funding events, products) and free-form strategic signals from search results. A dedicated relationship-extraction pass cross-references new entities against the full graph.
  3. Reasoning: gpt-4o analyzes each cycle's findings in the context of the existing knowledge graph, generating threat scores, insights, action items, and new queries to investigate next.
  4. Deep Analysis: After all cycles, a final pass looks for hidden cross-source connections, market gaps, money trails, and time-bound predictions.
  5. Action: Orbit auto-deploys Yutori Scouts to monitor top threats 24/7. Breaking intel can be injected at any time; the agent then extracts entities, updates relationships, recalculates threat scores, and generates new strategic actions.
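
The shape of the loop can be sketched in a few lines of Python. This is a minimal illustration, not Orbit's actual code: every function here is a hypothetical stub standing in for the real search and LLM calls, and the in-memory Graph class stands in for the knowledge graph. It shows each cycle feeding its follow-up queries into the next, with entity and signal extraction running concurrently.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy in-memory stand-in for the knowledge graph (hypothetical)."""
    entities: set[str] = field(default_factory=set)
    signals: list[str] = field(default_factory=list)


async def discover(queries: list[str]) -> list[str]:
    # Stub for a web search call: returns one fake snippet per query.
    return [f"snippet about {q}" for q in queries]


async def extract_entities(snippets: list[str]) -> set[str]:
    # Stub for an LLM entity-extraction call.
    return {s.removeprefix("snippet about ") for s in snippets}


async def extract_signals(snippets: list[str]) -> list[str]:
    # Stub for an LLM signal-extraction call.
    return [f"signal from: {s}" for s in snippets]


async def reason(graph: Graph, new_entities: set[str]) -> list[str]:
    # Stub for the reasoning step: propose follow-up queries for new entities.
    return [f"{name} funding" for name in sorted(new_entities)]


async def run(seed_queries: list[str], cycles: int = 3) -> Graph:
    graph = Graph()
    queries = list(seed_queries)
    for _ in range(cycles):
        snippets = await discover(queries)
        # The two extraction passes are independent, so run them concurrently.
        entities, signals = await asyncio.gather(
            extract_entities(snippets),
            extract_signals(snippets),
        )
        new_entities = entities - graph.entities
        graph.entities |= entities
        graph.signals.extend(signals)
        # Evolve: the next cycle searches for whatever reasoning suggested.
        queries = await reason(graph, new_entities)
        if not queries:
            break
    return graph


if __name__ == "__main__":
    result = asyncio.run(run(["acme"], cycles=3))
    print(sorted(result.entities))
```

Stopping when reasoning yields no new queries is one possible termination rule; a fixed cycle count alone would work just as well.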

The knowledge graph lives in Neo4j AuraDB. The backend is FastAPI with async Neo4j drivers. The frontend is a single-file dark-theme dashboard with vis-network for interactive graph visualization, real-time polling, and alert banners for high-threat discoveries.
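
Because each cycle writes many entities into Neo4j, one way to keep round trips down is to send a whole batch as a single parameterized Cypher statement using UNWIND. The sketch below only builds the query and its parameters. The Company label and the name and threat_score properties are assumptions made for illustration, not Orbit's actual schema.

```python
from typing import Any

# Hypothetical schema: a Company node keyed by name, with a threat_score property.
UPSERT_COMPANIES = """
UNWIND $rows AS row
MERGE (c:Company {name: row.name})
SET c.threat_score = row.threat_score
"""


def build_company_batch(
    entities: list[dict[str, Any]],
) -> tuple[str, dict[str, Any]]:
    """Return (query, parameters) for one batched upsert.

    Duplicate names are collapsed so that the last occurrence wins,
    which keeps the batch to one row per node.
    """
    latest: dict[str, dict[str, Any]] = {}
    for entity in entities:
        latest[entity["name"]] = {
            "name": entity["name"],
            "threat_score": entity.get("threat_score", 0),
        }
    return UPSERT_COMPANIES, {"rows": list(latest.values())}
```

With the official neo4j Python driver, the returned pair could be passed to `await session.run(query, parameters)` inside an async session, so the whole batch is written in one call rather than one call per node.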

Challenges

  • Graph connectivity: Early versions produced mostly orphaned nodes. We addressed this by adding a dedicated cross-cycle relationship extraction pass, upgrading from gpt-4o-mini to gpt-4.1-mini (which dramatically improved extraction quality), and filtering out noise from customer name-drops and unrelated industries.
  • Data quality vs. breadth: Startup-focused queries pulled in articles like "55 AI startups that raised $100M", which flooded the graph with irrelevant companies (xAI, Rivian). We added domain context and relevance filtering to the extraction prompt to keep the graph focused.
  • Performance: Each cycle made four sequential OpenAI calls. We parallelized entity and signal extraction with asyncio.gather(), batched Neo4j writes with UNWIND, and reduced cycles from 5 to 3, cutting total runtime from ~5 minutes to ~2 minutes.
  • Status dict mutation bug: The agent's status dict was being reassigned (agent_status = {}), which rebound the name to a new dict while the reference held by FastAPI still pointed at the old one. Updating the dict in place with .update() fixed a subtle bug where the inject-intel endpoint thought the agent was always running.
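
The status dict bug is easy to reproduce in isolation. The sketch below is a simplified stand-in, assuming only that some other part of the app captured a reference to the dict before the agent finished; the function name and the "running" key are hypothetical.

```python
def endpoint_sees_running(reassign: bool) -> bool:
    """Simulate an agent run and report what a stale reference observes."""
    agent_status = {"running": False}

    # Another part of the app (for example, an endpoint) keeps its own
    # reference to the same dict object.
    held_by_endpoint = agent_status

    # The agent starts and flips the flag in place; both names see it.
    agent_status.update(running=True)

    if reassign:
        # Buggy: this rebinds the local name to a brand-new dict.
        # held_by_endpoint still points at the old one.
        agent_status = {"running": False}
    else:
        # Fixed: mutate the existing dict so every holder sees the change.
        agent_status.update(running=False)

    return held_by_endpoint["running"]


if __name__ == "__main__":
    print(endpoint_sees_running(reassign=True))   # stale view
    print(endpoint_sees_running(reassign=False))  # up-to-date view
```

With reassignment, the holder of the old reference keeps reporting the agent as running even after it has finished, which matches the symptom described above. Clearing the dict with .clear() followed by .update() is another in-place option when keys also need to be removed.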

What we learned

The biggest insight: the graph structure is the product, not the text. LLMs can summarize web pages — that's table stakes. What they can't do is maintain a structured knowledge graph that you can query, simulate against, and evolve over time. The moment Orbit connects an investor from Cycle 1 to a startup from Cycle 3 that no single source mentioned together — that's when it stops being a research tool and becomes an intelligence system.
