## Inspiration

Most AI agent tools today are built on human-centered infrastructure: chat interfaces, linear pipelines, manual coordination. When you need multiple AI agents to collaborate on a complex task like writing a research report, you end up duct-taping them together with Python scripts or rigid automation chains. We asked: what if the infrastructure were designed for AI-to-AI interaction from the ground up? What if agents could be visually composed, wired together, and orchestrated as coordinated workflows, with the human in an editorial role rather than an operational one?

## What it does

Weave is an AI-native agent orchestration platform. You type a goal, such as "write a research report on electric aviation", and an AI architect (a meta-agent) designs a multi-agent workflow on a visual canvas. The workflow typically includes 10-14 specialized agents: planners, parallel researchers, section writers, quality critics with revision loops, and a packager. Every agent's prompt is fully editable. You can rewire connections, assign an LLM provider to each node (GPT-4o, Claude, Gemini, Grok), and compare multiple scenario configurations side by side, including a cost-optimized variant. In the current build, the Claude, Gemini, and Grok assignments are display-only labels and execution runs on OpenAI models. Hit Run and watch agents execute in parallel waves with real-time visual feedback. The output is a professionally typeset LaTeX PDF, a PowerPoint deck with charts, or a Markdown document.

## How we built it

We built Weave with Next.js 15 (App Router), React Flow for the visual canvas, and Zustand for state management. The backend uses OpenAI's GPT-4o for the meta-agent and GPT-4o-mini for agent execution, the Tavily API for real-time web search in researcher nodes, and Tectonic (a Rust-based LaTeX compiler) for PDF generation. The execution engine is a custom wave-based executor: it orders the graph topologically, groups independent agents by dependency depth, and runs each group concurrently via `Promise.all`. Server-Sent Events stream real-time status updates to the frontend. We designed a two-attempt LaTeX compilation strategy: if the first compile fails (typically because of LLM-hallucinated syntax), we strip the problematic elements and retry, so that a failed first attempt does not leave the user without a PDF.
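The wave-based executor described above can be sketched roughly as follows. This is a minimal illustration, not Weave's actual code: the `AgentNode` shape and the `groupIntoWaves`, `runWaves`, and `runAgent` names are assumptions made for the example. It assumes the graph is acyclic, so critic revision loops would need separate handling.

```typescript
// Hypothetical node shape: an id plus the ids of the nodes it depends on.
interface AgentNode {
  id: string;
  dependsOn: string[];
}

// Group nodes into waves by dependency depth. A node with no dependencies has
// depth 0; otherwise its depth is one more than its deepest dependency.
function groupIntoWaves(nodes: AgentNode[]): string[][] {
  const byId = new Map<string, AgentNode>();
  for (const n of nodes) byId.set(n.id, n);
  const depth = new Map<string, number>();
  const visiting = new Set<string>();

  const depthOf = (id: string): number => {
    const cached = depth.get(id);
    if (cached !== undefined) return cached;
    const node = byId.get(id);
    if (!node) throw new Error(`Unknown node: ${id}`);
    if (visiting.has(id)) throw new Error(`Cycle detected at node: ${id}`);
    visiting.add(id);
    let d = 0;
    for (const dep of node.dependsOn) d = Math.max(d, depthOf(dep) + 1);
    visiting.delete(id);
    depth.set(id, d);
    return d;
  };

  const waves: string[][] = [];
  for (const n of nodes) {
    const d = depthOf(n.id);
    while (waves.length <= d) waves.push([]);
    waves[d]!.push(n.id);
  }
  return waves;
}

// Run the waves in order. Agents inside one wave run concurrently through
// Promise.all, and the next wave starts only after the whole wave resolves.
async function runWaves<T>(
  nodes: AgentNode[],
  runAgent: (id: string, inputs: Map<string, T>) => Promise<T>,
): Promise<Map<string, T>> {
  const results = new Map<string, T>();
  for (const wave of groupIntoWaves(nodes)) {
    const outputs = await Promise.all(wave.map((id) => runAgent(id, results)));
    wave.forEach((id, i) => results.set(id, outputs[i] as T));
  }
  return results;
}
```

With a planner feeding two researchers that feed one writer, `groupIntoWaves` yields three waves, and the two researchers share the middle one. Because `Promise.all` rejects as soon as any agent fails, a real executor would probably want per-node error handling (for example `Promise.allSettled`), which this sketch leaves out.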

## Challenges we ran into

The biggest challenge was making LLM-generated LaTeX reliably compilable. LLMs frequently hallucinate invalid pgfplots syntax, reference non-existent image files, or forget to escape special characters like & and %. We built a sanitization pipeline that strips hallucinated \includegraphics references, escapes bare ampersands, and falls back to compiling without charts if the first attempt fails. Another challenge was the dynamic graph topology. We moved from four fixed template layouts to fully dynamic graphs in which the meta-agent freely designs topologies of 10-14 nodes with chained research, multiple makers, and dual critic loops, and that required rethinking the entire graph positioning algorithm.
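A sanitize-then-retry pipeline of this kind might look like the sketch below. All function names are hypothetical, and the regular expressions are deliberately simple. The `compile` parameter stands in for whatever wraps the Tectonic invocation and is an assumption of the example. The chart-stripping step assumes pgfplots charts are written inside `tikzpicture` environments, which is the usual pgfplots layout.

```typescript
// Remove \includegraphics commands, since LLM output may point at files that do not exist.
function stripIncludeGraphics(tex: string): string {
  return tex.replace(/\\includegraphics(?:\[[^\]]*\])?\{[^}]*\}/g, "");
}

// Escape every '&' that is not already written as '\&'.
// Assumption: this runs on prose fragments only. Inside tabular environments an
// unescaped '&' is a legitimate column separator and must be left alone.
function escapeBareAmpersands(text: string): string {
  return text.replace(/\\?&/g, (m) => (m === "&" ? "\\&" : m));
}

// Drop tikzpicture environments (where pgfplots charts usually live) for the retry.
function stripCharts(tex: string): string {
  return tex.replace(/\\begin\{tikzpicture\}[\s\S]*?\\end\{tikzpicture\}/g, "");
}

// Hypothetical compiler wrapper: resolves with PDF bytes or rejects on a LaTeX error.
type Compile = (tex: string) => Promise<Uint8Array>;

// Two-attempt strategy: compile the sanitized source; if that fails, retry without charts.
async function compileWithFallback(tex: string, compile: Compile): Promise<Uint8Array> {
  const sanitized = stripIncludeGraphics(tex);
  try {
    return await compile(sanitized);
  } catch {
    return await compile(stripCharts(sanitized));
  }
}
```

If the second attempt also fails, the error propagates to the caller, so a production version would still need a last-resort path such as a plain-text document.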

## Accomplishments that we're proud of

We're proud of the wave-based parallel execution engine: seeing five researcher agents light up simultaneously with spinning gradient borders and complete in a third of the sequential time is genuinely satisfying. The multi-scenario comparison feature, where users can switch between Cost-Optimized, Mix A, and Mix B configurations and see different LLM assignments with real-time cost estimates, demonstrates the platform's potential as true infrastructure rather than a wrapper. The LaTeX PDF output, with a table of contents, pgfplots charts, booktabs tables, and a proper bibliography, rivals documents that take hours to typeset manually.
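A per-scenario cost estimate can be computed by summing token usage times a per-model rate over the nodes in a configuration. The sketch below is one possible shape for that calculation; the `NodeAssignment` and `PriceTable` types, the per-1,000-token pricing unit, and the function name are assumptions for illustration. The caller supplies the rates, so no real provider pricing is implied.

```typescript
// Hypothetical shape: which model a node uses and how many tokens it is expected to consume.
interface NodeAssignment {
  nodeId: string;
  model: string;
  estimatedTokens: number;
}

// Price per 1,000 tokens keyed by model name, supplied by the caller.
type PriceTable = Record<string, number>;

// Sum the estimated cost of every node in one scenario configuration.
function estimateScenarioCost(assignments: NodeAssignment[], prices: PriceTable): number {
  let total = 0;
  for (const a of assignments) {
    const rate = prices[a.model];
    if (rate === undefined) throw new Error(`No price for model: ${a.model}`);
    total += (a.estimatedTokens / 1000) * rate;
  }
  return total;
}
```

Running the same function over each scenario's assignments gives directly comparable totals.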

## What we learned

We learned that the meta-agent prompt is the product. The quality of the generated workflows depends almost entirely on how well the orchestrator prompt guides the LLM to design specific, tailored agent configurations. We also learned that multi-LLM orchestration is fundamentally a graph problem, not a chat problem: agents need structured data flow, dependency resolution, and parallel execution, which maps naturally to directed acyclic graphs with topological ordering. Finally, we learned that editability is the key differentiator: users trust AI workflows far more when they can see and modify every prompt than when the system is a black box.

## What's next for Weave

Next, we plan to integrate real multi-provider API calls (actually routing to Claude, Gemini, and Grok instead of showing display-only labels), implement a feedback loop in which user ratings of scenario outputs inform the meta-agent's model selection decisions, and add execution history with A/B comparison views. We're also exploring agent-to-agent communication protocols in which critic agents can negotiate with maker agents in real time, as well as support for custom agent roles that users can define beyond the current five. Long-term, Weave aims to be the infrastructure layer that any AI application can plug into for multi-agent orchestration.

## Built With

  • claude
  • claudecode
  • codex
  • gemini
  • grok
  • nextjs