Inspiration
Council came from a problem we kept noticing while using AI agents: they often fail not because they know nothing, but because they make reasonable-sounding assumptions without telling anyone. A coding agent might choose a database, auth strategy, deployment setup, or architecture pattern before the user has actually provided enough context.
That felt like a very agent-native problem. Humans usually pause and ask for another opinion before making a high-impact decision, but agents often just continue because stopping too often makes them less useful. We wanted to explore a middle ground: what if an agent could consult other specialized agents before acting?
Council is our answer to that idea. It is a thinking-phase tool for agents. When an agent reaches an important assumption point, Council convenes a small group of advisors, lets them disagree, and returns a structured judgment that the original agent can use.
What it does
Council shows the difference between a normal single-pass AI response and a multi-agent deliberation layer.
In the demo, an AI coding agent is building a web app and is about to assume PostgreSQL as the database. Instead of silently moving forward, the agent calls Council.
Council then runs a structured deliberation:
- A baseline model gives the kind of fast answer an agent might normally produce.
- Four advisor personas respond in sequence: Strategist, Skeptic, Operator, and Psychologist.
- The advisors go through two rounds, so they can react to each other instead of just giving isolated opinions.
- A synthesizer reads the full discussion and produces a final structured briefing.
- The final output includes a recommendation, reasoning, dissent, biggest risk, confidence level, and whether the agent should ask the human.
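The advisor rounds described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `deliberate`, `Turn`, and the injected `ask` function are hypothetical names, and the synthesizer step is left out for brevity.

```typescript
// Hypothetical names throughout; the write-up does not publish its source.
type Advisor = "Strategist" | "Skeptic" | "Operator" | "Psychologist";

interface Turn {
  round: number;
  advisor: Advisor;
  text: string;
}

// `ask` stands in for whatever model call a real implementation makes.
type AskAdvisor = (
  advisor: Advisor,
  question: string,
  transcript: Turn[],
) => Promise<string>;

const ADVISORS: Advisor[] = ["Strategist", "Skeptic", "Operator", "Psychologist"];

async function deliberate(
  question: string,
  ask: AskAdvisor,
  rounds = 2,
): Promise<Turn[]> {
  const transcript: Turn[] = [];
  for (let round = 1; round <= rounds; round++) {
    for (const advisor of ADVISORS) {
      // Each advisor receives a copy of everything said so far,
      // so later turns can react to earlier ones.
      const text = await ask(advisor, question, transcript.slice());
      transcript.push({ round, advisor, text });
    }
  }
  return transcript;
}
```

With two rounds and four advisors this yields eight turns. By the second round every advisor has the whole first round in its context, which is what allows direct disagreement rather than isolated opinions.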
The goal is not just to make the answer longer. The goal is to make uncertainty visible before the agent acts.
How we built it
We built Council as a Next.js app with TypeScript and Tailwind CSS. The frontend is a dark, modern, infrastructure-style demo interface that visualizes the agent's thought process as a live transcript.
The backend has two main routes:
- /api/reasoning for the single-pass baseline response.
- /api/council for the streamed Council deliberation.
The Council route streams events back to the browser using NDJSON, so each advisor appears as soon as their turn is ready. This made the demo feel more like a real internal agent process rather than a static response.
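To make the NDJSON idea concrete, here is a small sketch of how events might be encoded on the server. The event shapes (`turn`, `briefing`, `error`) and the `encodeEvent` helper are assumptions for illustration; the actual protocol is not shown in this write-up.

```typescript
// Hypothetical event shapes; the real protocol may differ.
type CouncilEvent =
  | { type: "turn"; round: number; advisor: string; text: string }
  | { type: "briefing"; briefing: Record<string, unknown> }
  | { type: "error"; message: string };

// NDJSON is one JSON value per line. JSON.stringify escapes newlines that
// appear inside strings, so each encoded event occupies exactly one line.
function encodeEvent(event: CouncilEvent): string {
  return JSON.stringify(event) + "\n";
}

const events: CouncilEvent[] = [
  {
    type: "turn",
    round: 1,
    advisor: "Skeptic",
    text: "Why PostgreSQL?\nWhat does the data look like?",
  },
  { type: "error", message: "Operator timed out" },
];

// What would travel over the wire: two lines, each independently parseable.
const wire = events.map(encodeEvent).join("");
```

In a Next.js route handler, lines like these could be written into a `ReadableStream` and returned as the body of a `Response`, so the browser can render each advisor as soon as its line arrives.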
For the live version, we connected to CLoD through the OpenAI-compatible API client. Each Council role can use a different model, and the synthesizer produces the final structured JSON. We also kept a mock mode that mirrors the live streaming flow with realistic delays, so the demo is reliable even without using live credits during judging.
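A mock mode along these lines could look like the sketch below. The canned lines, the `runMockCouncil` name, and the 800 ms default delay are all illustrative assumptions, not the project's real fixtures.

```typescript
// All names and canned text are illustrative, not the project's fixtures.
interface MockTurn {
  round: number;
  advisor: string;
  text: string;
}

// A shortened script: one round only, to keep the example small.
const MOCK_TURNS: MockTurn[] = [
  { round: 1, advisor: "Strategist", text: "PostgreSQL is a reasonable default for relational data." },
  { round: 1, advisor: "Skeptic", text: "Nobody has confirmed that the data is relational." },
  { round: 1, advisor: "Operator", text: "Hosting and backups need an owner before we commit." },
  { round: 1, advisor: "Psychologist", text: "The user may not realize a choice is being made for them." },
];

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Replays canned turns through the same callback a live run would use,
// pausing between turns so the UI paces like a real deliberation.
async function runMockCouncil(
  onTurn: (turn: MockTurn) => void,
  delayMs = 800,
): Promise<void> {
  for (const turn of MOCK_TURNS) {
    await sleep(delayMs);
    onTurn(turn);
  }
}
```

Because the mock feeds the same callback as a live run, the frontend does not need to know which mode it is in.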
On the UI side, we focused on making the product feel like agent infrastructure rather than a chatbot. The interface uses sharp borders, terminal-like labels, dark surfaces, typewriter pacing, progress indicators, and visible JSON output to make it clear that Council is a tool an agent calls internally.
Challenges we ran into
One challenge was figuring out how to make multi-agent deliberation feel useful instead of performative. Early versions could easily become four agents politely agreeing with each other, which did not prove much. We had to tune the prompts so each advisor had a real perspective and would challenge the others directly.
Streaming was also a challenge. We wanted the UI to update turn by turn, not wait for the whole Council run to finish. That meant designing a simple event protocol, parsing streamed NDJSON in the browser, and making sure partial failures did not break the whole experience.
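The browser-side parsing can be sketched as a small buffering parser. `NdjsonParser` is a hypothetical helper written for this example; it assumes chunks may end mid-line and that one bad line should surface as an error event rather than abort the run.

```typescript
// Hypothetical helper; the project's real parser is not shown.
class NdjsonParser {
  private buffer = "";

  // Feed one decoded text chunk; returns every event completed by it.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last piece is either "" or an unfinished line: keep it for later.
    this.buffer = lines.pop() ?? "";

    const events: unknown[] = [];
    for (const line of lines) {
      if (line.trim() === "") continue;
      try {
        events.push(JSON.parse(line));
      } catch {
        // A malformed line becomes an error event instead of throwing,
        // so the turns that did arrive intact still render.
        events.push({ type: "error", message: "unparseable line" });
      }
    }
    return events;
  }
}

// Usage: a chunk boundary falls in the middle of the first event.
const parser = new NdjsonParser();
const firstBatch = parser.push('{"type":"turn","adv');
const secondBatch = parser.push('isor":"Skeptic"}\nnot json\n{"type":"done"}\n');
```

In the browser, the chunks would typically come from `response.body.getReader()`, decoded with a `TextDecoder` using `{ stream: true }` so that multi-byte characters split across chunks are handled correctly.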
Another challenge was reliability. Live model APIs can fail, model IDs can change, and some models behave differently than expected. We added fallback behavior and kept mock mode as the safe default so the demo could still work consistently.
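One way to read "fallback behavior" is a helper that tries several model IDs in order and drops to canned output if all of them fail. The sketch below assumes exactly that; `callWithFallback` and its parameters are hypothetical, and the real fallback logic may be structured differently.

```typescript
// Hypothetical helper: tries each model ID in order, then falls back to mock text.
type CallModel = (modelId: string, prompt: string) => Promise<string>;

interface FallbackResult {
  text: string;
  source: string; // the model ID that answered, or "mock"
}

async function callWithFallback(
  modelIds: string[],
  prompt: string,
  call: CallModel,
  mockReply: string,
): Promise<FallbackResult> {
  for (const modelId of modelIds) {
    try {
      const text = await call(modelId, prompt);
      return { text, source: modelId };
    } catch {
      // Unknown model ID, rate limit, or network error:
      // move on to the next candidate.
    }
  }
  // Every live option failed, so return canned output
  // rather than breaking the demo.
  return { text: mockReply, source: "mock" };
}
```

Recording the `source` alongside the text makes it possible to show in the UI whether a turn came from a live model or from the mock path.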
The last big challenge was presentation. The idea is abstract, so the demo had to communicate the value quickly. We spent time making the visual contrast clear: one fast baseline answer versus a slower but more accountable deliberation process.
Accomplishments that we're proud of
We are proud that Council feels like a complete product concept, not just a prompt chain. It has a clear problem, a focused use case, a working live pipeline, a polished interface, and a structured output that another agent could actually consume.
We are also proud of the streaming experience. Watching the advisors appear one by one makes the deliberation feel real and helps viewers understand that Council is not just generating a longer answer at the end.
Another accomplishment is the final briefing format. The output is not just text; it is structured judgment. It captures the recommendation, disagreement, risk, confidence, and next action in a way that could be passed back into an agent runtime.
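As a rough illustration, the briefing could be typed and validated like this. The field names, the three confidence levels, and the `shouldPause` rule are assumptions derived from the fields listed in this write-up, not the project's actual schema.

```typescript
// Field names are assumptions based on the fields the write-up lists.
interface CouncilBriefing {
  recommendation: string;
  reasoning: string;
  dissent: string;
  biggestRisk: string;
  confidence: "low" | "medium" | "high";
  askHuman: boolean;
}

const CONFIDENCE_LEVELS = ["low", "medium", "high"];
const TEXT_FIELDS = ["recommendation", "reasoning", "dissent", "biggestRisk"];

// Runtime check for JSON produced by a model, which may be malformed.
function isCouncilBriefing(value: unknown): value is CouncilBriefing {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const confidence = v["confidence"];
  return (
    TEXT_FIELDS.every((key) => typeof v[key] === "string") &&
    typeof confidence === "string" &&
    CONFIDENCE_LEVELS.indexOf(confidence) !== -1 &&
    typeof v["askHuman"] === "boolean"
  );
}

// Hypothetical policy: the calling agent pauses when the council says so
// or when confidence is low.
function shouldPause(briefing: CouncilBriefing): boolean {
  return briefing.askHuman || briefing.confidence === "low";
}
```

Validating before use matters here because the synthesizer's JSON comes from a model, so an agent runtime consuming it should not assume every field is present.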
Most importantly, the demo tells a simple story: agents should not silently guess at important assumptions. They should consult before acting.
What we learned
We learned that multi-agent systems are only useful when the agents have clearly different jobs. If every agent is just another general-purpose assistant, the output becomes repetitive. Giving each advisor a specific bias and role made the conversation much more valuable.
We also learned that showing disagreement is just as important as showing the final answer. The disagreement helps explain why the final recommendation is trustworthy and what tradeoffs shaped it.
On the technical side, we learned a lot about streaming AI responses into a frontend, building reliable fallback paths, and designing around live API uncertainty. We also learned how important demo safety is: a strong mock mode can be just as important as the live version during a hackathon.
What's next for Council
The next step would be turning Council from a demo into a tool that real agents can call. That could mean exposing it as a small API or SDK function where an agent sends an assumption and receives a structured judgment object.
We would also like to make the advisor panel configurable. Different tasks might need different councils: security reviewers for production code, product advisors for roadmap decisions, or operations-focused advisors for infrastructure changes.
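A configurable panel might be as simple as a lookup from task kind to advisor list. Everything in this sketch is hypothetical, since configurable panels are described only as future work: the task kinds, advisor names, and `panelFor` function are invented for illustration.

```typescript
// Entirely hypothetical: configurable panels are described only as future work.
interface AdvisorSpec {
  name: string;
  brief: string; // one-line statement of the advisor's bias
}

type TaskKind = "production-code" | "roadmap" | "infrastructure";

const DEFAULT_PANEL: AdvisorSpec[] = [
  { name: "Strategist", brief: "Optimizes for long-term direction." },
  { name: "Skeptic", brief: "Challenges unstated assumptions." },
  { name: "Operator", brief: "Focuses on what it takes to run this." },
  { name: "Psychologist", brief: "Considers what the human actually wants." },
];

const PANELS: Record<TaskKind, AdvisorSpec[]> = {
  "production-code": [
    { name: "Security Reviewer", brief: "Looks for ways this could be exploited." },
    { name: "Skeptic", brief: "Challenges unstated assumptions." },
  ],
  roadmap: [
    { name: "Product Advisor", brief: "Weighs user value against cost." },
    { name: "Strategist", brief: "Optimizes for long-term direction." },
  ],
  infrastructure: [
    { name: "Operator", brief: "Focuses on what it takes to run this." },
    { name: "Skeptic", brief: "Challenges unstated assumptions." },
  ],
};

// Unknown task kinds fall back to the general-purpose panel.
function panelFor(task: string): AdvisorSpec[] {
  return Object.prototype.hasOwnProperty.call(PANELS, task)
    ? PANELS[task as TaskKind]
    : DEFAULT_PANEL;
}
```

Keeping a default panel means an agent can still call Council for a task type nobody anticipated.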
Longer term, Council could become a reusable judgment layer for autonomous agents. Instead of every agent deciding when to guess, when to ask, and when to continue, Council could provide a structured way to slow down only at the moments that matter.
Built With
- clod
- css
- ndjson
- next.js
- node.js
- openai
- react
- tailwind
- typescript