Crucible

Inspiration

The Problem (Stateless RAG Amnesia): Traditional Retrieval-Augmented Generation (RAG) is stateless. Every query searches files, retrieves chunks, and answers from scratch. Because knowledge never compounds, the same context is expensively recompiled on every turn, and the conversation suffers from "amnesia" (context drift): the agent forgets early constraints as the chat scrollback grows.
The Solution (LLM Wiki & OKF): We were inspired by the LLM Wiki pattern (popularized by Andrej Karpathy), which shifts the paradigm from stateless search to stateful, compiled knowledge. Rather than querying raw documents, the LLM compiles and refines ideas into a persistent, human-readable wiki. We structured this using Google Cloud's recent Open Knowledge Format (OKF) specification, saving the active canvas state as a standardized Markdown file (idea_bubble.md) with clean YAML-like metadata. This acts as the shared, portable, and version-controlled database between the founder and the co-thinking agent.
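As a rough illustration of the shared Markdown state, the sketch below round-trips a canvas through a Markdown document with a small YAML-like header. The `Node` fields, the `## id (kind)` heading layout, and the metadata keys are assumptions made for this example; they are not the OKF specification or Crucible's actual idea_bubble.md schema.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str
    kind: str  # hypothetical kinds: "core_claim", "target_user", "assumption", "scope"
    text: str


def to_markdown(title: str, nodes: list[Node]) -> str:
    """Serialize canvas nodes into Markdown with a YAML-like header."""
    lines = ["---", f"title: {title}", f"node_count: {len(nodes)}", "---", ""]
    for node in nodes:
        lines.append(f"## {node.node_id} ({node.kind})")
        lines.append(node.text)
        lines.append("")
    return "\n".join(lines)


def from_markdown(doc: str) -> tuple[str, list[Node]]:
    """Parse the document back into a title and a list of nodes."""
    title = ""
    nodes: list[Node] = []
    in_meta = False
    current = None
    for line in doc.splitlines():
        if line.strip() == "---":
            in_meta = not in_meta  # toggles at the start and end of the header
            continue
        if in_meta:
            key, _, value = line.partition(":")
            if key.strip() == "title":
                title = value.strip()
        elif line.startswith("## "):
            node_id, _, kind = line[3:].partition(" (")
            current = Node(node_id.strip(), kind.rstrip(")"), "")
            nodes.append(current)
        elif current is not None and line.strip():
            # Body lines belong to the most recent node heading.
            current.text = (current.text + " " + line.strip()).strip()
    return title, nodes


canvas = [
    Node("n1", "core_claim", "Founders lose early constraints in long chats."),
    Node("n2", "assumption", "Zero server infrastructure."),
]
document = to_markdown("Demo idea", canvas)
restored_title, restored_nodes = from_markdown(document)
```

Because the same text can be read by a person, diffed in version control, and parsed back into nodes, a direct edit on the canvas and an edit made by the model can both flow through one file.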

Crucible is a stateful co-thinker that compiles, organizes, and pressure-tests early-stage startup ideas on a live reasoning canvas.

Compounding Assumption Capture: Instead of losing context, Crucible continuously refines the idea into a structured canvas. Each response compiles new parameters into Core Claim, Target User, Key Assumptions, and Scope cards.
Continuous Linting (Contradiction Detection): Crucible constantly audits the knowledge structure for design integrity. If the user introduces an assumption that contradicts an existing canvas node (e.g., claiming "zero server infrastructure" while requiring "real-time multi-user editing"), the system flags the conflict and draws an animated red tension line between the nodes.
Grounded Competitor Analysis: Crucible runs a Google Search grounding query to find live products in the same domain and compiles a comparative edge analysis.
Traceable Planning: The canvas compiles directly into a validation roadmap in which every task traces back (traces_to) to a specific canvas node ID.
Time-Bounded Verdicts: The system offers concrete next-move options (Go/No-Go decisions) but leaves the final action choice strictly to the human builder.
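Two of the features above lend themselves to a small data-structure sketch: flagged contradictions become edges between two canvas nodes, and roadmap tasks carry a traces_to reference that can be checked against the canvas. The `Tension` and `Task` classes, the edge dictionary layout, and the `"tension"` type string are hypothetical names chosen for illustration; deciding which assumptions actually conflict is left to the model and is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Tension:
    source: str  # node ID of the first conflicting assumption
    target: str  # node ID of the second conflicting assumption
    reason: str


@dataclass
class Task:
    title: str
    traces_to: str  # canvas node ID this task is meant to validate


def tension_edges(tensions: list[Tension], node_ids: set[str]) -> list[dict]:
    """Build edge records, skipping tensions that point at missing nodes."""
    edges = []
    for t in tensions:
        if t.source in node_ids and t.target in node_ids:
            edges.append({
                "id": f"tension-{t.source}-{t.target}",
                "source": t.source,
                "target": t.target,
                "type": "tension",  # assumed custom edge type name
                "data": {"reason": t.reason},
            })
    return edges


def untraceable_tasks(tasks: list[Task], node_ids: set[str]) -> list[Task]:
    """Return tasks whose traces_to does not match any canvas node."""
    return [task for task in tasks if task.traces_to not in node_ids]


node_ids = {"a1", "a2", "scope1"}
edges = tension_edges(
    [Tension("a1", "a2", "zero server infrastructure vs. real-time multi-user editing"),
     Tension("a1", "ghost", "points at a deleted node")],
    node_ids,
)
orphans = untraceable_tasks(
    [Task("Interview five founders", "a1"), Task("Price test", "missing")],
    node_ids,
)
```

Checking references before rendering keeps a stale node ID from producing a dangling edge or a roadmap step that tests nothing.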

Backend: Built with FastAPI (Python) and Uvicorn. We used SQLite to store session states, active markdown representations, and snapshot histories for rollbacks.
Frontend: Built with React, TypeScript, React Flow (for the interactive canvas), Tailwind CSS, and Framer Motion (for smooth, staggered screen transitions).
AI Core: Powered by the new Google GenAI SDK (google-genai) using the gemini-2.5-flash model. We utilized structured JSON outputs to return chat messages and canvas delta operations simultaneously, and used the native Google Search Grounding tool to query the live web.
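The snapshot history used for rollbacks can be sketched with Python's built-in sqlite3 module. The table name, column names, and the non-destructive rollback strategy below are assumptions for illustration, not the project's actual schema.

```python
import sqlite3
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical snapshots table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS snapshots (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT NOT NULL,
               markdown TEXT NOT NULL
           )"""
    )
    conn.commit()


def save_snapshot(conn: sqlite3.Connection, session_id: str, markdown: str) -> int:
    """Append the current Markdown state and return the new snapshot ID."""
    cur = conn.execute(
        "INSERT INTO snapshots (session_id, markdown) VALUES (?, ?)",
        (session_id, markdown),
    )
    conn.commit()
    return cur.lastrowid


def snapshot_before_latest(
    conn: sqlite3.Connection, session_id: str, steps: int = 1
) -> Optional[str]:
    """Return the Markdown saved `steps` snapshots before the latest one."""
    row = conn.execute(
        "SELECT markdown FROM snapshots WHERE session_id = ? "
        "ORDER BY id DESC LIMIT 1 OFFSET ?",
        (session_id, steps),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")  # in-memory database for the example
init_db(conn)
save_snapshot(conn, "s1", "# v1")
save_snapshot(conn, "s1", "# v2")
save_snapshot(conn, "other", "# unrelated session")
previous = snapshot_before_latest(conn, "s1")
too_far = snapshot_before_latest(conn, "s1", steps=5)
```

Here a rollback only reads an earlier row; saving the restored text as a new snapshot would keep the history append-only, which is one possible design among several.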

Structured Output Parsing: Getting the LLM to consistently return valid JSON containing both natural language chat and structured canvas delta operations (creates, updates, deletes) required careful schema definition and error handling.
Real-Time Canvas Synced to SSE Streams: We had to synchronize the React Flow canvas updates with the Server-Sent Events (SSE) streaming chat tokens without causing UI stuttering.
Gemini Quota Limits: Managing API rate limits on the free tier during rapid testing required us to write a robust backend fallback layer that catches errors and serves smart mock data to keep the interface functional.
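The first and third challenges can be illustrated together: validate the model's JSON reply, and fall back to placeholder data when the reply is malformed or the call failed. The `message` and `ops` keys, the three operation names, and the fallback text are assumed shapes for this sketch; the real schema and the actual API call are not shown.

```python
import json

ALLOWED_OPS = {"create", "update", "delete"}
FALLBACK_MESSAGE = "The model is unavailable right now; showing placeholder data."


def parse_reply(raw: str) -> dict:
    """Validate a JSON reply; return a safe fallback if anything is off."""
    try:
        data = json.loads(raw)
        message = data["message"]
        ops = data["ops"]
        if not isinstance(message, str) or not isinstance(ops, list):
            raise ValueError("unexpected field types")
        for op in ops:
            if not isinstance(op, dict) or op.get("op") not in ALLOWED_OPS:
                raise ValueError("unknown operation")
            if not isinstance(op.get("node_id"), str):
                raise ValueError("missing node_id")
        return {"message": message, "ops": ops}
    except (KeyError, TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError, so it lands here too.
        return {"message": FALLBACK_MESSAGE, "ops": []}


def apply_ops(canvas: dict, ops: list) -> dict:
    """Apply create/update/delete operations to a copy of the canvas."""
    updated = dict(canvas)
    for op in ops:
        node_id = op["node_id"]
        if op["op"] == "delete":
            updated.pop(node_id, None)
        else:
            # create and update both set the node text in this sketch
            updated[node_id] = op.get("text", updated.get(node_id, ""))
    return updated


good = parse_reply(
    '{"message": "Added an assumption.", '
    '"ops": [{"op": "create", "node_id": "a3", "text": "Users pay monthly."}, '
    '{"op": "delete", "node_id": "a1"}]}'
)
bad = parse_reply("not json at all")
canvas = apply_ops({"a1": "Old assumption", "a2": "Kept"}, good["ops"])
```

Returning the fallback with an empty operation list means a failed turn leaves the canvas untouched while the chat panel still has something to display.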

Visualizing Contradictions: Building the animated custom TensionEdge component that draws dashed red connection lines between conflicting ideas on the canvas.
Shared Markdown State: Successfully implementing a bi-directional sync where direct edits to canvas nodes serialize back into markdown, which the AI then reads on the next turn.
Premium UX/UI: Designing a dark-mode, developer-centric interface with glassmorphism, responsive panels, and smooth micro-animations.

Contextual Grounding: We learned how much contextual grounding matters. When you and the model work from the same visual, structured document, both sides operate on a shared understanding, and the quality of collaboration improves markedly. This alignment is critical to working effectively with an AI partner.

Multi-Agent Research: Introducing specialized research agents that spin up in the background to automatically scrape the web, read competitor APIs, and validate your assumptions.
Cross-Session Memory: Allowing the system to save multiple canvases, cross-reference them, and warn you if a new idea competes with a project you mapped out last week.
Interactive Node Tracing: Connecting the roadmap steps directly to the canvas so that clicking a step automatically pans and highlights the specific canvas node it tests.