Inspiration
Software projects rarely slow down because of code; they slow down because of miscommunication. In cross-functional meetings, teams often speak different languages:
Designers describe user intent and aesthetics.
Engineers hear feasibility constraints and technical debt.
Product Managers hear scope risks and timeline implications.
Traditional meeting notes capture what was said, but not what was meant. I built SpecBridge to bridge these gaps in real time. My goal was a tool that doesn't just transcribe, but translates vague requirements into explicit constraints, options, and decision-forcing questions.
What it does
SpecBridge is a real-time meeting assistant that listens to voice or text input and instantly converts it into structured, role-specific specifications.
Live Alignment Engine: Takes raw text or audio as input and outputs structured "Alignment Cards" as the conversation progresses.
Role Translation: Automatically generates distinct one-sentence summaries and acceptance criteria tailored specifically for Designers, Engineers, and PMs.
Conflict Detection: Analyzes input intent to flag requirements as Aligned, Potential Conflict, or Conflict Detected.
Engineer Deep Mode: A specialized feature that translates requirements between engineering sub-roles.
- Backend Mode: Generates API contracts and database schemas.
- AI Mode: Focuses on evaluation sets, latency budgets, and safety guardrails.
- Frontend Mode: Outputs component hierarchy and state logic.
Meeting Output & PDF Reports: Automatically compiles the entire session into a structured "Meeting Output Summary" tab. Users can download a PDF report containing role-based rollups, decision logs, and action items.
Smart Language & Visuals:
- Auto-Language Detection: Seamlessly handles mixed Korean/English input without manual switching.
- Context-Aware Visuals: The "AI Visual Canvas" adapts its output based on the active role (e.g., showing architecture blocks for Engineers vs. UI flows for Designers).
Negotiation Copilot: Converts subjective objections into clear trade-offs and alternatives.
How we built it
SpecBridge is built on Next.js 14 (App Router) and uses Gemini 3 Flash (gemini-3-flash-preview) as its core reasoning engine, constrained to schema-bound output so that responses stay predictable.
The core logic flow can be described as a function transforming unstructured multimodal input into structured, schema-conformant JSON: f(Audio + Context) -> [Gemini 3] -> JSON matching the schema
To achieve reliable real-time updates, I focused on three technical pillars:
Strict JSON Enforcement: I set responseMimeType="application/json" and paired it with a strict JSON schema. This avoids free-form conversational output and constrains the model to return rigid, UI-ready structures directly from raw audio transcripts.
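Even with a response schema in place, it can help to validate the parsed payload before it reaches the React tree. The following is a minimal sketch of such a guard, not the project's actual code: the AlignmentCard shape, its field names, and parseAlignmentCard are all hypothetical, chosen only to mirror the roles and status labels described above.

```typescript
// Hypothetical Alignment Card shape; the field names are assumptions for illustration.
type Role = "designer" | "engineer" | "pm";
type AlignmentStatus = "Aligned" | "Potential Conflict" | "Conflict Detected";

interface RoleView {
  summary: string;
  acceptanceCriteria: string[];
}

interface AlignmentCard {
  status: AlignmentStatus;
  roles: Record<Role, RoleView>;
}

const ROLES: Role[] = ["designer", "engineer", "pm"];
const STATUSES: string[] = ["Aligned", "Potential Conflict", "Conflict Detected"];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isRoleView(value: unknown): value is RoleView {
  if (!isRecord(value)) return false;
  const summary = value["summary"];
  const criteria = value["acceptanceCriteria"];
  return (
    typeof summary === "string" &&
    Array.isArray(criteria) &&
    criteria.every((item: unknown) => typeof item === "string")
  );
}

// Returns a typed card, or null when the model output is not valid JSON
// or does not match the expected shape.
function parseAlignmentCard(raw: string): AlignmentCard | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;
  const status = data["status"];
  if (typeof status !== "string" || !STATUSES.includes(status)) return null;
  const roles = data["roles"];
  if (!isRecord(roles)) return null;
  if (!ROLES.every((role) => isRoleView(roles[role]))) return null;
  return data as unknown as AlignmentCard;
}
```

Returning null instead of throwing lets the UI keep showing the previous card when a single response is malformed.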
Stateless Analysis Route: Data is processed in the /api/analyze route. The integration handles Multimodal Translation (converting ambiguous requirements into distinct mental models) and Conflict Resolution (detecting ambiguity and generating alignment scores).
Quasi-Real-Time Loop: Instead of complex WebSockets, I implemented a sliding window approach with backpressure management (queue size 1). This ensures a "Live Alignment" experience where the UI updates continuously without hitting rate limits or losing context.
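The sliding window with a queue of size 1 can be sketched roughly as follows. This is an illustration under assumptions rather than the code SpecBridge ships: LatestOnlyRunner and slideWindow are hypothetical names, and the worker stands in for whatever call posts the current window to /api/analyze.

```typescript
// Hypothetical helper: at most one request in flight, at most one pending window.
class LatestOnlyRunner<TInput, TResult> {
  private running = false;
  private pending: { value: TInput } | null = null;
  private readonly worker: (input: TInput) => Promise<TResult>;
  private readonly onResult: (result: TResult) => void;

  constructor(
    worker: (input: TInput) => Promise<TResult>,
    onResult: (result: TResult) => void,
  ) {
    this.worker = worker;
    this.onResult = onResult;
  }

  // Submit a new window. While a request is in flight, a newer window
  // overwrites any older pending one, so the queue never grows past 1.
  submit(input: TInput): void {
    this.pending = { value: input };
    if (!this.running) void this.drain();
  }

  private async drain(): Promise<void> {
    this.running = true;
    try {
      while (this.pending !== null) {
        const next = this.pending.value;
        this.pending = null;
        try {
          this.onResult(await this.worker(next));
        } catch {
          // Drop the failed window; any newer pending window is still processed.
        }
      }
    } finally {
      this.running = false;
    }
  }
}

// Keeps only the last `size` transcript chunks as the analysis window.
function slideWindow(chunks: string[], size: number): string[] {
  return size > 0 ? chunks.slice(-size) : [];
}
```

With this shape, a burst of transcript updates results in one request for the first window and one for the most recent window, while the intermediate windows are skipped. That is the trade-off that keeps the request rate bounded.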
Challenges we ran into
Balancing Creativity vs. Structure: I needed Gemini to be creative enough to detect subtle interpersonal conflicts but rigid enough to return a JSON object that wouldn't break the React frontend. Iterating on the schema and its validation was critical.
Context Switching in Deep Mode: Ensuring the model correctly pivoted from "Frontend" concerns (CSS tokens) to "Backend" concerns (DB Migrations) based on a single variable required precise prompt engineering within the system instruction.
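One way to drive that pivot from a single variable is to derive the system instruction from the mode. The sketch below is an assumption about how this could look, not the actual prompt: the focus areas come from the feature list above, while the instruction wording, EngineerMode, MODE_FOCUS, and buildSystemInstruction are hypothetical.

```typescript
type EngineerMode = "backend" | "ai" | "frontend";

// Focus areas per mode, taken from the Engineer Deep Mode feature list.
const MODE_FOCUS: Record<EngineerMode, string[]> = {
  backend: ["API contracts", "database schemas"],
  ai: ["evaluation sets", "latency budgets", "safety guardrails"],
  frontend: ["component hierarchy", "state logic"],
};

// Builds a system instruction whose scope depends only on the `mode` variable.
// The sentences are illustrative placeholders, not the project's real prompt.
function buildSystemInstruction(mode: EngineerMode): string {
  const focus = MODE_FOCUS[mode].join(", ");
  return [
    "You translate meeting requirements into engineering specifications.",
    `Active engineer mode: ${mode}.`,
    `Cover only these concerns: ${focus}.`,
    "Respond with JSON that matches the provided schema.",
  ].join("\n");
}
```

Keeping the per-mode concerns in one lookup table means that adding a new sub-role is a data change rather than a prompt rewrite.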
Accomplishments that we're proud of
I am particularly proud of Engineer Deep Mode. Seeing the AI context-switch its technical outputs from the same verbal input, generating specific API contracts for a backend engineer and then switching to evaluation sets for an AI engineer, was a major milestone. It turned a generic meeting summarizer into a domain-specific technical assistant.
What we learned
I learned that Gemini 3 Flash can adhere to complex schemas without sacrificing reasoning speed. I also discovered that "misalignment" in meetings often follows predictable patterns, which allowed me to categorize conflicts (for example, Scope vs. Feasibility) algorithmically.
What's next for SpecBridge
Full Streaming Support: Moving from the current sliding window approach to full WebSocket-based streaming for lower latency.
Visual Generative Canvas: Implementing the "AI Visual Canvas" fully to generate architectural diagrams alongside text specs.
Integration: Building plugins for Google Meet and Slack to bring SpecBridge directly into the workflow.