Inspiration

Software projects rarely slow down because of code; they slow down because of miscommunication. In cross-functional meetings, teams often speak different languages:

  • Designers describe user intent and aesthetics.

  • Engineers hear feasibility constraints and technical debt.

  • Product Managers hear scope risks and timeline implications.

Traditional meeting notes capture what was said, but not what was meant. I built SpecBridge to bridge these gaps in real time. The goal was a tool that doesn't just transcribe, but translates vague requirements into explicit constraints, options, and decision-forcing questions.

What it does

SpecBridge is a real-time meeting assistant that listens to voice or text input and instantly converts it into structured, role-specific specifications.

  • Live Alignment Engine: Takes raw text or audio as input and immediately outputs structured "Alignment Cards."

  • Role Translation: Automatically generates distinct one-sentence summaries and acceptance criteria tailored specifically for Designers, Engineers, and PMs.

  • Conflict Detection: Analyzes input intent to flag requirements as Aligned, Potential Conflict, or Conflict Detected.

  • Engineer Deep Mode: A specialized feature that translates requirements between engineering sub-roles.

    • Backend Mode: Generates API contracts and database schemas.
    • AI Mode: Focuses on evaluation sets, latency budgets, and safety guardrails.
    • Frontend Mode: Outputs component hierarchy and state logic.
  • Meeting Output & PDF Reports: Automatically compiles the entire session into a structured "Meeting Output Summary" tab. Users can download a PDF report containing role-based rollups, decision logs, and action items.

  • Smart Language & Visuals:

    • Auto-Language Detection: Seamlessly handles mixed Korean/English input without manual switching.
    • Context-Aware Visuals: The "AI Visual Canvas" adapts its output based on the active role (e.g., showing architecture blocks for Engineers vs. UI flows for Designers).
  • Negotiation Copilot: Converts subjective objections into clear trade-offs and alternatives.
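
To make the features above more concrete, here is a minimal sketch of what an "Alignment Card" could look like as a data structure. The type names, field names, and the example requirement are all assumptions made for illustration; only the three roles and the three status labels come from the description above.

```typescript
// Hypothetical shape of an "Alignment Card". All names here are illustrative.
type Role = "designer" | "engineer" | "pm";
type AlignmentStatus = "aligned" | "potential_conflict" | "conflict_detected";

interface RoleView {
  summary: string;              // one-sentence summary written for this role
  acceptanceCriteria: string[]; // acceptance criteria tailored to this role
}

interface AlignmentCard {
  requirement: string;          // the raw statement captured in the meeting
  status: AlignmentStatus;      // result of conflict detection
  views: Record<Role, RoleView>;
}

// Display labels matching the three states named in the feature list.
const STATUS_LABELS: Record<AlignmentStatus, string> = {
  aligned: "Aligned",
  potential_conflict: "Potential Conflict",
  conflict_detected: "Conflict Detected",
};

// Build the one-line header a given role would see for a card.
function headline(card: AlignmentCard, role: Role): string {
  return `[${STATUS_LABELS[card.status]}] ${card.views[role].summary}`;
}

// Made-up example data, not output from the real system.
const exampleCard: AlignmentCard = {
  requirement: "Checkout should feel instant.",
  status: "potential_conflict",
  views: {
    designer: {
      summary: "Checkout must show no visible loading state.",
      acceptanceCriteria: ["No spinner appears during checkout."],
    },
    engineer: {
      summary: "Sub-second checkout needs a cached pricing endpoint.",
      acceptanceCriteria: ["Pricing responses are served from cache."],
    },
    pm: {
      summary: "Instant checkout may add scope to this release.",
      acceptanceCriteria: ["Scope impact is reviewed before commitment."],
    },
  },
};
```

Keeping one card per requirement, with a view per role, means the same object can drive the Designer, Engineer, and PM panels without re-querying the model.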

How we built it

SpecBridge is built on Next.js 14 (App Router) and uses Gemini 3 Flash (gemini-3-flash-preview) as its core reasoning engine, with output constrained to a fixed JSON schema so that responses are predictable in structure.

The core logic flow can be described as a function transforming unstructured multimodal input into structured schemas: f(Audio + Context) -> [Gemini 3] -> JSON_Schema
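
The flow above can be sketched as a typed function. This is a rough illustration, not the project's actual code: `analyze`, `parseResult`, `ModelCall`, and the `status`/`summary` fields are hypothetical names, and the model call is injected as a plain function so the sketch does not depend on any particular SDK.

```typescript
// Hypothetical result shape; the real schema is not shown in this write-up.
type AlignmentStatus = "aligned" | "potential_conflict" | "conflict_detected";

interface AnalysisResult {
  status: AlignmentStatus;
  summary: string;
}

// Any function that sends a prompt to a model and resolves with its raw text reply.
type ModelCall = (prompt: string) => Promise<string>;

const STATUSES: readonly string[] = [
  "aligned",
  "potential_conflict",
  "conflict_detected",
];

// Parse the model's reply and reject anything that does not match the expected shape.
function parseResult(raw: string): AnalysisResult {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Model reply is not a JSON object");
  }
  const { status, summary } = data as Record<string, unknown>;
  if (typeof status !== "string" || !STATUSES.includes(status)) {
    throw new Error("Model reply has an unknown status");
  }
  if (typeof summary !== "string") {
    throw new Error("Model reply is missing a summary");
  }
  return { status: status as AlignmentStatus, summary };
}

// f(transcript + context) -> model -> validated JSON
async function analyze(
  transcript: string,
  context: string,
  callModel: ModelCall,
): Promise<AnalysisResult> {
  const prompt = `Context:\n${context}\n\nTranscript:\n${transcript}`;
  return parseResult(await callModel(prompt));
}
```

Validating on the way out, even when the model is asked for strict JSON, keeps a malformed reply from reaching the UI.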

To achieve reliable real-time updates, I focused on three technical pillars:

  1. Strict JSON Enforcement: I set responseMimeType="application/json" and supplied a strict JSON schema. This removes free-form conversational output, forcing the model to return rigid, UI-ready structures directly from raw audio transcripts.

  2. Stateless Analysis Route: Each request is processed independently in the /api/analyze route. The integration handles Multimodal Translation (converting ambiguous requirements into distinct mental models) and Conflict Resolution (detecting ambiguity and generating alignment scores).

  3. Quasi-Real-Time Loop: Instead of complex WebSockets, I implemented a sliding window approach with backpressure management (queue size 1). This ensures a "Live Alignment" experience where the UI updates continuously without hitting rate limits or losing context.
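
The third pillar can be illustrated with a small sketch. This is one possible reading of "sliding window with queue size 1", not the project's actual implementation: `slidingWindow` and `LatestOnlyQueue` are hypothetical names. The assumption is that only one request is in flight at a time, and that a newer window replaces any window still waiting.

```typescript
// Join the most recent `size` transcript chunks into one analysis window.
function slidingWindow(chunks: string[], size: number): string {
  return chunks.slice(-Math.max(1, size)).join(" ");
}

// Runs at most one task at a time and keeps at most one pending input.
class LatestOnlyQueue<T> {
  private readonly worker: (input: T) => Promise<void>;
  private pending: T | undefined = undefined;
  private hasPending = false;
  private running = false;

  constructor(worker: (input: T) => Promise<void>) {
    this.worker = worker;
  }

  // Submit a new window. It replaces any window that is still waiting.
  push(input: T): void {
    this.pending = input;
    this.hasPending = true;
    if (!this.running) {
      void this.drain();
    }
  }

  // Process windows one by one until nothing is waiting.
  private async drain(): Promise<void> {
    this.running = true;
    try {
      while (this.hasPending) {
        const next = this.pending as T;
        this.pending = undefined;
        this.hasPending = false;
        try {
          await this.worker(next);
        } catch (err) {
          // A failed window is dropped; the next push will try again.
          console.error("analysis failed", err);
        }
      }
    } finally {
      this.running = false;
    }
  }
}
```

In this sketch, a caller would push `slidingWindow(chunks, n)` each time a new transcript chunk arrives. Because stale windows are overwritten rather than queued, the request rate is bounded by how fast the worker finishes, not by how fast the speaker talks.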

Challenges we ran into

  • Balancing Creativity vs. Structure: I needed Gemini to be creative enough to detect subtle interpersonal conflicts but rigid enough to return a JSON object that wouldn't break the React frontend. Tuning the schema validation was critical.

  • Context Switching in Deep Mode: Ensuring the model correctly pivoted from "Frontend" concerns (CSS tokens) to "Backend" concerns (DB Migrations) based on a single variable required precise prompt engineering within the system instruction.
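
As a rough illustration of switching on a single variable, the sketch below builds a system instruction from a mode value. The function name `buildSystemInstruction`, the `DeepMode` type, and the prompt wording are assumptions; only the three modes and their focus areas come from the feature description.

```typescript
// The three engineering sub-roles offered by Engineer Deep Mode.
type DeepMode = "backend" | "ai" | "frontend";

// Focus areas per mode, taken from the feature list.
const MODE_FOCUS: Record<DeepMode, string> = {
  backend: "API contracts and database schemas",
  ai: "evaluation sets, latency budgets, and safety guardrails",
  frontend: "component hierarchy and state logic",
};

// Build a system instruction for the given mode (hypothetical wording).
function buildSystemInstruction(mode: DeepMode): string {
  return [
    "You translate meeting requirements into engineering specifications.",
    `Active engineering sub-role: ${mode}.`,
    `Focus only on: ${MODE_FOCUS[mode]}.`,
    "Respond with JSON that matches the provided schema.",
  ].join("\n");
}
```

Keeping the mode-specific text in one lookup table makes it harder for concerns from one sub-role to leak into another when the prompt is edited.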

Accomplishments that we're proud of

I am particularly proud of Engineer Deep Mode. Seeing the AI accurately context-switch technical outputs from the same verbal input, generating specific API contracts for a backend engineer and then evaluation sets for an AI engineer, was a major milestone. It turned a generic meeting summarizer into a domain-specific technical architect.

What we learned

I learned that Gemini 3 Flash can adhere to complex schemas without sacrificing response speed. I also discovered that "misalignment" in meetings often follows predictable patterns, which allowed me to categorize conflicts (Scope vs. Feasibility) algorithmically.

What's next for SpecBridge

  • Full Streaming Support: Moving from the current sliding window approach to full WebSocket-based streaming for lower latency.

  • Visual Generative Canvas: Implementing the "AI Visual Canvas" fully to generate architectural diagrams alongside text specs.

  • Integration: Building plugins for Google Meet and Slack to bring SpecBridge directly into the workflow.
