Inspiration

Creative momentum is fragile. Traditional AI interfaces introduce a "stop-and-start" friction that kills the rhythm of a brainstorm. I wanted to build a system that functions as a kinetic architectural partner - a tool where the interaction logic mirrors the speed of a physical rally, keeping the designer in a state of continuous flow.

What it does

Ping Pong Brainstorm is a voice-driven system for lateral thinking.

Continuous UX: A hands-free interface that eliminates the "click-to-submit" barrier.

Modulated Logic: A dual-state engine optimized for either single-word associations (Blip) or short, evocative conceptual frames (Flow).

Traceable Outputs: Every session generates a structured transcript, converting ephemeral speech into a persistent foundation for further building.

How I built it

The stack is centered on Gemini 2.5 Flash-Lite for its sub-second response latency.

Interface: Developed a tactile, neumorphic UI to ground the digital experience.

Audio Pipeline: Orchestrated a continuous listen-and-respond loop using the Web Speech API for recognition and browser-based TTS for low-latency vocal returns.

Prompt Architecture: Utilized XML-delimited few-shot logic to isolate instructions from examples, ensuring the model adheres to strict "45-degree" lateral association rules.
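The XML-delimited layout described above can be sketched roughly as follows. This is an illustrative reconstruction, not the project's actual prompt: the tag names, the `buildPrompt` helper, and the example word pairs are all assumptions.

```javascript
// Illustrative few-shot pairs; the real project's examples are not published.
const FEW_SHOT = [
  { user: "bridge", reply: "handshake" },
  { user: "river", reply: "deadline" },
];

// Hypothetical prompt builder. Instructions and examples live in separate
// XML blocks so the model cannot confuse few-shot content with the rules.
function buildPrompt(mode, userWord) {
  const rules =
    mode === "blip"
      ? 'Reply with exactly one word. It must be a lateral ("45-degree") association, never a synonym.'
      : "Reply with a short, evocative phrase that reframes the idea laterally.";

  const examples = FEW_SHOT.map(
    (ex) => `  <example><user>${ex.user}</user><reply>${ex.reply}</reply></example>`
  ).join("\n");

  return [
    "<instructions>",
    `  ${rules}`,
    "</instructions>",
    "<examples>",
    examples,
    "</examples>",
    `<input>${userWord}</input>`,
  ].join("\n");
}
```

Keeping the rules and the examples in distinct delimited blocks is what lets a small model treat the examples as data rather than as further instructions.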

Challenges we ran into

The primary hurdle was managing the asynchronous audio handshake. The recognizer cannot safely stay open while the synthesizer speaks, or it transcribes the system's own voice and spirals into a feedback loop. I had to engineer a 150ms state-switching buffer to bridge the "Mouth-to-Ear" transition. Another challenge was the "Literal Trap": small models default to synonyms. I broke this by applying a frequency penalty and high-temperature lateral prompting to force the AI away from the user's original word.
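The handshake can be modeled as a small state machine. This is a minimal sketch under stated assumptions: the state names, the `createRally` factory, and the injectable scheduler are hypothetical, and only the 150ms buffer value comes from the writeup.

```javascript
// Assumed buffer between the end of TTS and reopening the microphone.
const BUFFER_MS = 150;

// One full "rally" cycle of the conversation loop.
const NEXT = {
  LISTENING: "THINKING",  // final transcript received, call the model
  THINKING: "SPEAKING",   // model reply arrived, start TTS playback
  SPEAKING: "BUFFERING",  // TTS finished; wait out the 150ms buffer
  BUFFERING: "LISTENING", // buffer elapsed, reopen the microphone
};

// schedule is injectable so the machine can be driven without real timers.
function createRally(schedule = setTimeout) {
  let state = "LISTENING";
  return {
    get state() { return state; },
    // Advance one step; the SPEAKING -> BUFFERING edge schedules the
    // automatic hop back to LISTENING after BUFFER_MS.
    advance() {
      state = NEXT[state];
      if (state === "BUFFERING") {
        schedule(() => { state = NEXT.BUFFERING; }, BUFFER_MS);
      }
      return state;
    },
  };
}
```

The key design point is that the microphone is never reopened directly from the TTS "end" event; the buffer state sits in between so the recognizer cannot catch the tail of the synthesized audio.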

Accomplishments that we're proud of

The successful execution of a Zero-Click interaction loop. The transition from thought to synthesized response happens without a single manual trigger. This achieves a level of technical integrity where the system fades into the background, leaving only the creative rally.

What we learned

Latency is the primary antagonist of creative flow. Optimizing for speed meant learning to prune history and move toward the Flash-Lite model, proving that a lean, fast agent is often more effective for kinetic UX than a larger, slower model.
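History pruning of the kind mentioned above can be as simple as a sliding window over the transcript. A hedged sketch follows; the `pruneHistory` helper and the window size of 6 turns are assumptions, not the project's tuned setting.

```javascript
// Assumed window size: keep only the most recent exchanges so the prompt
// stays small and the model stays fast.
const MAX_TURNS = 6;

// Each turn is { role, text }; drop the oldest turns beyond the window.
function pruneHistory(turns, maxTurns = MAX_TURNS) {
  return turns.length <= maxTurns ? turns : turns.slice(-maxTurns);
}
```

For a word-association rally this is a cheap win: only the last few turns matter for avoiding repeats, so older context adds latency without adding value.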

What's next for Ping-Pong Brainstorm

The roadmap involves moving into Multimodal Spatial UX. I plan to integrate Gemini 3’s vision capabilities so the system can "see" physical sketches and spatial gestures. This would evolve the tool from a verbal partner into a holistic design architect that responds to both sight and sound.

Built With

  • Core Model: Gemini 2.5 Flash-Lite, selected for its sub-second response latency to maintain the high-speed "rally" rhythm.
  • Prototyping Environment: Antigravity, used for rapid agentic "vibe-coding" to iterate on spatial and kinetic interaction logic.
  • Voice Architecture: Web Speech API for real-time transcription, coupled with Gemini TTS for human-grade, emotive vocal returns.
  • Interaction Logic: XML-delimited prompting, used to strictly isolate system instructions from few-shot examples, preventing "instruction contamination" and ensuring adherence to the Blip/Flow constraints.
  • Frontend UI: HTML5/CSS3/JavaScript, featuring a "cozy modern" neumorphic design system with a tactile, almost-white aesthetic and custom typewriter rendering for incoming responses.
  • State Management: Custom JavaScript engine to handle the 150ms asynchronous handshake between microphone and speaker states, enabling a zero-click, hands-free interaction loop.