Inspiration

Standard productivity playlists are static and blind to the emotional weight of a story. RadioFlow was born from the idea that prose should lead a living, reactive ensemble, with music that adapts to your story, draws you in, and keeps you in flow.

What It Does

Using Google Gemini 3 Flash, RadioFlow analyzes your narrative in real time and generates a 16-step "Musical DNA," modulating rhythm, harmony, and instrumentation. The result: functional, adaptive music that evolves with your story, keeping you fully immersed and creatively engaged.

✍️ Distraction-Free Workspace: A minimalist editor designed for deep, uninterrupted immersion in your writing.

🎺 Generative Ensemble: A live-synthesized soundscape featuring soulful trumpet, bowed contrabass, and electric keys that "improvise" based on your story.

🧠 Semantic AI Sync: Powered by Gemini 3 Flash, the system tracks narrative intensity, tension, and valence to dynamically adjust the musical expression.

🔁 The Flow Loop: Your writing shapes the music, and the music—enhanced with Alpha-wave Binaural Beats—sustains your focus.

How I Built It

RadioFlow is built as a Generative AI Orchestrator. The core innovation is the bridge between Gemini 3’s semantic understanding and the Tone.js browser synthesis engine.

The application runs on a high-performance, low-latency loop:

Input: The React editor monitors user interaction.

Analysis: A Web Worker processes the text to avoid UI lag, triggering Gemini 3 Flash via Supabase Edge Functions.

Context & Directives: Gemini 3 Flash performs real-time semantic analysis by comparing current prose against historical narrative snapshots and the live audio engine state. It generates a structured 'Musical DNA' (JSON schema) that directs the audio engine to modulate rhythms, harmonies, and instrument solos. By seeing the current audio state, Gemini 3 evolves the music naturally instead of regenerating it, preserving momentum and avoiding abrupt resets.

  • Rhythmic Patterns: It generates 16-step binary sequences (0s and 1s) for the drums (Kick, Snare, Hi-Hat, Rim).

  • Melodic Instructions: It generates 16-step sequences of scale degrees (numbers representing notes) for the Bass, Trumpet, Guitar, and Lead instruments.

  • Harmonic Context: It decides the BPM (tempo) and the Musical Mode (e.g., "Dorian") based on the "Emotional Trajectory" of your story.

Synthesis: The AudioService (a Tone.js singleton) receives the Musical DNA and performs voice-leading, timbre-morphing, and sequence updates in real time:

  • No Pre-recorded Loops: The engine uses Tone.Sequence to step through the DNA's 16 steps. If step 4 has a "1" for the kick drum, the engine triggers a kick sample at exactly that point in the bar.

  • Dynamic Mapping: If the AI says a melody note is "degree 3" and the mode is "Minor," the engine calculates the frequency for a Minor 3rd. If the AI shifts the mode to "Lydian," that same "degree 3" instantly becomes a Major 3rd.

  • Human-like Feel: The engine adds "Jitter" (micro-timing variations) and dynamic velocity (volume changes) so it doesn't sound like a robotic metronome.
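The synthesis steps above can be sketched in isolation. The following is a minimal sketch of the mapping logic, with the Tone.js calls omitted so the pure functions stand alone; `MODES`, `degreeToMidi`, and `humanize` are illustrative names, not the actual AudioService internals:

```typescript
// Interval tables (semitones above the root) for the modes the AI can pick.
const MODES: Record<string, number[]> = {
  MAJOR:    [0, 2, 4, 5, 7, 9, 11],
  MINOR:    [0, 2, 3, 5, 7, 8, 10],
  DORIAN:   [0, 2, 3, 5, 7, 9, 10],
  PHRYGIAN: [0, 1, 3, 5, 7, 8, 10],
  LYDIAN:   [0, 2, 4, 6, 7, 9, 11],
};

// Map a 1-based scale degree (0 = rest) onto a MIDI note in the given mode.
// Degree 3 yields a minor third in MINOR but a major third in LYDIAN.
function degreeToMidi(degree: number, mode: string, rootMidi = 48): number | null {
  if (degree === 0) return null; // rest
  const intervals = MODES[mode];
  const octave = Math.floor((degree - 1) / intervals.length);
  return rootMidi + 12 * octave + intervals[(degree - 1) % intervals.length];
}

// "Human-like feel": micro-timing jitter (seconds) plus an accent pattern,
// so the sequence never lands with metronomic precision or flat dynamics.
function humanize(time: number, step: number): { time: number; velocity: number } {
  const jitter = (Math.random() - 0.5) * 0.01; // up to ±5 ms
  const velocity = step % 4 === 0 ? 0.9 : 0.6 + Math.random() * 0.2;
  return { time: time + jitter, velocity };
}

// One pass over a 16-step binary drum pattern: which steps fire?
const kick = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0];
const firingSteps = kick.flatMap((hit, i) => (hit === 1 ? [i] : []));
```

In the real app, a Tone.Sequence callback would call these helpers on each 16th-note tick and forward the result to the relevant synth voice.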

Output Schema

{
  "intensity": 0-1.0,
  "complexity": 0-1.0,
  "evolutionPath": "Two words describing the shift (e.g., 'Tension Rising', 'Glassy Dissolve')",
  "musicalAction": "SUSTAIN|BUILD|BREAK_DOWN|DROP|MODULATE|FLIP|DISSOLVE",
  "rationale": "One sentence explaining why this musical action fits the narrative arc.",
  "engineState": { "bpm": 60-100, "scaleMode": "DORIAN|PHRYGIAN|LYDIAN|MINOR|MAJOR" },
  "patternDNA": {
    "commentary": "Brief conductor note.",
    "kick": [16 binary steps],
    "snare": [16 binary steps],
    "hihat": [16 binary steps],
    "openhihat": [16 binary steps],
    "rim": [16 binary steps],
    "tom": [16 binary steps],
    "bass": [16 scale degrees, 0=rest],
    "contrabass": [16 scale degrees, 0=rest],
    "guitar": [16 scale degrees, 0=rest],
    "lead": [16 scale degrees, 0=rest],
    "trumpet": [16 scale degrees, 0=rest],
    "vibes": [16 scale degrees, 0=rest],
    "atmos": [16 chordal indices, 0=rest]
  }
}
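Because the model's JSON must drive a live audio engine, it helps to validate the payload shape before it reaches the synth. The guard below is my own sketch; the field names come from the schema above, while `validateDNA` and its error strings are illustrative:

```typescript
// Track groups from the Musical DNA schema.
const DRUMS = ["kick", "snare", "hihat", "openhihat", "rim", "tom"];
const MELODIC = ["bass", "contrabass", "guitar", "lead", "trumpet", "vibes", "atmos"];
const MODE_NAMES = ["DORIAN", "PHRYGIAN", "LYDIAN", "MINOR", "MAJOR"];

// Return a list of problems; an empty list means the DNA is safe to apply.
function validateDNA(dna: any): string[] {
  const errors: string[] = [];
  if (!(dna.engineState?.bpm >= 60 && dna.engineState?.bpm <= 100))
    errors.push("bpm out of range 60-100");
  if (!MODE_NAMES.includes(dna.engineState?.scaleMode))
    errors.push("unknown scaleMode");
  for (const track of DRUMS) {
    const steps = dna.patternDNA?.[track];
    if (!Array.isArray(steps) || steps.length !== 16 ||
        steps.some((s: number) => s !== 0 && s !== 1))
      errors.push(`${track}: expected 16 binary steps`);
  }
  for (const track of MELODIC) {
    const steps = dna.patternDNA?.[track];
    if (!Array.isArray(steps) || steps.length !== 16 ||
        steps.some((s: number) => !Number.isInteger(s) || s < 0))
      errors.push(`${track}: expected 16 non-negative scale degrees`);
  }
  return errors;
}
```

A failed check can simply keep the previous Musical DNA playing, so a malformed response never interrupts the music.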

Tech Stack

Intelligence: Google Gemini 3 (Flash) via Supabase Edge Functions.

Audio Engine: Tone.js (Web Audio API).

Infrastructure: Supabase (Auth, PostgreSQL, Edge Functions).

Frontend: React 18, TypeScript, Tailwind CSS (v4).

Challenges I Faced

The Main-Thread Bottleneck: Real-time synthesis and AI calls can be CPU-intensive. I solved this by offloading text processing to Web Workers and using a "staggered" update system for the audio nodes.

The Semantic-to-Music Gap: Translating abstract emotions (like "melancholy" or "rising tension") into specific musical modes (Dorian vs. Phrygian) required careful Tone.js tuning and a robust mapping system.
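One way to bridge that gap is an explicit projection from the analyzer's tension and valence estimates onto a mode and tempo. The thresholds below are a hedged illustration of that mapping style, not RadioFlow's actual tuning:

```typescript
// Analyzer output, both values normalized to 0..1.
type Emotion = { tension: number; valence: number };

// Pick a mode: high tension reads dark/urgent, high valence reads bright.
function chooseMode({ tension, valence }: Emotion): string {
  if (tension > 0.7) return "PHRYGIAN";                          // dark, urgent
  if (valence > 0.6) return tension > 0.4 ? "LYDIAN" : "MAJOR";  // bright
  return tension > 0.4 ? "DORIAN" : "MINOR";                     // reflective
}

// Map tension linearly onto the 60-100 BPM range from the schema.
function chooseBpm({ tension }: Emotion): number {
  return Math.round(60 + 40 * tension);
}
```

Keeping this projection explicit (rather than buried in the prompt) made it far easier to audition each mode against sample passages and tune the boundaries by ear.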

Accomplishments & Learnings

In-Browser High Fidelity: I successfully built a stable, low-latency closed-loop feedback system that achieves a professional, cinematic sound.

Semantic Orchestration: Gained deep experience in using Large Language Models not just for text generation, but as sophisticated creative decision-makers for complex systems.

Emergent Musical Continuity: Discovered Gemini 3’s musical awareness. Given the current audio state, it evolves the music naturally, producing transitions that feel intentional and coherent.

Next Steps

Ensemble Expansion: Moving beyond the current sounds to a more diverse set of presets. I plan to add woodwinds, extra percussion layers, and ambient textures, giving the “Conductor” a richer palette to evolve the music.

(Future improvements may explore personalizing music to individual writing flow.)
