Inspiration
Standard productivity playlists are static and blind to the emotional weight of a story. RadioFlow was born from the idea that prose should lead a living, reactive ensemble, with music that adapts to your story, draws you in, and keeps you in flow.
What It Does
Using Google Gemini 3 Flash, RadioFlow analyzes your narrative in real time and generates a 16-step "Musical DNA" that modulates rhythm, harmony, and instrumentation. The result: functional, adaptive music that evolves with your story, keeping you fully immersed and creatively engaged.
✍️ Distraction-Free Workspace: A minimalist editor designed to facilitate deep cognitive immersion.
🎺 Generative Ensemble: A live-synthesized soundscape featuring soulful trumpet, bowed contrabass, and electric keys that "improvise" based on your story.
🧠 Semantic AI Sync: Powered by Gemini 3 Flash, the system tracks narrative intensity, tension, and valence to dynamically adjust the musical expression.
🔁 The Flow Loop: Your writing shapes the music, and the music, enhanced with Alpha-wave Binaural Beats, sustains your focus (the binaural layer is sketched just below).
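The binaural layer itself is simple enough to sketch in a few lines of Tone.js: two sine oscillators a few hertz apart, panned hard left and right, so the brain perceives the difference as a slow ~10 Hz beat in the alpha band. The frequencies, level, and function name below are illustrative, not the exact production values.

```typescript
import * as Tone from "tone";

// Two sine oscillators, 10 Hz apart, panned hard left/right.
// The perceived 10 Hz "beat" sits in the alpha band (~8-12 Hz).
const left = new Tone.Oscillator(200, "sine").connect(new Tone.Panner(-1).toDestination());
const right = new Tone.Oscillator(210, "sine").connect(new Tone.Panner(1).toDestination());

export function startBinauralBed(volumeDb = -24) {
  left.volume.value = volumeDb; // keep it well under the music
  right.volume.value = volumeDb;
  left.start();
  right.start();
}
```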
How I Built It
RadioFlow is built as a Generative AI Orchestrator. The core innovation is the bridge between Gemini 3’s semantic understanding and the Tone.js browser synthesis engine.
The application runs on a high-performance, low-latency loop:
Input: The React editor monitors user interaction.
Analysis: A Web Worker processes the text off the main thread to avoid UI lag, then triggers Gemini 3 Flash via a Supabase Edge Function (both hops are sketched right after this list).
Context & Directives: Gemini 3 Flash performs real-time semantic analysis by comparing current prose against historical narrative snapshots and the live audio engine state. It generates a structured 'Musical DNA' (JSON schema) that directs the audio engine to modulate rhythms, harmonies, and instrument solos. By seeing the current audio state, Gemini 3 evolves the music naturally instead of regenerating it, preserving momentum and avoiding abrupt resets.
Rhythmic Patterns: It generates 16-step binary sequences (0s and 1s) for the drums (kick, snare, hi-hats, rim, tom).
Melodic Instructions: It generates 16-step sequences of scale degrees (numbers representing notes) for the pitched instruments: bass, contrabass, guitar, lead, trumpet, and vibes.
Harmonic Context: It decides the BPM (tempo) and the Musical Mode (e.g., "Dorian") based on the "Emotional Trajectory" of your story.
Synthesis: The AudioService (a Tone.js singleton) receives the Musical DNA and performs voice-leading, timbre-morphing, and sequence updates in real time; the playback and pitch-mapping steps are sketched after the schema below:
No Pre-recorded Loops: The engine uses Tone.Sequence to walk the DNA's 16 steps. If step 4 holds a "1" for the kick drum, the engine triggers a kick sample at exactly that point in the bar.
Dynamic Mapping: If the AI says a melody note is "degree 3" and the mode is "Minor," the engine calculates the frequency for a Minor 3rd. If the AI shifts the mode to "Lydian," that same "degree 3" instantly becomes a Major 3rd.
Human-like Feel: The engine adds "Jitter" (micro-timing variations) and dynamic velocity (volume changes) so it doesn't sound like a robotic metronome.
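Here is a minimal sketch of the Input → Analysis hop, assuming a debounce inside the worker and a Supabase Edge Function named `musical-dna`; the function name, the `audioService` interface, and the constants are illustrative, not RadioFlow's exact API:

```typescript
// prose-worker.ts — debounce keystrokes off the main thread
let timer: ReturnType<typeof setTimeout> | undefined;

self.onmessage = (e: MessageEvent<{ text: string }>) => {
  clearTimeout(timer);
  timer = setTimeout(() => {
    // Only recent prose is needed, which also keeps the prompt small
    self.postMessage({ snippet: e.data.text.slice(-2000) });
  }, 1500); // fire once typing pauses
};
```

On the main thread, the snippet goes to the Edge Function and the returned DNA goes straight to the audio engine:

```typescript
import { createClient } from "@supabase/supabase-js";

declare const SUPABASE_URL: string, SUPABASE_ANON_KEY: string;
declare const audioService: { getState(): unknown; applyDNA(dna: unknown): void };

const supabase = createClient(SUPABASE_URL, SUPABASE_ANON_KEY);
const worker = new Worker(new URL("./prose-worker.ts", import.meta.url), { type: "module" });

worker.onmessage = async (e: MessageEvent<{ snippet: string }>) => {
  const { data, error } = await supabase.functions.invoke("musical-dna", {
    body: { prose: e.data.snippet, engineState: audioService.getState() },
  });
  if (!error) audioService.applyDNA(data); // feed the Musical DNA to Tone.js
};
```

The debounce matters twice over: it keeps the UI thread free, and it caps how often Gemini gets called.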
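Inside the Edge Function, the Gemini call can be as plain as a fetch to the generateContent endpoint with JSON-mode output. This is a sketch under assumptions: the model id string, prompt wording, and schema enforcement are stand-ins for the production versions.

```typescript
// supabase/functions/musical-dna/index.ts — illustrative sketch
const MODEL = "gemini-3-flash"; // model id as named in this write-up; verify against your API version

Deno.serve(async (req) => {
  const { prose, engineState } = await req.json();

  const res = await fetch(
    `https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent` +
      `?key=${Deno.env.get("GEMINI_API_KEY")}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        contents: [{
          parts: [{
            text: `Current prose:\n${prose}\n\nCurrent audio state:\n` +
              `${JSON.stringify(engineState)}\n\nReturn a Musical DNA object matching the schema.`,
          }],
        }],
        generationConfig: { responseMimeType: "application/json" }, // ask for raw JSON back
      }),
    },
  );

  const payload = await res.json();
  // The model's JSON arrives as text inside the first candidate part
  const dna = JSON.parse(payload.candidates[0].content.parts[0].text);
  return new Response(JSON.stringify(dna), {
    headers: { "Content-Type": "application/json" },
  });
});
```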
Output Schema
{
  "intensity": 0-1.0,
  "complexity": 0-1.0,
  "evolutionPath": "Two words describing the shift (e.g., 'Tension Rising', 'Glassy Dissolve')",
  "musicalAction": "SUSTAIN|BUILD|BREAK_DOWN|DROP|MODULATE|FLIP|DISSOLVE",
  "rationale": "One sentence explaining why this musical action fits the narrative arc.",
  "engineState": { "bpm": 60-100, "scaleMode": "DORIAN|PHRYGIAN|LYDIAN|MINOR|MAJOR" },
  "patternDNA": {
    "commentary": "Brief conductor note.",
    "kick": [16 binary steps],
    "snare": [16 binary steps],
    "hihat": [16 binary steps],
    "openhihat": [16 binary steps],
    "rim": [16 binary steps],
    "tom": [16 binary steps],
    "bass": [16 scale degrees, 0=rest],
    "contrabass": [16 scale degrees, 0=rest],
    "guitar": [16 scale degrees, 0=rest],
    "lead": [16 scale degrees, 0=rest],
    "trumpet": [16 scale degrees, 0=rest],
    "vibes": [16 scale degrees, 0=rest],
    "atmos": [16 chordal indices, 0=rest]
  }
}
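To make the playback concrete, here is a sketch of a single drum row driven by Tone.Sequence, with the jitter and velocity humanization described above; the pattern, synth choice, and constants are assumptions, not the production values:

```typescript
import * as Tone from "tone";

// A hypothetical kick row from patternDNA (1 = hit, 0 = silent)
const kick = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1];
const kickSynth = new Tone.MembraneSynth().toDestination();

const kickSeq = new Tone.Sequence(
  (time, step: number) => {
    if (kick[step] !== 1) return;                // silent step
    const jitter = (Math.random() - 0.5) * 0.01; // ±5 ms micro-timing
    const velocity = 0.8 + Math.random() * 0.2;  // subtle dynamic variation
    kickSynth.triggerAttackRelease("C1", "8n", time + jitter, velocity);
  },
  Array.from({ length: 16 }, (_, i) => i), // step indices 0..15
  "16n"
);

kickSeq.start(0);
Tone.Transport.bpm.value = 84; // engineState.bpm from the DNA
Tone.Transport.start();        // call after a user gesture (await Tone.start())
```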
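And the degree-to-pitch resolution: the interval tables below are the standard spellings for each mode the schema allows, and the function name is illustrative. The same degree 3 lands on a minor 3rd in MINOR but a major 3rd in LYDIAN, exactly the re-coloring described above.

```typescript
import * as Tone from "tone";

type ScaleMode = "DORIAN" | "PHRYGIAN" | "LYDIAN" | "MINOR" | "MAJOR";

// Semitone offsets for degrees 1-7 of each mode
const MODE_INTERVALS: Record<ScaleMode, number[]> = {
  MAJOR:    [0, 2, 4, 5, 7, 9, 11],
  MINOR:    [0, 2, 3, 5, 7, 8, 10],
  DORIAN:   [0, 2, 3, 5, 7, 9, 10],
  PHRYGIAN: [0, 1, 3, 5, 7, 8, 10],
  LYDIAN:   [0, 2, 4, 6, 7, 9, 11],
};

// 0 = rest; degrees above 7 wrap into the next octave
function degreeToNote(degree: number, mode: ScaleMode, root = "C3"): string | null {
  if (degree === 0) return null;
  const octave = Math.floor((degree - 1) / 7);
  const semitones = MODE_INTERVALS[mode][(degree - 1) % 7] + 12 * octave;
  return Tone.Frequency(root).transpose(semitones).toNote();
}

degreeToNote(3, "MINOR");  // "D#3": a minor 3rd above the root
degreeToNote(3, "LYDIAN"); // "E3":  the same degree, now a major 3rd
```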
Tech Stack
Intelligence: Google Gemini 3 Flash via Supabase Edge Functions
Audio Engine: Tone.js (Web Audio API)
Infrastructure: Supabase (Auth, PostgreSQL, Edge Functions)
Frontend: React 18, TypeScript, Tailwind CSS v4
Challenges I Faced
The Main-Thread Bottleneck: Real-time synthesis and AI calls both compete for the browser's main thread. I solved this by offloading text processing to a Web Worker and staggering updates to the audio nodes.
The Semantic-to-Music Gap: Translating abstract emotions (like "melancholy" or "rising tension") into specific musical modes (Dorian vs. Phrygian) required careful Tone.js tuning and a robust mapping system (a slice of that mapping is sketched below).
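A mapping system of this kind can be grounded in a plain lookup table, so the model's fuzzy emotional labels always resolve to something the engine can play. The table below is an illustrative slice, not RadioFlow's actual mapping:

```typescript
type ScaleMode = "DORIAN" | "PHRYGIAN" | "LYDIAN" | "MINOR" | "MAJOR";

// Hypothetical emotion-to-mode table; labels and pairings are examples
const EMOTION_TO_MODE: Record<string, ScaleMode> = {
  melancholy: "DORIAN",   // minor color with a hopeful raised 6th
  dread:      "PHRYGIAN", // flat 2nd keeps tension unresolved
  wonder:     "LYDIAN",   // raised 4th floats and brightens
  grief:      "MINOR",
  triumph:    "MAJOR",
};

function modeFor(emotion: string): ScaleMode {
  return EMOTION_TO_MODE[emotion.toLowerCase()] ?? "DORIAN"; // safe default
}
```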
Accomplishments & Learnings
In-Browser High Fidelity: I built a stable, low-latency, closed-loop feedback system that achieves a professional, cinematic sound entirely in the browser.
Semantic Orchestration: Gained deep experience in using Large Language Models not just for text generation, but as sophisticated creative decision-makers for complex systems.
Emergent Musical Continuity: Discovered Gemini 3’s musical awareness. Given the current audio state, it evolves the music naturally, producing transitions that feel intentional and coherent.
Next Steps
Ensemble Expansion: Moving beyond the current sounds to a more diverse set of presets. I plan to add woodwinds, extra percussion layers, and ambient textures, giving the “Conductor” a richer palette to evolve the music.
(Future improvements may explore personalizing music to individual writing flow.)
Built With
- gemini
- lucide-react
- react
- supabase
- tailwindcss
- tone.js
- typescript