Project Story

Inspiration

Music streaming has become incredibly transactional. We open an app, tap a flat, unmoving thumbnail, and instantly relegate art to background noise. We wanted to bring back the nostalgic magic of late-night space radio hosts and the hypnotic, satisfying allure of old-school media player visualizers (like Winamp or iTunes Classic), but completely modernized for the web.

We asked ourselves: What if your music player didn't just play tracks, but built an entire responsive universe around them? Inspired by the vast aesthetics of interstellar space, futuristic sci-fi UI design, and generative AI, we created ASTRA—a sanctuary where audio frequency, 3D particle physics, and a conversational AI identity merge into a living, breathing cosmic radio station.


What it does

ASTRA is a full-screen, single-page music ecosystem that transforms digital audio into an active, cinematic ritual. At its center spins a high-fidelity vinyl player displaying real-time album artwork, backed by a fully interactive, mouse-parallax Three.js starfield.

+-----------------------------------------------------------------+
|  [Logo / ASTRA]                                 [Search 🔍]     |
|                                                                 |
|   (( Central Vinyl ))      [Glassmorphism Lyrics Panel]         |
|       (( Disc ))           "Floating across the event..."       |
|                            (Smooth-scroll auto centering)       |
|    |||| ||| |||| ||||                                           |
|   [Waveform Visualizer]    [AI DJ ASTRA Chat Bubble]            |
|   (Dynamic glow color)     "Engaging hyperdrive cores..." 💬     |
|                                                                 |
|  [▶] [⏮] [⏭] [🔀]  01:42 =======o======= 04:15  🔊 [===o===]    |
+-----------------------------------------------------------------+
| BACKGROUND LAYER: Interactive 3D Starfield & Nebula Particles   |
+-----------------------------------------------------------------+

Key features include:

  • The AI Space DJ: Powered by Google Gemini and browser Text-to-Speech (TTS), ASTRA acts as an ethereal interstellar radio host. She speaks to you, generating smooth, context-aware audio commentaries whenever you boot the app, skip tracks, or load up a fresh search result.
  • Dynamic Mood-Driven Realities: Instead of standard genre buttons, you can type how you feel into a natural language input (e.g., "calm late-night coding" or "high-energy kinetic warp speed"). ASTRA analyzes the prompt, dynamically reorders your playlist queue, and transforms the physical space—morphing the colors and star-velocity parameters instantly.
  • Real-Time Audio Matrix: An audio-reactive waveform visualizer glows and pulses around the central player using live frequency data. At the same time, a lyric panel fetches, parses, and scrolls time-synced .lrc files, matching each line to its timestamp (LRC timestamps resolve to hundredths of a second).
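
As a rough illustration of the time-synced lyric scrolling described above, the sketch below picks the line to highlight for a given playback time using a binary search. The function name `findActiveLineIndex` and the `{ time, text }` line shape are assumptions made for this example, not names taken from the ASTRA codebase.

```javascript
// Hypothetical helper: return the index of the last lyric line whose
// timestamp is <= currentTime. `lines` must be sorted by ascending `time`.
function findActiveLineIndex(lines, currentTime) {
  let lo = 0;
  let hi = lines.length - 1;
  let active = -1; // -1 means playback has not reached the first line yet
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (lines[mid].time <= currentTime) {
      active = mid; // candidate; keep looking to the right for a later line
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return active;
}

// Placeholder data for illustration only.
const lines = [
  { time: 0.5, text: "(first line)" },
  { time: 4.2, text: "(second line)" },
  { time: 9.0, text: "(third line)" },
];
```

In a browser this would probably run inside the audio element's `timeupdate` handler, with the matching DOM node brought into view through `element.scrollIntoView({ behavior: "smooth", block: "center" })`.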

How we built it

We designed ASTRA as a high-performance single-page app utilizing modular vanilla JS engines combined with lightweight serverless functions to keep the browser responsive.

  • The Universe (Graphics Engine): Built with Three.js, the scene manages thousands of individual star points and glowing nebula clouds. The star velocity along the $Z$-axis and the camera parallax offsets are driven by a single centralized requestAnimationFrame render loop, which targets a smooth 60fps on typical displays.
  • The Voice & Curation (Generative AI): Orchestrated by Google Gemini endpoints via secure proxy connections. The model takes user text inputs, maps them to deterministic mood categories, and crafts custom script payloads for the browser's native SpeechSynthesis engine.
  • The Audio Pipeline (Web Audio API): We routed standard HTML5 Audio through a specialized internal pipeline, extracting raw audio data for visual rendering:
graph LR
    AudioElement[HTML5 Audio Element] --> MediaSource[MediaElementSourceNode]
    MediaSource --> Analyser[AnalyserNode]
    Analyser --> Destination[AudioContext.destination]

  • The Cloud Framework: We built custom Supabase Edge Functions to securely proxy third-party requests, shielding API keys and normalizing responses from the ccMixter API (music catalog) and LRCLIB.net (synchronized lyrics).
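
To make the starfield motion from the graphics bullet more concrete, here is a minimal sketch of a per-frame update that moves stars along the Z-axis and recycles those that pass the camera. The function name `stepStars` and the `nearZ`/`farZ` bounds are assumptions for illustration; the flat `[x, y, z, ...]` layout mirrors how a Three.js `BufferGeometry` position attribute stores points.

```javascript
// Advance every star toward the camera along Z and wrap stars that pass it.
// `positions` is a flat Float32Array: [x0, y0, z0, x1, y1, z1, ...].
// `speed` is in scene units per second; `deltaSeconds` is the frame time.
function stepStars(positions, speed, deltaSeconds, nearZ, farZ) {
  for (let i = 2; i < positions.length; i += 3) {
    positions[i] += speed * deltaSeconds;
    if (positions[i] > nearZ) {
      // Send the star back to the far plane, keeping its overshoot so
      // stars do not bunch up at exactly farZ.
      positions[i] = farZ + (positions[i] - nearZ);
    }
  }
  return positions;
}
```

Scaling by the frame time instead of assuming a fixed 60fps keeps the apparent speed stable on slower or faster displays. With Three.js, the geometry would also need `geometry.attributes.position.needsUpdate = true` after each call so the new positions are uploaded.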

Challenges we ran into

The Autoplay Browser War

Modern browsers block programmatic audio playback, and keep a newly created AudioContext suspended, until the user interacts with the page. Early on, our Web Audio graph stayed suspended on boot, so the visualizer never moved. We solved this with an initialization routine that calls audioContext.resume() on any native user interaction (such as pressing the playback button or selecting a track).
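
One way to wire up that kind of unlock step is sketched below. The helper name `unlockAudioOnGesture` and the choice of `pointerdown` and `keydown` as trigger events are assumptions for this example; only `audioContext.resume()` and the `state` property come from the Web Audio API itself.

```javascript
// Resume a suspended AudioContext the first time the user interacts.
// `target` is anything with addEventListener/removeEventListener,
// for example `document` in a browser.
function unlockAudioOnGesture(audioContext, target, events = ["pointerdown", "keydown"]) {
  const unlock = () => {
    if (audioContext.state === "suspended") {
      audioContext.resume(); // returns a Promise in browsers
    }
    // One successful gesture is enough, so detach every listener.
    events.forEach((name) => target.removeEventListener(name, unlock));
  };
  events.forEach((name) => target.addEventListener(name, unlock));
  return unlock;
}
```

In the page this might be called once at startup, for instance `unlockAudioOnGesture(audioContext, document)`, so the play button and track selection both count as the unlocking gesture.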

Eliminating UI Layout Thrashing

Streaming fast Fourier transform (FFT) data means capturing array snapshots many times per second. Initially, passing these byte arrays through React state caused the component tree to re-render on every frame, producing severe frame drops. We decoupled the visualizer canvas from the React rendering cycle entirely, drawing to the canvas context directly through refs instead of state hooks.
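
A minimal sketch of drawing outside the framework's render cycle is shown below. The names `drawBars` and `startVisualizer`, the default color, and the bar layout are assumptions for illustration; `getByteFrequencyData`, `frequencyBinCount`, and the canvas 2D calls are standard browser APIs.

```javascript
// Draw one frame of frequency bars straight onto a 2D canvas context.
// `bins` holds byte magnitudes (0-255), as filled by
// AnalyserNode.getByteFrequencyData.
function drawBars(ctx, bins, width, height, color = "#7df9ff") {
  ctx.clearRect(0, 0, width, height);
  ctx.fillStyle = color;
  const barWidth = width / bins.length;
  for (let i = 0; i < bins.length; i++) {
    const barHeight = (bins[i] / 255) * height;
    ctx.fillRect(i * barWidth, height - barHeight, barWidth, barHeight);
  }
}

// Hypothetical render loop (browser only): reads the analyser and draws
// each frame without setting any component state.
function startVisualizer(analyser, canvas) {
  const ctx = canvas.getContext("2d");
  const bins = new Uint8Array(analyser.frequencyBinCount); // reused every frame
  let frameId = 0;
  const tick = () => {
    analyser.getByteFrequencyData(bins);
    drawBars(ctx, bins, canvas.width, canvas.height);
    frameId = requestAnimationFrame(tick);
  };
  tick();
  return () => cancelAnimationFrame(frameId); // call this to stop the loop
}
```

In a React component, `canvas` would typically come from a `useRef` handle and `startVisualizer` would be started and stopped inside an effect, so the per-frame data never passes through state.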

CORS and Cross-Origin Media Disconnects

Streaming .mp3 files and .lrc text directly from third-party public APIs routinely ran into Cross-Origin Resource Sharing (CORS) restrictions, which broke the audio analysis because a media element loaded without CORS approval feeds only silence into the Web Audio graph. We resolved this by routing all media through serverless proxies in our Supabase functions that return the necessary Access-Control-Allow-Origin response headers, and by setting crossOrigin="anonymous" on the audio element.
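
For context, a proxy of this kind generally has to return an `Access-Control-Allow-Origin` response header, while `crossOrigin = "anonymous"` is set on the audio element in the page. The sketch below shows only the header-building step as a plain function; the name `buildProxyHeaders` and the pass-through list are assumptions for this example, not the project's actual edge function.

```javascript
// Build response headers for a proxied media file so the browser treats it
// as CORS-readable. `upstreamHeaders` is a plain object of upstream headers.
function buildProxyHeaders(upstreamHeaders, allowedOrigin = "*") {
  // Only forward headers the media element needs; drop everything else
  // (cookies, server details) from the third-party response.
  const passthrough = ["content-type", "content-length", "accept-ranges", "content-range"];
  const headers = {};
  for (const [name, value] of Object.entries(upstreamHeaders)) {
    const lower = name.toLowerCase();
    if (passthrough.includes(lower)) {
      headers[lower] = value;
    }
  }
  headers["access-control-allow-origin"] = allowedOrigin;
  return headers;
}
```

On the client, the matching step would be `audio.crossOrigin = "anonymous"` before assigning `audio.src`, so the element requests the file in CORS mode and the analyser receives real samples.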


Accomplishments that we're proud of

  • Fluid Spatial Transitions: Creating a smooth transition matrix where visualizer color hex codes, background nebula particles, and star speeds blend seamlessly over a 2–3 second curve whenever a new mood is introduced.
  • Mathematical Precision: Constructing the synchronized lyrics parser. It extracts raw .lrc text strings and maps time-stamps using the following linear transformation to ensure perfect alignment with the audio element's currentTime:

$$T = (M \times 60) + S + \left(\frac{H}{100}\right)$$

Where $M$ is minutes, $S$ is seconds, and $H$ is hundredths of a second.
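
The timestamp formula can be sketched in code as follows. The function names `parseLrcLine` and `parseLrc` are assumptions for illustration. The sketch also reads the fractional part as a decimal fraction, which equals $H/100$ for two-digit tags and additionally tolerates one- or three-digit fractions; lines with several timestamps in a row are not handled.

```javascript
// Parse one LRC line such as "[01:42.50]Some lyric" into { time, text }.
// Returns null for lines without a leading [mm:ss.xx] tag (e.g. metadata
// tags like "[ar:Artist]").
function parseLrcLine(line) {
  const match = /^\[(\d+):(\d{2})(?:\.(\d{1,3}))?\](.*)$/.exec(line.trim());
  if (!match) return null;
  const minutes = Number(match[1]);
  const seconds = Number(match[2]);
  // "50" -> 0.50 s, which is H / 100 for the usual two-digit form.
  const fraction = match[3] ? Number("0." + match[3]) : 0;
  return { time: minutes * 60 + seconds + fraction, text: match[4].trim() };
}

// Parse a whole .lrc file into entries sorted by time, in seconds,
// directly comparable with an audio element's currentTime.
function parseLrc(text) {
  return text
    .split(/\r?\n/)
    .map((line) => parseLrcLine(line))
    .filter((entry) => entry !== null)
    .sort((a, b) => a.time - b.time);
}
```

For example, the tag `[01:42.50]` gives $T = 1 \times 60 + 42 + 50/100 = 102.5$ seconds.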

  • Compelling Persona Design: Tailoring the Gemini prompts so that ASTRA feels genuinely like an authentic, calming companion who understands your musical journey, rather than a generic, robotic corporate assistant.
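
The mood transitions mentioned in the first bullet above could be approximated by easing between two presets over the 2-3 second window. Everything in this sketch is an assumption for illustration: the preset shape, the example colors and speeds, and the quadratic ease-in-out curve; the write-up does not specify the actual curve or values.

```javascript
// Quadratic ease-in-out on t in [0, 1].
function easeInOut(t) {
  return t < 0.5 ? 2 * t * t : 1 - Math.pow(-2 * t + 2, 2) / 2;
}

// Convert "#rrggbb" to [r, g, b] and back.
function hexToRgb(hex) {
  const n = parseInt(hex.slice(1), 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}
function rgbToHex(rgb) {
  return "#" + rgb.map((c) => Math.round(c).toString(16).padStart(2, "0")).join("");
}

// Blend two mood presets. `progress` is elapsedMs / durationMs and is
// clamped to [0, 1] before easing.
function blendMood(from, to, progress) {
  const t = easeInOut(Math.min(1, Math.max(0, progress)));
  const a = hexToRgb(from.color);
  const b = hexToRgb(to.color);
  return {
    color: rgbToHex(a.map((c, i) => c + (b[i] - c) * t)),
    starSpeed: from.starSpeed + (to.starSpeed - from.starSpeed) * t,
  };
}

// Hypothetical presets, not the project's real mood categories.
const calm = { color: "#000000", starSpeed: 2 };
const warp = { color: "#ff8000", starSpeed: 10 };
```

Calling `blendMood(calm, warp, elapsedMs / 2500)` once per animation frame would then yield the intermediate color and star speed for a 2.5 second transition.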

What we learned

  • Asynchronous Audio Orchestration: Managing media lifecycles requires absolute discipline. We learned how to neatly synchronize asynchronous operations—making sure the audio track gracefully ducked its volume while the TTS engine spoke, and seamlessly returned to full volume once ASTRA completed her dialogue.
  • Edge Architecture Superiority: Keeping API keys, token handling, and data sanitization in serverless edge functions rather than in the client keeps secrets out of the browser and reduces the work and memory the app needs on the user's device.
  • Low-Level Web Audio Manipulation: Getting hands-on with frequency bin arrays (getByteFrequencyData) provided an excellent lesson in processing raw byte streams into smooth, beautiful graphics.
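
The volume ducking described in the first bullet above might look roughly like this. The helper name `duckWhileSpeaking` and the default ducked level of 0.2 are assumptions for illustration; `onstart`, `onend`, and `onerror` are the standard handlers on a `SpeechSynthesisUtterance`. This version switches volume in a single step rather than fading.

```javascript
// Lower the music volume while a speech utterance plays, then restore it.
// `audio` needs a writable `volume`; `utterance` needs `onstart`, `onend`,
// and `onerror` handler slots (as SpeechSynthesisUtterance has).
function duckWhileSpeaking(audio, utterance, duckedVolume = 0.2) {
  const original = audio.volume;
  const restore = () => {
    audio.volume = original;
  };
  utterance.onstart = () => {
    // Never raise the volume if the music is already quieter than the duck level.
    audio.volume = Math.min(original, duckedVolume);
  };
  utterance.onend = restore;
  utterance.onerror = restore; // do not leave the music quiet if TTS fails
  return utterance;
}
```

In a browser, usage could be `speechSynthesis.speak(duckWhileSpeaking(audioElement, new SpeechSynthesisUtterance(text)))`. A smoother fade would need a timer on `audio.volume` or a `GainNode` with `gain.linearRampToValueAtTime`.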

What's next for ASTRA AI-DJ

The cosmos is infinitely expanding, and we want ASTRA's capabilities to grow with it:

  • Procedural Ambient Soundscapes: Integrating procedural synthesizers that weave low-frequency space wind, interstellar hums, or pulsar clicks matching your mood to fill the gaps between music tracks.
  • Syllable-Level Lyric Tracking: Upgrading from line-by-line scrolling to an ultra-precise karaoke-style text highlighting framework that responds directly to sudden shifts in tempo.
  • Camera Particle Interaction: Implementing computer-vision frameworks that let users warp the 3D starfield, trigger bass bursts, or alter visualizer parameters through simple real-world hand gestures.

Another Demo Video: https://www.youtube.com/watch?v=7-Sbi8BbRxs
