Inspiration

Listening to music together online is surprisingly broken.

Today, the only reliable ways to sync music between people are Discord bots and screen sharing, both of which are clunky, limited, and unintuitive. Bots rely on text commands, break easily, and offer little interaction beyond basic playback. Screen sharing sacrifices audio quality and removes any sense of immersion.

At the same time, high-quality music visualizers remain locked inside desktop software or dated preset effects. There are no modern, web-based visualizers designed for live, shared experiences.

Audiolyze.AI was created to solve both problems at once: reliable shared music playback and immersive visuals, directly in the browser, with zero setup.


What it does

Audiolyze.AI is a web-based platform that lets people listen to music together in real time, with synchronized playback and immersive 3D visuals.

A host starts a stage by uploading a track or providing a SoundCloud link or playlist. Audience members can join instantly and hear the exact same audio at the same time, with automatic synchronization handling seeks, pauses, playback-speed changes, and late joins.
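
The core idea behind that synchronization is simple: the host periodically publishes a playback snapshot, and any client can reconstruct where the playhead should be right now. Here is a minimal sketch of that math; the field names are illustrative, not the actual wire format:

```python
import time
from dataclasses import dataclass

@dataclass
class PlaybackSnapshot:
    """Illustrative sync snapshot a host might broadcast (field names are hypothetical)."""
    track_position: float   # seconds into the track when the snapshot was taken
    wall_clock: float       # timestamp (epoch seconds) of the snapshot
    playback_rate: float    # 1.0 = normal speed
    is_playing: bool

def expected_position(snap: PlaybackSnapshot, now: float | None = None) -> float:
    """Where the playhead should be right now, given the last snapshot.

    A late joiner seeks here; an existing listener compares it against
    their local playhead to measure drift.
    """
    if not snap.is_playing:
        return snap.track_position
    now = time.time() if now is None else now
    return snap.track_position + (now - snap.wall_clock) * snap.playback_rate
```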

Instead of static visuals, Audiolyze features a fully real-time 3D visualizer that reacts to both the music and playback state. Visuals evolve naturally as the song progresses, making the experience feel closer to a live performance than a passive stream.

This creates a new kind of shared experience, somewhere between a livestream, a visual performance, and a listening party.


How I built it

The backend is built with FastAPI and handles audio analysis, room management, synchronization, and SoundCloud ingestion. Audio features such as beat timing and energy curves are extracted using Librosa to drive structured visual transitions.
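
As a sketch of that analysis step, beat times and a coarse energy curve can be extracted with a few Librosa calls; the exact features and parameters in the real pipeline may differ:

```python
import librosa

def analyze_track(path: str) -> dict:
    """Extract beat times and a normalized energy curve to drive visual transitions."""
    y, sr = librosa.load(path, mono=True)

    # Beat tracking: a global tempo estimate plus per-beat timestamps in seconds.
    tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
    beat_times = librosa.frames_to_time(beat_frames, sr=sr)

    # RMS energy per frame, normalized to [0, 1] as a simple "intensity" curve.
    rms = librosa.feature.rms(y=y)[0]
    energy = (rms - rms.min()) / (rms.max() - rms.min() + 1e-9)
    frame_times = librosa.times_like(rms, sr=sr)

    return {
        "tempo_bpm": float(tempo),
        "beat_times": beat_times.tolist(),
        "energy_curve": list(zip(frame_times.tolist(), energy.tolist())),
    }
```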

The frontend is built with React and Three.js using @react-three/fiber. A custom GPU-optimized visualizer renders particle systems and geometric scenes that respond in real time to music playback. The Web Audio API is used for frequency analysis, EQ tuning, and precise synchronization.
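
One piece of that frequency analysis is collapsing raw FFT bins into a handful of bands the visualizer can react to. The sketch below shows the logic in Python for readability; in the real client this runs in JavaScript against the Web Audio AnalyserNode output, and the band edges are illustrative:

```python
def band_levels(fft_magnitudes: list[float], sample_rate: int = 44100) -> dict:
    """Collapse raw FFT bin magnitudes into bass/mid/treble levels."""
    n_bins = len(fft_magnitudes)
    nyquist = sample_rate / 2
    bands = {"bass": (20, 250), "mid": (250, 4000), "treble": (4000, 16000)}

    levels = {}
    for name, (lo_hz, hi_hz) in bands.items():
        lo_bin = int(lo_hz / nyquist * n_bins)
        hi_bin = max(lo_bin + 1, int(hi_hz / nyquist * n_bins))
        chunk = fft_magnitudes[lo_bin:hi_bin]
        levels[name] = sum(chunk) / len(chunk) if chunk else 0.0
    return levels
```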

A real-time room system ensures that all listeners stay in sync, even when joining mid-song or recovering from playback drift.
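
A stripped-down sketch of what such a room could look like in FastAPI using WebSockets; the room structure and message shape here are illustrative, not the project's actual protocol:

```python
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()
rooms: dict[str, set[WebSocket]] = {}   # room_id -> connected listeners
latest_snapshot: dict[str, dict] = {}   # room_id -> last host sync snapshot

@app.websocket("/ws/{room_id}")
async def stage(ws: WebSocket, room_id: str):
    await ws.accept()
    rooms.setdefault(room_id, set()).add(ws)

    # Late joiners immediately receive the most recent snapshot
    # so they can seek to the current playhead.
    if room_id in latest_snapshot:
        await ws.send_json(latest_snapshot[room_id])

    try:
        while True:
            msg = await ws.receive_json()
            if msg.get("type") == "sync_snapshot":  # sent periodically by the host
                latest_snapshot[room_id] = msg
                for peer in rooms[room_id]:
                    if peer is not ws:
                        await peer.send_json(msg)
    except WebSocketDisconnect:
        rooms[room_id].discard(ws)
```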

Everything runs entirely in the browser—no plugins, downloads, or special software required.


Challenges I ran into

The biggest challenge by far was the visualization itself.

Audio is extremely noisy data. Even when beats and energy are detected correctly, mapping that information to visuals in a way that feels intentional, smooth, and musical is very difficult. Small mistakes quickly turn into visuals that feel chaotic, repetitive, or disconnected from the song.

I spent a significant amount of time tuning how different audio features influence visual intensity, motion, color, and transitions. The goal was to make the visuals feel like they are progressing with the music, not simply reacting to loudness. Getting that balance right required constant iteration and subjective testing across many different types of songs.
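
As a concrete example of that tuning, even a single asymmetric exponential moving average over the raw energy signal goes a long way toward visuals that progress rather than twitch. The constants below are illustrative, tuned by ear rather than derived:

```python
def smooth_intensity(raw_energy: list[float], attack: float = 0.6, release: float = 0.05) -> list[float]:
    """Map a noisy per-frame energy signal to a usable visual intensity curve.

    A fast attack lets hits land on time; a slow release keeps visuals
    from flickering between frames.
    """
    intensity = 0.0
    out = []
    for e in raw_energy:
        # Rise quickly toward peaks, decay slowly afterwards.
        alpha = attack if e > intensity else release
        intensity += alpha * (e - intensity)
        out.append(intensity)
    return out
```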

Synchronization was another major challenge. Even slight playback drift between listeners can break the illusion of a shared experience, so I implemented periodic sync snapshots and drift correction to keep audio and visuals aligned across clients.
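
The correction policy itself is simple in spirit: small drift is absorbed by nudging the playback rate, and large drift forces a hard seek. Sketched here in Python for readability (the actual client logic runs in JavaScript against the Web Audio clock), with illustrative thresholds:

```python
def correct_drift(local_pos: float, expected_pos: float) -> tuple[str, float]:
    """Decide how to re-align a client's playhead with the host.

    Returns (action, value): "seek" with a target position,
    "rate" with a temporary playback-rate adjustment, or "none".
    """
    drift = local_pos - expected_pos        # positive = this client is ahead
    if abs(drift) > 0.75:
        return ("seek", expected_pos)       # too far off: jump directly
    if abs(drift) > 0.05:
        # Slightly slow down when ahead, speed up when behind,
        # converging without an audible skip.
        return ("rate", 0.9 if drift > 0 else 1.1)
    return ("none", 1.0)
```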

Performance also mattered. Rendering complex 3D scenes in real time while processing audio required careful optimization to maintain smooth frame rates in the browser.


What I learned

This project taught me how difficult it is to turn raw audio data into something that feels expressive and intentional.

I learned that good visualizations are less about reacting to every signal and more about choosing what to ignore. Designing visuals that feel musical required restraint, smoothing, and structure rather than raw reactivity.

I also gained experience using generative AI as a creative decision layer. Using Gemini to translate audio features into higher-level visual direction helped bridge the gap between noisy data and coherent visual progression.
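
A sketch of what that creative layer might look like using the google-generativeai Python SDK; the prompt, the model choice, and the response fields below are hypothetical, not the project's actual schema:

```python
import json
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumes an API key is available
model = genai.GenerativeModel("gemini-1.5-flash")

def visual_direction(features: dict) -> dict:
    """Ask Gemini to turn low-level audio features into high-level visual intent.

    `features` might hold tempo, an energy summary, and section boundaries;
    the requested JSON fields are illustrative.
    """
    prompt = (
        "You are directing a real-time 3D music visualizer. Given these audio "
        f"features:\n{json.dumps(features)}\n"
        "Respond with JSON containing: mood (one word), color_palette "
        "(list of hex colors), motion_style (calm|pulsing|chaotic), "
        "and transition_hints (list of timestamps in seconds)."
    )
    response = model.generate_content(prompt)
    # NOTE: in practice the model may wrap JSON in markdown fences;
    # strip those before parsing.
    return json.loads(response.text)
```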

Most importantly, I learned how powerful the browser has become as a platform for building real-time, AI-driven creative experiences.


Why it matters

Audiolyze opens the door to a new type of web-native music streaming platform.

Creators such as lo-fi streamers, DJs, and visual artists can host always-on stages where people drop in, discover new music, and meet others organically. Listeners aren’t just consuming content; they’re sharing a moment.

This goes beyond playlists or passive streams. It’s about creating spaces where music, visuals, and people come together seamlessly.


Prize Track Justification

CXC AI - Best Audio AI Hack: I applied because Audiolyze uses AI to interpret musical structure and drive expressive visual behavior rather than simple audio-reactive effects.

CXC AI - Most Creative Data Visualization: I applied because the project transforms abstract audio features into evolving 3D environments instead of traditional charts or graphs.

MLH - Best Use of Gemini: I applied because Gemini is used as a creative layer that turns low-level audio analysis into high-level visual intent and progression.


Built With

  • Languages: Python, JavaScript
  • Frontend: React, Three.js, @react-three/fiber
  • Backend: FastAPI
  • Audio Analysis: Librosa, NumPy
  • Web Audio: Web Audio API
  • Visualization: Custom GPU-optimized particle and geometry systems
