Inspiration

This app was inspired by the need for a personalized listening experience: something that could turn the daily commute with my son into a narrated musical session where we listen and learn. Tuning in to the radio stations in the car was getting TOO boring!

What it does

AuraStream is the next evolution in music consumption. You provide a topic, a few song references to set the mood, and choose your AI host character.

The app then generates a unique listening session where the host guides you through the music.

It's music on autopilot. More than a playlist, it's a fully narrated experience with music discovery, interesting trivia, motivational context, and more, all tailored to your initial idea and song choices. You can sit back and enjoy the show, or interact with it in real time to change the vibe or topic.

How we built it

We built AuraStream as a React PWA with Supabase as the backend, Gemini 2.0 for planning and script generation, and Google Cloud TTS and ElevenLabs for voice synthesis.

It uses a multi-stage AI orchestration pipeline:

  • Planning: When a user enters a prompt, we use Gemini to analyze the request and generate a structured episode outline. This outline is editable, giving users different levels of autonomy: from full auto-pilot to fine-grained control.

  • Scripting & Execution (ReAct Loop): For each segment of the session (an intro, a transition, etc.), we run a reasoning loop. The AI first observes the current state of the session (played segments, user feedback), evaluates the best course of action, and then acts by generating a compelling script with Gemini.

  • Voice Synthesis: The generated script is then sent to high-quality TTS APIs like ElevenLabs and Google Cloud to create a natural, engaging AI host voice.

  • Playback: We use the Spotify Player SDK to control music playback, seamlessly synchronizing songs with the AI host's narration.
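
The observe, evaluate, act cycle from the scripting step can be sketched roughly as follows. This is a minimal illustration and not the actual AuraStream code: the types (`Segment`, `SessionState`, `Observation`), the keyword rule in `evaluate`, and the injected `generate` function are all assumptions made for the example. In the real app the generator would be an asynchronous call to Gemini; it is kept synchronous here so the sketch runs without network access.

```typescript
// All names below are hypothetical; the write-up does not specify a schema.
type SegmentKind = "intro" | "transition" | "outro";

interface Segment {
  kind: SegmentKind;
  topic: string;
  trackTitle: string;
}

interface SessionState {
  played: Segment[];
  feedback: string[]; // free-text remarks from the listener, oldest first
}

type Tone = "neutral" | "calmer" | "more energetic";

interface Observation {
  playedCount: number;
  lastFeedback: string | undefined;
}

// Observe: summarize what has happened in the session so far.
function observe(state: SessionState): Observation {
  return {
    playedCount: state.played.length,
    lastFeedback: state.feedback[state.feedback.length - 1],
  };
}

// Evaluate: decide how the next script should adapt.
// A keyword rule stands in for the model's reasoning here.
function evaluate(obs: Observation): Tone {
  const text = (obs.lastFeedback ?? "").toLowerCase();
  if (text.includes("calm") || text.includes("chill")) return "calmer";
  if (text.includes("upbeat") || text.includes("energy")) return "more energetic";
  return "neutral";
}

// Act: build a prompt and hand it to the injected generator, which is a
// placeholder for the real (asynchronous) LLM call.
type ScriptGenerator = (prompt: string) => string;

function runSegment(
  segment: Segment,
  state: SessionState,
  generate: ScriptGenerator
): string {
  const obs = observe(state);
  const tone = evaluate(obs);
  const prompt =
    `Write a ${segment.kind} about "${segment.topic}" ` +
    `leading into "${segment.trackTitle}". ` +
    `Segments played so far: ${obs.playedCount}. Tone: ${tone}.`;
  const script = generate(prompt);
  state.played.push(segment); // recorded so the next iteration can observe it
  return script;
}
```

Because each call to `runSegment` re-reads the session state, feedback given in the middle of a session changes the prompt for the next segment without regenerating the whole episode.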

Challenges we ran into

The biggest challenge wasn't just technical, but strategic: navigating platform constraints. I realized in the end that Spotify's SDK has a very specific and restrictive set of accepted use cases.

Building a monetized, standalone app that layers a new experience on top of their service would likely violate their terms. This forced a difficult decision: instead of trying to build a commercial product, I would present AuraStream as a non-commercial proof of concept. This allowed me to focus purely on demonstrating the power of the idea and the technology, without getting shut down by platform rules.

With that decision made, I focused on the technical hurdles:

  • The orchestration of asynchronous AI services was incredibly complex. Getting the LLM to generate a script, sending it to a TTS platform like ElevenLabs, and timing the audio playback perfectly with music cues from the Spotify SDK required a robust state management system.

  • I also tackled the "blank canvas" problem. Users often don't know what they want. I addressed this by designing the input system to generate a compelling session from even the vaguest of inputs, like "chill vibes", and letting the user refine it from there.

  • Finally, making the LLM generate compelling and adaptive scripts was difficult. A simple prompt wasn't enough. Implementing the ReAct loop was my solution: it lets the characters feel intelligent and responsive to user feedback, rather than just reading a predefined text.
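
One way to keep script generation, voice synthesis, narration, and music playback in step is an explicit state machine, where each asynchronous service reports completion as an event. The sketch below is an assumption about how such state management could look, not a description of the actual implementation; the phase and event names are hypothetical.

```typescript
// Hypothetical phases of a single segment's lifecycle.
type Phase =
  | "idle"
  | "scripting"     // waiting for the LLM script
  | "synthesizing"  // waiting for the TTS audio
  | "narrating"     // host audio is playing
  | "playingTrack"  // music is playing
  | "done";

// Hypothetical events emitted when each asynchronous step finishes.
type PlaybackEvent =
  | { type: "START" }
  | { type: "SCRIPT_READY" }
  | { type: "AUDIO_READY" }
  | { type: "NARRATION_ENDED" }
  | { type: "TRACK_ENDED"; hasMoreSegments: boolean };

// Pure transition function: events that do not apply to the current
// phase are ignored, so late or duplicate callbacks cannot skip a step.
function transition(phase: Phase, event: PlaybackEvent): Phase {
  switch (phase) {
    case "idle":
      return event.type === "START" ? "scripting" : phase;
    case "scripting":
      return event.type === "SCRIPT_READY" ? "synthesizing" : phase;
    case "synthesizing":
      return event.type === "AUDIO_READY" ? "narrating" : phase;
    case "narrating":
      return event.type === "NARRATION_ENDED" ? "playingTrack" : phase;
    case "playingTrack":
      if (event.type === "TRACK_ENDED") {
        return event.hasMoreSegments ? "scripting" : "done";
      }
      return phase;
    case "done":
      return phase;
  }
}
```

Keeping the transition logic pure makes it easy to test without any of the external services, and the UI can derive what to show (for example, a loading indicator during "scripting") directly from the current phase.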

Accomplishments that we're proud of

The biggest one is that my 8-year-old son uses it every day. He's excited about generating sessions, finding some rare songs (even the wild picks by the AI are a fun surprise!), and learning about prompting and communicating clearly with the host.

What we learned

I learned that prompt engineering is not just about a single good prompt, but about designing a conversational flow that guides the AI toward a desired outcome.

I also learned that LLMs are most useful when given a clear task to work on and a way to observe and iterate on their progress. Plus, the right UI for an AI turns the painstaking work of writing a long description into a simple, intuitive, and efficient experience.

Finally, I learned about the practical limitations of relying on third-party music platforms. The constraints of the Spotify SDK made me realize that for a project like this to truly scale, it needs to eventually move towards direct music licensing.

What's next for AuraStream

  • Integrate some sort of 'feeds' feature, so sessions can incorporate real-time information, like "today's tech news with a synthwave soundtrack."

  • Add a way to 'Regenerate' an episode in the same style and music vibe but with different content. This could be a good shortcut for quick playback of preferred sessions.

  • Break the platform dependency. My biggest long-term goal is to secure funding for music licensing, allowing AuraStream to operate independently and unlock the full potential of dynamic audio generation without relying on external playback services. I want to show the world how AI can be used to make music listening a more personal, interactive, and meaningful experience.
