TuneTools - Your Day, Your Song
Inspiration
We've all had those days where everything just clicks - or where everything falls apart. What if your daily experiences could become a song? That's the question that sparked TuneTools.
The idea came from a simple observation: we're surrounded by data about our lives - news we read, weather we experience, events on our calendar - but it's all just... information. We wanted to transform that data into something emotional, something you could actually feel. Music has this incredible power to capture moments and moods in ways words alone can't, so we thought: why not let AI turn your day into a personalized soundtrack?
What it does
TuneTools takes the context of your day and generates a unique song just for you. Here's how it works:
- Gather your context: The app pulls in your local weather, recent news headlines, and upcoming calendar events
- Understand the vibe: An LLM analyzes all this information and figures out what kind of song would capture your day - the genre, the mood, the story
- Generate your song: Using the YuE AI music generation pipeline, we create an actual audio track with lyrics that reflect your personal context
- Add the artwork: We generate custom album cover art to match your song's theme
- Share it: Get a shareable link to your daily tune that you can send to friends or save for later
The result? A 30-60 second song that's uniquely yours, capturing a snapshot of your life in musical form.
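The steps above can be sketched as one orchestration function. Every helper name here is a hypothetical stand-in for the real integration (the weather/news/calendar APIs, the LLM call, the YuE pipeline, and so on):

```python
# Hypothetical sketch of the TuneTools pipeline; each helper stands in
# for a real integration and returns canned data for illustration.

def gather_context():
    # In the real app these come from external weather/news/calendar APIs.
    return {"weather": "rainy",
            "headlines": ["Local marathon this weekend"],
            "events": ["9am standup", "2pm dentist"]}

def specify_song(context):
    # The LLM turns raw context into genre tags and structured lyrics.
    mood = "mellow" if context["weather"] == "rainy" else "upbeat"
    return {"genre": f"{mood} indie pop",
            "lyrics": "[verse] Rain on the window..."}

def generate_song(spec):
    # YuE's multi-stage pipeline would produce real audio here.
    return {"audio": b"...", "spec": spec}

def make_daily_tune():
    context = gather_context()
    spec = specify_song(context)
    song = generate_song(spec)
    song["cover"] = "album_cover.png"  # image-generation step
    song["share_url"] = "https://example.com/s/abc123"  # hypothetical link
    return song
```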
How we built it
This was a journey of integrating some seriously cool AI technologies:
Frontend: We built a React + TypeScript interface with Tailwind CSS for a clean, modern look. The UI needed to feel calm and inviting - after all, music should be relaxing, not stressful.
Backend: Python FastAPI handles all the heavy lifting. We integrated multiple APIs (news, weather, calendar) and orchestrated the entire generation pipeline.
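The three context lookups are independent, so the backend can issue them concurrently rather than one after another. A minimal `asyncio` sketch, with placeholder fetchers standing in for the real API clients:

```python
import asyncio

# Placeholder fetchers; in the app these wrap real HTTP API clients.
async def fetch_weather():
    await asyncio.sleep(0)  # simulate I/O
    return {"condition": "sunny", "temp_c": 21}

async def fetch_news():
    await asyncio.sleep(0)
    return ["Headline one", "Headline two"]

async def fetch_calendar():
    await asyncio.sleep(0)
    return ["10am design review"]

async def gather_context():
    # Run all three lookups concurrently instead of sequentially.
    weather, news, events = await asyncio.gather(
        fetch_weather(), fetch_news(), fetch_calendar()
    )
    return {"weather": weather, "news": news, "events": events}

context = asyncio.run(gather_context())
```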
The AI Stack:
- YuE Models: We deployed YuE's three-stage music generation pipeline on RunPod (a 7B-parameter model for understanding, a 1B model for generation, plus an upsampler for quality). This was the heart of the project.
- LLM Processing: Used language models to translate raw context data into musical specifications - genre tags and structured lyrics
- Gemini: For generating those album covers that make each song feel complete
Infrastructure: The biggest challenge was deployment. We used RunPod's serverless GPU infrastructure to handle the compute-intensive music generation. The models alone are 18.5GB, so we implemented smart caching strategies to keep costs reasonable.
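The caching idea boils down to: stage the weights on a persistent volume and only download when they're missing. A sketch of that check (the mount path and downloader are hypothetical; RunPod network volumes mount at a stable path across serverless invocations):

```python
from pathlib import Path

# Hypothetical persistent-volume path for cached model weights.
CACHE_DIR = Path("/runpod-volume/models")

def ensure_weights(name, download, cache_dir=CACHE_DIR):
    """Download model weights only if they aren't already on the volume."""
    target = cache_dir / name
    if target.exists():
        return target  # warm start: reuse the cached weights
    cache_dir.mkdir(parents=True, exist_ok=True)
    download(target)  # cold start: pay the 18.5GB download once
    return target
```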
Database: Supabase for user management, song storage, and sharing functionality.
Challenges we ran into
Model Loading Times: Our first test run took 12 minutes just to download the models. We quickly realized we needed persistent storage and lazy loading strategies. After optimization, we got warm starts down to about 7 minutes.
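Lazy loading here means deferring each model load until a request actually needs it, then keeping it resident for later requests. A minimal sketch of that pattern (the loader is a stand-in for the real YuE load):

```python
# One cache shared across requests on a warm worker.
_model_cache = {}

def get_model(name, loader):
    """Load a model on first use and keep it resident afterwards."""
    if name not in _model_cache:
        _model_cache[name] = loader()  # expensive: only happens once
    return _model_cache[name]
```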
Memory Management: Running three large AI models in sequence on GPU infrastructure meant we had to be really careful about memory. We implemented model unloading between stages to avoid OOM errors.
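The stage-by-stage pattern looks roughly like this: load one model, run it, then explicitly drop the reference and collect garbage before loading the next (on GPU you would also call `torch.cuda.empty_cache()`; the stage functions below are hypothetical):

```python
import gc

def run_stages(stages, payload):
    """Run model stages sequentially, freeing each model before the next.

    `stages` is a list of (load_fn, run_fn) pairs; only one model is
    resident at a time, which is what keeps peak memory bounded.
    """
    for load_fn, run_fn in stages:
        model = load_fn()
        payload = run_fn(model, payload)
        del model     # drop the only reference to the weights
        gc.collect()  # on GPU, follow with torch.cuda.empty_cache()
    return payload
```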
Context Quality: Getting the LLM to generate good musical specifications from random news and weather data was tricky. We spent a lot of time refining prompts to ensure the genre tags and lyrics actually made sense together.
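What helped was asking for both pieces in one structured reply so the genre tags and lyrics share the same framing. The template below is an illustrative stand-in for the actual prompt we shipped:

```python
def build_song_prompt(context):
    """Turn raw daily context into an LLM prompt asking for a musical spec."""
    return (
        "You are a songwriter. Given today's context, reply with:\n"
        "1. A short list of genre tags (e.g. 'dreamy, lo-fi, downtempo')\n"
        "2. Structured lyrics with [verse] and [chorus] markers\n\n"
        f"Weather: {context['weather']}\n"
        f"Headlines: {'; '.join(context['headlines'])}\n"
        f"Calendar: {'; '.join(context['events'])}\n"
    )

prompt = build_song_prompt({
    "weather": "light rain, 14C",
    "headlines": ["City opens new bike lanes"],
    "events": ["team retro at 4pm"],
})
```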
Cost Control: GPU time isn't cheap. We had to balance between keeping workers warm (for faster response) and letting them idle out (to save money). Each song costs about $0.09-0.23 to generate.
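The per-song cost falls straight out of GPU billing: seconds of compute times the per-second rate. At an assumed serverless rate of about $0.00069/s (illustrative, not a quoted RunPod price), the observed $0.09-0.23 range corresponds to roughly two to five and a half minutes of GPU time:

```python
# Hypothetical serverless GPU rate in USD per second, for illustration only.
RATE_PER_SECOND = 0.00069

def song_cost(generation_seconds):
    """Estimate the dollar cost of one generation run."""
    return round(generation_seconds * RATE_PER_SECOND, 2)
```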
Audio Format Handling: Dealing with base64 encoding, different audio formats, and ensuring playback worked across browsers was more complex than expected.
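At the core of that handling is a base64 round trip: the GPU worker returns raw audio bytes encoded as text inside its JSON response, and the backend decodes them before serving the file with the right MIME type. A minimal sketch:

```python
import base64

def encode_audio(raw: bytes) -> str:
    """Worker side: wrap raw audio bytes as base64 text for JSON transport."""
    return base64.b64encode(raw).decode("ascii")

def decode_audio(payload: str) -> bytes:
    """Backend side: recover the original bytes before serving the file."""
    return base64.b64decode(payload)

fake_mp3 = b"ID3\x03\x00..."  # stand-in for real MP3 bytes
assert decode_audio(encode_audio(fake_mp3)) == fake_mp3
```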
The "It Works on My Machine" Problem: Docker deployment revealed all sorts of environment-specific issues we hadn't encountered locally.
Accomplishments that we're proud of
- It actually works! Generating music from scratch using AI is no small feat, and we pulled it off.
- End-to-end pipeline: From raw data to playable audio with artwork - the whole experience is seamless.
- Smart caching: Our model loading strategy reduced generation time by 40% after the first run.
- Real personalization: The songs genuinely reflect the input context. When it's rainy and your calendar is packed, you get a different vibe than on a sunny, relaxed day.
- Shareable moments: The sharing feature means these aren't just throwaway generations - they're keepsakes you can revisit.
- Clean architecture: Despite the complexity, we kept the codebase organized and maintainable.
What we learned
Technical lessons:
- Serverless GPU computing is powerful but requires careful resource management
- Model quantization and optimization are crucial for practical AI applications
- Audio processing has a lot of gotchas (sample rates, formats, encoding)
- Prompt engineering for creative tasks is an art form in itself
Product lessons:
- Context matters - the same AI can produce wildly different results based on how you frame the input
- Users want fast results - 7 minutes feels like forever in today's world
- The combination of multiple AI systems (LLM + music generation + image generation) creates something greater than the sum of its parts
- Sometimes the best features are the simplest ones - people just want to hit a button and get their song
Process lessons:
- Documentation is your friend when integrating complex systems
- Testing with real data reveals issues you'd never catch with mock data
- Infrastructure decisions early on can save (or cost) you hours later
What's next for TuneTools
We have big dreams for where this could go:
Short term:
- Faster generation: Exploring model optimization and better GPU utilization to get under 2 minutes
- More context sources: Spotify listening history, social media sentiment, fitness data
- Music styles: Expand beyond the current genre options to include more diverse musical styles
- Collaboration: Let friends contribute to each other's daily songs
Medium term:
- Mobile app: Native iOS/Android apps for on-the-go generation
- Playlist creation: Automatically compile your daily songs into weekly or monthly playlists
- Lyrics customization: Let users tweak the generated lyrics before final generation
- Voice integration: Add optional vocal synthesis for singing the lyrics
Long term:
- Real-time generation: Generate background music that adapts as your day unfolds
- Community features: Discover songs from people having similar days
- API access: Let other developers build on top of our music generation pipeline
- Longer compositions: Move beyond 30-60 seconds to full-length songs
The core vision remains the same: turn the data of your life into music you can feel. We're just getting started.