Inspiration
Music genres aren’t fixed; they constantly blend and evolve. We wanted to explore what happens when you can control that blending in real time. Instead of relying only on prompt engineering, we asked: how can we measure whether a genre blend actually worked?
What it does
GenreBlender lets users:
- Select two genres
- Adjust a blending slider (α ∈ [0,1])
- Generate a new AI-composed track
- See how closely the result matches the intended blend
We define the target blend as:
Target = (α · A) + ((1 − α) · B)
where A and B are the two selected genres and α is the slider value.
Our classifier predicts:
P(genre_i | audio)
We compare the predicted distribution to the target, making genre blending measurable and controllable.
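As a rough illustration of that comparison, the sketch below builds the target as a weighted pair of one-hot vectors over the ten GTZAN genre labels and scores a prediction with total variation distance. The one-hot encoding, the function names, and the choice of distance are assumptions made for this example; the writeup does not state which comparison the project uses.

```python
# The ten GTZAN genre labels.
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def target_distribution(genre_a, genre_b, alpha, genres=GENRES):
    """Return alpha * onehot(A) + (1 - alpha) * onehot(B) as a list."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    target = [0.0] * len(genres)
    target[genres.index(genre_a)] += alpha
    target[genres.index(genre_b)] += 1.0 - alpha
    return target


def total_variation(p, q):
    """Half the L1 distance between two distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical usage: compare a made-up classifier output to the 70/30 target.
target = target_distribution("classical", "hiphop", 0.7)
predicted = [0.0, 0.6, 0.0, 0.0, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0]
blend_error = total_variation(target, predicted)
```

A lower `blend_error` means the generated track landed closer to the requested mix.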
How we built it
1. Generative Engine
We used Meta’s MusicGen to generate tracks from weighted prompts like:
“70% classical, 30% hip hop.”
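One way such a prompt could be derived from the slider value is sketched below. The helper name `build_prompt` and the rounding to whole percentages are assumptions for illustration; only the example prompt above comes from the project.

```python
def build_prompt(genre_a, genre_b, alpha):
    """Turn a blend weight into a text prompt such as '70% classical, 30% hip hop'."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    pct_a = round(alpha * 100)   # share given to the first genre
    pct_b = 100 - pct_a          # remainder goes to the second genre
    return f"{pct_a}% {genre_a}, {pct_b}% {genre_b}"


prompt = build_prompt("classical", "hip hop", 0.7)
```

The resulting string would then be passed to MusicGen as the text description.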
2. Neural Genre Classifier
- Trained for 100 epochs on the GTZAN dataset using a multilayer perceptron with 4 layers (3 hidden, 1 output), achieving 92% accuracy on the validation set
- Extracted MFCCs, RMS energy, harmonic/percussive features, and tempo
- Used StandardScaler + LabelEncoder
- Prevented leakage with GroupShuffleSplit
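The preprocessing steps above can be sketched roughly as follows. This assumes each feature row comes from a segment of a longer track and that `groups` holds the source track ID for each row, so that segments of one track never land on both sides of the split. The function name, the 80/20 split, and the seed are illustrative choices, not the project's actual settings.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit
from sklearn.preprocessing import LabelEncoder, StandardScaler


def split_and_scale(features, labels, groups, test_size=0.2, seed=0):
    """Group-aware train/validation split with scaling fit on the training rows only."""
    encoder = LabelEncoder()
    y = encoder.fit_transform(labels)  # genre names -> integer class IDs

    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    train_idx, val_idx = next(splitter.split(features, y, groups=groups))

    scaler = StandardScaler()
    x_train = scaler.fit_transform(features[train_idx])  # fit statistics on train only
    x_val = scaler.transform(features[val_idx])          # reuse them for validation

    return {
        "x_train": x_train, "y_train": y[train_idx],
        "x_val": x_val, "y_val": y[val_idx],
        "train_idx": train_idx, "val_idx": val_idx,
        "scaler": scaler, "encoder": encoder,
    }
```

Keeping the fitted `scaler` and `encoder` around lets inference apply the same transformation that training used.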
MLP architecture:
nn.Linear(input_size, 256)
nn.ReLU()
nn.Dropout(0.3)
nn.Linear(256, 128)
nn.ReLU()
nn.Dropout(0.3)
nn.Linear(128, 64)
nn.ReLU()
nn.Dropout(0.3)
nn.Linear(64, num_classes)
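Wrapped in `nn.Sequential`, the layer list above might look like the sketch below. It assumes the final linear layer emits raw logits and that a softmax is applied at inference time to obtain P(genre_i | audio); the writeup does not say where the softmax lives, and the helper names are hypothetical.

```python
import torch
from torch import nn


def build_classifier(input_size, num_classes):
    """Three hidden layers (256, 128, 64) with ReLU and dropout, plus an output layer."""
    return nn.Sequential(
        nn.Linear(input_size, 256), nn.ReLU(), nn.Dropout(0.3),
        nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.3),
        nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.3),
        nn.Linear(64, num_classes),  # assumed to output logits, not probabilities
    )


def predict_probabilities(model, features):
    """Return one probability distribution over genres per input row."""
    model.eval()  # disables dropout for inference
    with torch.no_grad():
        logits = model(features)
    return torch.softmax(logits, dim=1)
```

For example, `predict_probabilities(build_classifier(20, 10), torch.randn(4, 20))` yields a 4×10 tensor whose rows each sum to 1 (the feature size of 20 is arbitrary here).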
Pipeline:
- User selects genres + α
- MusicGen generates audio
- Features extracted
- Classifier predicts genre probabilities
- Results returned to frontend
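The flow above can be sketched as a single function that wires the stages together. The three callables (`generate`, `extract`, `classify`) are hypothetical stand-ins for the MusicGen call, the feature extractor, and the trained classifier; their names and signatures are assumptions made for this example.

```python
def run_blend(genre_a, genre_b, alpha, generate, extract, classify, genres):
    """Run one generation + evaluation pass and return a result for the frontend."""
    pct_a = round(alpha * 100)
    prompt = f"{pct_a}% {genre_a}, {100 - pct_a}% {genre_b}"

    audio = generate(prompt)          # text prompt -> audio
    features = extract(audio)         # audio -> feature vector
    probabilities = classify(features)  # feature vector -> one probability per genre

    return {
        "prompt": prompt,
        "probabilities": dict(zip(genres, probabilities)),
    }


# Hypothetical usage with stub stages in place of the real models.
result = run_blend(
    "classical", "hip hop", 0.7,
    generate=lambda prompt: [0.0] * 16,
    extract=lambda audio: [sum(audio)],
    classify=lambda feats: [0.7, 0.3],
    genres=["classical", "hip hop"],
)
```

Passing the stages in as arguments keeps the orchestration testable without a GPU or model weights.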
Challenges we ran into
- CUDA/GPU compatibility issues
- Ensuring identical preprocessing during training and inference
- Preventing data leakage in audio splits
- Optimizing generation speed for real-time demos
Accomplishments that we're proud of
- Built both a generative model interface and a custom evaluation model
- Achieved 92% validation accuracy
- Created a closed-loop controllable generation system
- Made music blending interactive, interpretable, and quantifiable
What we learned
- Prompt engineering alone isn’t enough for controllable AI
- Evaluation models improve interpretability
- Preventing data leakage is critical
- Thinking in systems (generation + evaluation) leads to stronger AI applications
What's next for GenreBlender: Generative AI Music Mixer & Classifier
- Use embedding-level interpolation instead of prompt-only blending
- Add more genres and larger datasets
- Introduce divergence metrics to quantify blend accuracy
- Deploy a scalable GPU-hosted version
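As one possible shape for the divergence metric mentioned above, the sketch below computes the Jensen-Shannon divergence between a target blend and a predicted distribution. JS divergence is only one candidate, chosen here because it is symmetric and stays finite when the target contains zeros; the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits, skipping terms where p is zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical inputs, 1 for disjoint ones."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # midpoint distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Because the midpoint `m` is nonzero wherever `p` or `q` is nonzero, the logarithm never divides by zero.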
Built With
- joblib
- mlp
- numpy
- pandas
- python
- pytorch
- scikit-learn
- streamlit