Inspiration

We love tabletop role-playing games (TTRPGs) and know how much atmosphere matters in a great session. Music can make or break an encounter, but constantly managing soundtracks can distract a Dungeon Master (DM) from storytelling. We wanted to create an AI-driven assistant that listens to the DM and dynamically generates fitting background music to enhance immersion.

What it does

Dungeon DJ listens to the DM in real time, transcribes their speech into text, and analyzes it to determine the mood or theme of the scene. It then generates an adaptive soundtrack using Meta’s MusicGen model and plays it continuously, updating as the game progresses. This ensures that the music always fits the moment—whether it's a tense battle, an eerie dungeon crawl, or a triumphant victory.
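The flow described above can be sketched as a small control loop. This is only an illustration, not the project's actual code: `run_session`, `is_scene_text`, and `generate_track` are hypothetical names, and the classifier and music generator are passed in as plain callables so the loop itself stays self-contained.

```python
from typing import Callable, Iterable, List, Optional


def run_session(
    sentences: Iterable[str],
    is_scene_text: Callable[[str], bool],
    generate_track: Callable[[str], str],
) -> List[str]:
    """Return the track that would be playing after each transcribed sentence."""
    current_track: Optional[str] = None
    now_playing: List[str] = []
    for sentence in sentences:
        # Only scene-setting narration triggers a new soundtrack.
        if is_scene_text(sentence):
            current_track = generate_track(sentence)
        # Otherwise the previous track simply keeps looping.
        now_playing.append(current_track or "silence")
    return now_playing
```

In a real session, `sentences` would come from the speech-to-text step and `generate_track` would call the music model; here they are stand-ins so that table chatter visibly leaves the current track untouched.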

How we built it

  • Speech-to-Text Conversion: We convert the DM’s speech into text and log it to an output.txt file.
  • Text Classification: Using a NaiveBayesClassifier machine learning model, we keep only the sentences that indicate the scene’s mood or action and discard the rest.
  • Music Generation: The filtered text is then fed into Meta’s MusicGen, which generates an AI-composed soundtrack tailored to the scene.
  • Continuous Playback: The generated track loops until a new one is created, ensuring seamless background audio throughout the game.
  • Tech Stack: We built Dungeon DJ entirely in Python, using Django for the backend and Hugging Face's API for MusicGen for music generation.
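To make the text classification step more concrete, here is a minimal naive Bayes sketch in pure Python. It is a stand-in written for illustration, not the NaiveBayesClassifier the project actually used: the class `TinyNaiveBayes`, the labels `scene` and `chatter`, and the training sentences are all assumptions.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self, examples: Iterable[Tuple[str, str]]) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.label_counts: Counter = Counter()
        for text, label in examples:
            self.label_counts[label] += 1
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}

    def classify(self, text: str) -> str:
        total_docs = sum(self.label_counts.values())
        best_label, best_score = "", -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Start from the log prior, then add one log likelihood per token.
            score = math.log(doc_count / total_docs)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: scene narration versus table chatter.
TRAIN = [
    ("The goblins charge with blades drawn", "scene"),
    ("A cold wind howls through the ruined crypt", "scene"),
    ("You enter a quiet tavern lit by candles", "scene"),
    ("The dragon roars and the battle begins", "scene"),
    ("Can someone pass the chips", "chatter"),
    ("Whose turn is it again", "chatter"),
    ("Hold on let me check my phone", "chatter"),
    ("I need to roll for that right", "chatter"),
]

classifier = TinyNaiveBayes(TRAIN)
```

Only sentences that `classifier.classify(...)` labels as `scene` would be passed on to music generation. A real deployment would need far more training data than these eight sentences.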

Challenges we ran into

  • Speech Accuracy: Ensuring the speech-to-text conversion correctly captures fantasy-specific terms and proper nouns used in D&D.
  • Text Filtering: Training the NaiveBayesClassifier to reliably distinguish relevant descriptions from casual table chatter.
  • Latency: Reducing the delay between scene changes and new music generation to maintain immersion.
  • Looping Audio Seamlessly: Avoiding abrupt transitions when switching between tracks.

Accomplishments that we're proud of

  • Successfully integrating real-time speech recognition with AI-generated music.
  • Training a classifier that accurately extracts meaningful descriptions from DM narration.
  • Creating an adaptive, hands-free tool that enhances storytelling without distracting the DM.
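The looping challenge listed above, avoiding abrupt transitions when switching tracks, is commonly handled with a crossfade. The sketch below shows a linear crossfade over raw sample lists. It is an assumption about one possible approach, not a description of what Dungeon DJ does, and the function name `crossfade` and its parameters are hypothetical.

```python
from typing import List, Sequence


def crossfade(
    outgoing: Sequence[float], incoming: Sequence[float], overlap: int
) -> List[float]:
    """Blend the tail of `outgoing` into the head of `incoming` over `overlap` samples."""
    if overlap < 0 or overlap > min(len(outgoing), len(incoming)):
        raise ValueError("overlap must fit inside both tracks")
    split = len(outgoing) - overlap
    head = list(outgoing[:split])
    tail = outgoing[split:]
    blended = []
    for i in range(overlap):
        # The weight moves linearly from the old track toward the new one.
        t = (i + 1) / (overlap + 1)
        blended.append((1.0 - t) * tail[i] + t * incoming[i])
    return head + blended + list(incoming[overlap:])
```

With real audio, `overlap` would typically be the sample rate multiplied by the desired fade length in seconds, and an equal-power curve may sound smoother than the linear ramp used here.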

What we learned

  • How to train and tune a NaiveBayesClassifier for text classification in a live application.
  • The intricacies of working with Hugging Face's API for MusicGen.
  • Optimizing Python performance for real-time applications.
  • The importance of seamless UX when designing AI-driven tools.

What's next for Dungeon DJ

  • Customizable Music Styles: Allowing DMs to set preferred musical themes (e.g., orchestral, synthwave, medieval).
  • Improved Speech Recognition: Enhancing accuracy for fantasy-specific terms.
  • Voice-Based Commands: Letting DMs manually trigger scene transitions with simple phrases.
  • Cloud-Based Version: Enabling Dungeon DJ to run online for remote D&D sessions.
  • Expanded AI Integration: Exploring ways to dynamically adjust tempo and instrument choice based on game intensity.

Built With

  • django
  • huggingface
  • musicgen
  • textblob