Inspiration

This project is inspired by the K-pop Demon Hunters movie; all rights belong to its creators. If you haven't seen it yet, it's well worth watching!

What it does

This platform lets users pick from a curated library of songs optimized for demon hunting. Users sing along with timed lyrics that make the song easy to follow, then receive AI-generated feedback on their vocal performance through Amazon Bedrock.

How we built it

We built the frontend with React and Tailwind, sticking with what works. The backend runs on Python (Flask) because Python's audio processing libraries, such as librosa, handle both user recordings and target songs. We integrated Amazon Bedrock for the AI layer, which gives us access to multiple models with flexible interaction controls.

Features

  • 🎵 Curated song selection for maximum demon-hunting effectiveness
  • 📝 Timed lyrics for easy follow-along
  • 🤖 AI-powered vocal analysis using Amazon Bedrock
  • 📊 Real-time feedback on your singing performance

Tech Stack

  • Frontend: React, Tailwind CSS
  • Backend: Python (Flask)
  • AI: Amazon Bedrock, ElevenLabs
  • Audio Processing: librosa

Analysis Pipeline

  1. Audio Upload & Validation
  • Accepts audio files in multiple formats (WAV, MP3, OGG, FLAC, M4A, WEBM)
  • Validates file size (max 50 MB), duration (1 s to 5 min), and audio quality
  • Checks for silence, corrupted data, and proper sample rates
  2. Pitch Extraction
  • Uses librosa's YIN algorithm for accurate monophonic pitch detection
  • Downsamples audio to 16 kHz for faster processing
  • Extracts fundamental frequency (f0) for both reference and user recordings
  • Range: C2 (~65 Hz) to C7 (~2093 Hz) covering typical vocal ranges
  3. Pitch Comparison
  • Time-aligns user recording with reference track
  • Compares pitch frame-by-frame at corresponding moments
  • Calculates differences in cents (musical units where 100 cents = 1 semitone)
  • Filters out silent/unvoiced sections for accurate analysis
  4. Scoring Metrics
  • Within 25 cents: Very accurate singing (professional level)
  • Within 50 cents: Good pitch accuracy (generally considered "in tune")
  • Within 100 cents: Fair accuracy (audibly off-pitch but recognizable)
  • Provides average, median, and maximum pitch errors
  • Overall score (1-4): Based on percentage of notes within 50 cents
  5. AI Feedback Generation
  • Sends pitch analysis data to Amazon Bedrock for personalized feedback
  • Uses an LLM to generate constructive commentary on the performance
  • Converts feedback to speech using the ElevenLabs text-to-speech API
  • Returns both text and audio feedback to the user
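
As a rough illustration of the pitch comparison and scoring steps above, the sketch below computes per-frame errors in cents and the share of frames within 25, 50, and 100 cents. It is not the project's actual code: the function names `cents_error` and `pitch_metrics` are hypothetical, and it assumes the two f0 sequences are already time-aligned, with unvoiced frames marked as `None` or `0`. How the percentages map to the overall 1-4 score is not shown, since the thresholds are not stated here.

```python
import math
from statistics import mean, median


def cents_error(f_user, f_ref):
    """Signed pitch difference in cents (100 cents = 1 semitone, 1200 = 1 octave)."""
    return 1200.0 * math.log2(f_user / f_ref)


def pitch_metrics(user_f0, ref_f0):
    """Compare two time-aligned f0 sequences given in Hz.

    Frames where either value is None or 0 are treated as silent/unvoiced
    and skipped. Returns None if no voiced frame pair remains.
    """
    errors = [
        abs(cents_error(u, r))
        for u, r in zip(user_f0, ref_f0)
        if u and r  # keep only frames voiced in both recordings
    ]
    if not errors:
        return None

    def share(limit):
        # Fraction of compared frames whose error is at most `limit` cents
        return sum(e <= limit for e in errors) / len(errors)

    return {
        "within_25": share(25),
        "within_50": share(50),
        "within_100": share(100),
        "mean_cents": mean(errors),
        "median_cents": median(errors),
        "max_cents": max(errors),
    }


if __name__ == "__main__":
    # Hypothetical frames: reference holds A4 (440 Hz); the last frame is unvoiced.
    reference = [440.0, 440.0, 440.0, 440.0, None]
    user = [440.0, 446.0, 460.0, 880.0, 300.0]
    print(pitch_metrics(user, reference))
```

In practice the f0 sequences would come from a pitch tracker such as librosa's YIN implementation, run on both recordings at the same sample rate and hop length so that frames line up.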

What's next for Demon Hunterz

The roadmap includes several features: a music API integration for unlimited song selection, richer gamification with score tracking and reward systems, user accounts for saving progress, and a friends feature for comparing scores and friendly competition.
