ChordMiniApp Project Story
About the Project
ChordMiniApp is an advanced AI-powered music analysis platform that transforms how musicians, researchers, and music enthusiasts interact with musical content.
What Inspired This Project
The inspiration came from a fundamental challenge in music education and research: the lack of accessible, accurate tools for real-time chord recognition and beat analysis. Traditional music analysis requires years of training, while existing automated tools often lack the precision needed for educational or research purposes.
I recognized that while generative AI models were gaining attention, transcriptive models remained essential foundational components for:
- Language model training requiring structured musical representations
- Educational infrastructure for music learning
- Research datasets for advanced generative models
- Interpretable musical analysis that musicians can understand and verify
What I Learned
This project became a masterclass in full-stack ML engineering and music information retrieval:
Technical Challenges Overcome:
- Multi-model orchestration: Integrating Beat-Transformer, Chord-CNN-LSTM, and BTC models with intelligent fallback strategies
- Audio processing at scale: Handling file size limits, format conversions, and real-time processing constraints
- Synchronization complexity: Aligning beats, chords, and lyrics to a master timeline with millisecond precision
- Cross-platform deployment: Ensuring models work across local development (GPU-accelerated) and cloud production (CPU-optimized)
Music Theory Integration:
- Implementing Roman numeral analysis with proper enharmonic handling
- Synchronizing word-level lyrics timing with beat-aligned timestamps
- Managing chord inversions and slash chord notation for guitar diagrams
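The enharmonic handling mentioned above can be sketched with pitch classes, so that spellings like C# and Db resolve to the same scale degree. This is an illustrative sketch, not the app's actual implementation; the function and table names are assumptions, and it only covers diatonic chords in major keys:

```python
from typing import Optional

# Map note spellings to pitch classes so enharmonic equivalents (C# / Db) collide.
NOTE_TO_PC = {
    "C": 0, "B#": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3,
    "E": 4, "Fb": 4, "F": 5, "E#": 5, "F#": 6, "Gb": 6, "G": 7,
    "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11, "Cb": 11,
}
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]           # major-scale degrees in semitones
NUMERALS = ["I", "II", "III", "IV", "V", "VI", "VII"]

def roman_numeral(chord_root: str, quality: str, key: str) -> Optional[str]:
    """Roman numeral of a chord in a major key; None if the root is non-diatonic."""
    interval = (NOTE_TO_PC[chord_root] - NOTE_TO_PC[key]) % 12
    if interval not in MAJOR_SCALE:
        return None                             # chromatic chord: needs richer analysis
    numeral = NUMERALS[MAJOR_SCALE.index(interval)]
    return numeral.lower() if quality == "min" else numeral

print(roman_numeral("Db", "maj", "Ab"))   # IV  (Db is the subdominant of Ab major)
print(roman_numeral("C#", "min", "A"))    # iii (same pitch class works for Db minor too)
```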
How I Built This Project
Architecture Philosophy: Reliability through layered fallbacks
The system employs a hybrid architecture combining modern web technologies with specialized ML backends:
Backend Architecture (Python Flask + ML Models):
- Service-oriented design with clear separation between orchestration and detection
- Intelligent model selection based on file size, availability, and performance requirements
- Robust error handling with graceful degradation
Beat-Aligned Master Timeline: All components (chords, lyrics, markers) synchronize to beat timestamps
Multi-Environment Audio Extraction:
- Development: yt-dlp (localhost:5001, avoiding macOS AirPlay conflicts)
- Production: yt-mp3-go service with QuickTube fallback
- File upload: Vercel Blob storage with direct processing
Progressive Enhancement: Core functionality works without JavaScript, with enhanced features layered on top
Challenges I Faced
1. Model Integration Complexity
The biggest challenge was orchestrating multiple ML models with different requirements:
- Beat-Transformer: GPU-accelerated, 100MB file limit
- Chord-CNN-LSTM: CPU-friendly, ensemble of 5 models
- BTC variants: High accuracy, memory-intensive
Solution: Implemented intelligent detector selection with automatic fallbacks based on file size and system capabilities.
2. Audio Processing Pipeline
Managing audio extraction across different environments while maintaining reliability:
```mermaid
graph LR
M{Environment Detection} -->|Development| N[yt-dlp Service]
M -->|Production| O[yt-mp3-go Service]
N --> P[Audio URL Generation]
O --> P
```
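The routing in the diagram reduces to a small environment check. A minimal sketch, assuming an `APP_ENV` variable (the actual variable name and function are illustrative; the service names come from the project):

```python
import os

def pick_extraction_service() -> str:
    """Route YouTube audio extraction by deployment environment."""
    if os.environ.get("APP_ENV") == "production":
        return "yt-mp3-go"   # production service; QuickTube serves as the fallback
    return "yt-dlp"          # local service on :5001 (port 5000 collides with macOS AirPlay)
```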
3. Real-time Synchronization
Achieving millisecond-precise alignment between beats, chords, and lyrics required developing a master timeline system where all components snap to beat-aligned timestamps.
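The snapping step can be illustrated with a nearest-beat lookup over the sorted beat timestamps; binary search keeps it O(log n) per event. This is a sketch of the idea, not the app's code:

```python
from bisect import bisect_left

def snap_to_beat(t: float, beats: list) -> float:
    """Return the beat timestamp (seconds) nearest to event time t."""
    i = bisect_left(beats, t)
    if i == 0:
        return beats[0]              # before the first beat
    if i == len(beats):
        return beats[-1]             # after the last beat
    before, after = beats[i - 1], beats[i]
    return before if t - before <= after - t else after

beats = [0.0, 0.5, 1.0, 1.5, 2.0]
print(snap_to_beat(0.74, beats))     # 0.5 — a chord onset snaps to the closer beat
```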
4. Cross-platform Deployment
Ensuring models work across:
- Local development (macOS with GPU acceleration)
- Google Cloud Run (containerized CPU deployment)
- Docker environments (multi-architecture support)
5. Performance Optimization
Balancing accuracy with speed through:
- Lazy model loading and memory optimization
- Intelligent caching strategies (Firebase + Vercel Blob)
- Progressive loading with real-time feedback
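Lazy model loading, in particular, can be sketched with an in-process cache: each model loads on first request and is reused afterwards, which keeps cold-start memory low on CPU-only Cloud Run instances. The names here are placeholders, and the dict stands in for a real PyTorch model load:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_model(name: str):
    """Load a model on first use; subsequent calls return the cached instance."""
    # In the real service this would load weights from disk (e.g. via torch.load);
    # a placeholder dict stands in for the loaded model object here.
    print(f"loading {name} ...")     # executes only once per model name
    return {"name": name, "loaded": True}

get_model("chord-cnn-lstm")          # triggers the load
get_model("chord-cnn-lstm")          # served from cache, no second load
```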
Built With
Frontend Technologies
- Next.js 15.3.1 - React framework with App Router
- React 19 - Latest UI library features
- TypeScript - Type-safe development
- Tailwind CSS - Utility-first styling
- Framer Motion - Smooth animations
- HeroUI - Modern component library
Backend & ML Stack
- Python 3.9+ Flask - Lightweight backend framework
- PyTorch 2.6.0 - ML model inference
- TensorFlow 2.15.1 - Spleeter audio separation
- librosa 0.10.1 - Audio feature extraction
- madmom 0.16.1 - Beat detection algorithms
Machine Learning Models
- Beat-Transformer - State-of-the-art beat detection
- Chord-CNN-LSTM - Ensemble chord recognition
- BTC (SL/PL) - Advanced chord analysis variants
- Spleeter - 5-stem audio separation
Cloud Services & APIs
- Google Cloud Run - Serverless ML backend deployment
- Firebase Firestore - NoSQL caching database
- Vercel Blob - File storage for audio processing
- Google Gemini API - AI translations and key detection
- Music.ai SDK - Lyrics transcription
- YouTube Search API - Video discovery
Audio Processing
- yt-dlp - YouTube audio extraction (development)
- yt-mp3-go - Production audio extraction service
- QuickTube - Fallback extraction service
- FFmpeg - Audio format conversion
Development Tools
- Jest + Testing Library - Comprehensive testing suite
- ESLint + TypeScript - Code quality and type safety
- Docker - Containerized deployment
- GitHub Actions - CI/CD pipeline
Try It Out
Live Demo
🌐 chordmini.me - Full production deployment
Source Code
📂 GitHub Repository - Complete source with documentation
Research Applications
This platform serves as infrastructure for music information retrieval research, providing:
- High-quality transcriptive datasets for training generative models
- Benchmarking tools for chord recognition algorithms
- Educational resources for music theory and analysis
ChordMiniApp makes advanced music analysis accessible to everyone from students to researchers.