ChordMiniApp Project Story
About the Project
ChordMiniApp is an advanced AI-powered music analysis platform that transforms how musicians, researchers, and music enthusiasts interact with musical content.
What Inspired This Project
The inspiration came from a fundamental challenge in music education and research: the lack of accessible, accurate tools for real-time chord recognition and beat analysis. Traditional music analysis requires years of training, while existing automated tools often lack the precision needed for educational or research purposes.
I recognized that while generative AI models were gaining attention, transcriptive models remained essential foundational components for:
- Language model training requiring structured musical representations
- Educational infrastructure for music learning
- Research datasets for advanced generative models
- Interpretable musical analysis that musicians can understand and verify
What I Learned
This project became a masterclass in full-stack ML engineering and music information retrieval:
Technical Challenges Overcome:
- Multi-model orchestration: Integrating Beat-Transformer, Chord-CNN-LSTM, and BTC models with intelligent fallback strategies
- Audio processing at scale: Handling file size limits, format conversions, and real-time processing constraints
- Synchronization complexity: Aligning beats, chords, and lyrics to a master timeline with millisecond precision
- Cross-platform deployment: Ensuring models work across local development (GPU-accelerated) and cloud production (CPU-optimized)
Music Theory Integration:
- Implementing Roman numeral analysis with proper enharmonic handling
- Synchronizing word-level lyrics timing with beat-aligned timestamps
- Managing chord inversions and slash chord notation for guitar diagrams
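The enharmonic handling mentioned above can be sketched with pitch classes, so that spellings like C# and Db resolve to the same scale degree. This is an illustrative sketch, not the app's actual implementation; the function and table names are assumptions, and it only covers diatonic chords in major keys:

```python
from typing import Optional

# Map note spellings to pitch classes so enharmonic equivalents (C# / Db) collide.
NOTE_TO_PC = {
    "C": 0, "B#": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3,
    "E": 4, "Fb": 4, "F": 5, "E#": 5, "F#": 6, "Gb": 6, "G": 7,
    "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11, "Cb": 11,
}
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]           # major-scale degrees in semitones
NUMERALS = ["I", "II", "III", "IV", "V", "VI", "VII"]

def roman_numeral(chord_root: str, quality: str, key: str) -> Optional[str]:
    """Roman numeral of a chord in a major key; None if the root is non-diatonic."""
    interval = (NOTE_TO_PC[chord_root] - NOTE_TO_PC[key]) % 12
    if interval not in MAJOR_SCALE:
        return None                             # chromatic chord: needs richer analysis
    numeral = NUMERALS[MAJOR_SCALE.index(interval)]
    return numeral.lower() if quality == "min" else numeral

print(roman_numeral("Db", "maj", "Ab"))   # IV  (Db is the subdominant of Ab major)
print(roman_numeral("C#", "min", "A"))    # iii (same pitch class works for Db minor too)
```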
How I Built This Project
Architecture Philosophy: Reliability through layered fallbacks
The system employs a hybrid architecture combining modern web technologies with specialized ML backends:
Backend Architecture (Python Flask + ML Models):
- Service-oriented design with clear separation between orchestration and detection
- Intelligent model selection based on file size, availability, and performance requirements
- Robust error handling with graceful degradation
Beat-Aligned Master Timeline: All components (chords, lyrics, markers) synchronize to beat timestamps
Multi-Environment Audio Extraction:
- Development: yt-dlp (localhost:5001, avoiding macOS AirPlay conflicts)
- Production: yt-mp3-go service with QuickTube fallback
- File upload: Vercel Blob storage with direct processing
Progressive Enhancement: Core functionality works without JavaScript, with enhanced features layered on top
Challenges I Faced
1. Model Integration Complexity
The biggest challenge was orchestrating multiple ML models with different requirements:
- Beat-Transformer: GPU-accelerated, 100MB file limit
- Chord-CNN-LSTM: CPU-friendly, ensemble of 5 models
- BTC variants: High accuracy, memory-intensive
Solution: Implemented intelligent detector selection with automatic fallbacks based on file size and system capabilities.
2. Audio Processing Pipeline
Managing audio extraction across different environments while maintaining reliability:
```mermaid
graph LR
M{Environment Detection} -->|Development| N[yt-dlp Service]
M -->|Production| O[yt-mp3-go Service]
N --> P[Audio URL Generation]
O --> P
```
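The routing in the diagram reduces to a small environment check. A minimal sketch, assuming an `APP_ENV` variable (the actual variable name and function are illustrative; the service names come from the project):

```python
import os

def pick_extraction_service() -> str:
    """Route YouTube audio extraction by deployment environment."""
    if os.environ.get("APP_ENV") == "production":
        return "yt-mp3-go"   # production service; QuickTube serves as the fallback
    return "yt-dlp"          # local service on :5001 (port 5000 collides with macOS AirPlay)
```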
3. Real-time Synchronization
Achieving millisecond-precise alignment between beats, chords, and lyrics required developing a master timeline system where all components snap to beat-aligned timestamps.
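The snapping step can be illustrated with a nearest-beat lookup over the sorted beat timestamps; binary search keeps it O(log n) per event. This is a sketch of the idea, not the app's code:

```python
from bisect import bisect_left

def snap_to_beat(t: float, beats: list) -> float:
    """Return the beat timestamp (seconds) nearest to event time t."""
    i = bisect_left(beats, t)
    if i == 0:
        return beats[0]              # before the first beat
    if i == len(beats):
        return beats[-1]             # after the last beat
    before, after = beats[i - 1], beats[i]
    return before if t - before <= after - t else after

beats = [0.0, 0.5, 1.0, 1.5, 2.0]
print(snap_to_beat(0.74, beats))     # 0.5 — a chord onset snaps to the closer beat
```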
4. Cross-platform Deployment
Ensuring models work across:
- Local development (macOS with GPU acceleration)
- Google Cloud Run (containerized CPU deployment)
- Docker environments (multi-architecture support)
5. Performance Optimization
Balancing accuracy with speed through:
- Lazy model loading and memory optimization
- Intelligent caching strategies (Firebase + Vercel Blob)
- Progressive loading with real-time feedback
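Lazy model loading, in particular, can be sketched with an in-process cache: each model loads on first request and is reused afterwards, which keeps cold-start memory low on CPU-only Cloud Run instances. The names here are placeholders, and the dict stands in for a real PyTorch model load:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_model(name: str):
    """Load a model on first use; subsequent calls return the cached instance."""
    # In the real service this would load weights from disk (e.g. via torch.load);
    # a placeholder dict stands in for the loaded model object here.
    print(f"loading {name} ...")     # executes only once per model name
    return {"name": name, "loaded": True}

get_model("chord-cnn-lstm")          # triggers the load
get_model("chord-cnn-lstm")          # served from cache, no second load
```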
Built With
Frontend Technologies
- Next.js 15.3.1 - React framework with App Router
- React 19 - Latest UI library features
- TypeScript - Type-safe development
- Tailwind CSS - Utility-first styling
- Framer Motion - Smooth animations
- HeroUI - Modern component library
Backend & ML Stack
- Python 3.9+ Flask - Lightweight backend framework
- PyTorch 2.6.0 - ML model inference
- TensorFlow 2.15.1 - Spleeter audio separation
- librosa 0.10.1 - Audio feature extraction
- madmom 0.16.1 - Beat detection algorithms
Machine Learning Models
- Beat-Transformer - State-of-the-art beat detection
- Chord-CNN-LSTM - Ensemble chord recognition
- BTC (SL/PL) - Advanced chord analysis variants
- Spleeter - 5-stem audio separation
Cloud Services & APIs
- Google Cloud Run - Serverless ML backend deployment
- Firebase Firestore - NoSQL caching database
- Vercel Blob - File storage for audio processing
- Google Gemini API - AI translations and key detection
- Music.ai SDK - Lyrics transcription
- YouTube Search API - Video discovery
Audio Processing
- yt-dlp - YouTube audio extraction (development)
- yt-mp3-go - Production audio extraction service
- QuickTube - Fallback extraction service
- FFmpeg - Audio format conversion
Development Tools
- Jest + Testing Library - Comprehensive testing suite
- ESLint + TypeScript - Code quality and type safety
- Docker - Containerized deployment
- GitHub Actions - CI/CD pipeline
Try It Out
Live Demo
🌐 chordmini.me - Full production deployment
Source Code
📂 GitHub Repository - Complete source with documentation
Research Applications
This platform serves as infrastructure for music information retrieval research, providing:
- High-quality transcriptive datasets for training generative models
- Benchmarking tools for chord recognition algorithms
- Educational resources for music theory and analysis
ChordMiniApp makes advanced music analysis accessible to everyone from students to researchers.