Inspiration

Right now, people who make music hit a strange wall. When someone picks up piano, guitar, or any other instrument, they usually want to play music they already love. But getting proper notes for that song is a pain. What is on the internet is either locked behind paywalls or arranged at the wrong level, too advanced or too basic. Imagine someone who has just started learning piano and searches for easy notes for a famous song. They keep clicking links and videos, but nothing fits: too hard, too expensive, or just missing. After a while they feel stuck and slowly stop trying. We have seen that happen again and again, and it is honestly sad.

On top of that, there is a second problem with music files. Sheet music is like a picture that shows all the notes. MIDI is a computer file that holds the notes, like a recipe for a song. Both are super useful, but annoying to find and annoying to fix. If these files are messy or wrong, there is not much you can do unless you know complicated music software. Most beginners have no chance, so they just give up, which kills the fun of learning music.

We looked at all this and said: this feels old and broken. People should be able to look up a song they love and actually play it without crying inside. So we decided to build something fresh: a system that can go on the internet, find music files, clean them, make them playable, adjust them for different skill levels, and let normal people change little things inside their browser and hear the result right away. No scary software, no paywall traps, no weird setup. Just search, choose, fix, listen, play. That is the spark behind this project. We wanted learning music to feel exciting again instead of frustrating and confusing.

What it does

This platform is a full-stack AI music arranger and sheet-music editor. A user searches for a song, and the system finds candidate sheet-music/MIDI sources on the web. The pipeline scrapes them with Selenium + Chrome, converts them with the MuseScore CLI (mscore), parses MIDI with mido, quantizes timing with numpy, and turns everything into a structured JSON event format. Machine learning models (Magenta client-side models, and TensorFlow/PyTorch server-side critics) filter noisy inputs, normalize tempo/key, and generate better arrangements. Users see rendered sheet music in the browser via opensheetmusicdisplay (OSMD), can click measures and ask for revisions powered by LLMs (Gemini/OpenAI) and Magenta models, and can then preview playback using Tone.js + SoundFontPlayer. Everything (MIDI, JSON, MusicXML, metadata, critic scores) is saved to Firestore, and large assets go to S3/GCS. Elasticsearch indexes MusicXML tokens, titles, composers, and tags for fast search.
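
To make the quantization and JSON event step concrete, here is a minimal sketch. It is not the project's actual code: the field names (pitch, start, duration), the function name quantize_events, and the 16th-note grid are assumptions for illustration, and it uses plain Python instead of numpy so it runs without extra installs.

```python
import json


def quantize_events(notes, ticks_per_beat=480, steps_per_beat=4):
    """Snap note starts and durations to the nearest grid step.

    notes: list of dicts with 'pitch', 'start', 'duration' in MIDI ticks
    (a hypothetical event shape, not the project's real schema).
    """
    step = ticks_per_beat // steps_per_beat  # ticks per grid step (16th notes by default)
    events = []
    for note in notes:
        start = round(note["start"] / step) * step
        # Keep at least one grid step so very short notes do not vanish.
        duration = max(step, round(note["duration"] / step) * step)
        events.append({"pitch": note["pitch"], "start": start, "duration": duration})
    # Stable ordering makes the JSON output easy to diff and index.
    events.sort(key=lambda e: (e["start"], e["pitch"]))
    return events


# Slightly sloppy timings, like a human performance captured as MIDI.
raw = [
    {"pitch": 60, "start": 7, "duration": 230},
    {"pitch": 64, "start": 245, "duration": 110},
    {"pitch": 67, "start": 470, "duration": 20},
]
quantized = quantize_events(raw)
print(json.dumps({"ticks_per_beat": 480, "events": quantized}, indent=2))
```

With a 120-tick grid, the three notes land on ticks 0, 240, and 480, which is the kind of cleanup that makes scraped MIDI readable as sheet music.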

How we built it

On the frontend, we used Next.js (React 18) for UI and API routes, OSMD for MusicXML rendering, Tone.js for low-latency audio playback, @magenta/music for SoundFontPlayer and models (MusicRNN, PerformanceRNN, MusicVAE), and CSS modules for styling. On the backend, we used Node.js (Next.js API routes), Firebase Admin SDK for secure Firestore writes and custom claims, Firebase client SDK for auth and light reads, and Elasticsearch (@elastic/elasticsearch) for full-text + structured search. For scraping and pipelines, we used Python 3.10+, Flask, Selenium + headless Chrome, mido for MIDI parsing/generation, MuseScore CLI for MusicXML⇄MIDI conversions, numpy for timing and quantization, and LLMs (Gemini/OpenAI) for conductor-style logic and source vetting. For storage & hosting, we used Firestore for metadata, S3/GCS for large assets, and Elastic Cloud/self-hosted Elastic for search.
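
As a rough illustration of the MusicXML to MIDI conversion step, the sketch below wraps the MuseScore command line in Python. It assumes the executable is called mscore and is on PATH (the name differs between installs), and it relies on the -o option, where MuseScore picks the output format from the output file's extension. The helper names build_convert_command and convert are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(src, dst, binary="mscore"):
    """Build the MuseScore CLI call: mscore -o <output> <input>."""
    return [binary, "-o", str(dst), str(src)]


def convert(src, dst, binary="mscore"):
    """Run the conversion; raises if MuseScore is missing or exits non-zero."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    subprocess.run(build_convert_command(src, dst, binary), check=True)
    return Path(dst)


# Building the command has no side effects, so it is safe to inspect or log.
command = build_convert_command("song.musicxml", "song.mid")
print(" ".join(command))
```

Keeping command construction separate from execution makes the step easy to test in a container that does not have MuseScore installed.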

Challenges that we ran into

Scraping music files from random websites was messy, so the Selenium logic had to detect .mid, .midi, .xml, and .musicxml links reliably. MIDI files needed heavy cleaning, and quantization and tempo normalization with numpy required tuning. The ML critic loop was tricky because we had to define measurable quality signals (rhythm density, voice clarity, melodic similarity, human-likeness). Browser playback was also hard because Tone.js, SoundFontPlayer, and the Magenta models had to stay in sync and avoid audio delay. Secure Firestore access needed Firebase Admin claims and ADMIN_KEY fallback logic. Indexing MusicXML tokens in Elasticsearch also needed a custom mapping.
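
The link-detection part can be sketched as follows. This is an assumption about how such a filter might look, not the project's Selenium code: it takes the href values already pulled from a page (for example from anchor elements) and keeps those whose URL path ends in one of the four extensions. The function name find_music_links and the example URLs are made up.

```python
import posixpath
from urllib.parse import urljoin, urlparse

MUSIC_EXTENSIONS = {".mid", ".midi", ".xml", ".musicxml"}


def find_music_links(page_url, hrefs):
    """Return absolute, de-duplicated links that point at music files."""
    found = []
    seen = set()
    for href in hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        path = urlparse(absolute).path      # ignore query string and fragment
        ext = posixpath.splitext(path)[1].lower()
        if ext in MUSIC_EXTENSIONS and absolute not in seen:
            seen.add(absolute)
            found.append(absolute)
    return found


links = find_music_links(
    "https://example.com/songs/",
    ["tune.MID?download=1", "/files/score.musicxml", "about.html", "tune.MID?download=1"],
)
print(links)
```

Checking the parsed path instead of the raw string handles uppercase extensions and download query parameters. A plain .xml link can also be a sitemap or feed, so a real pipeline would likely need to inspect the file contents before trusting it.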

Accomplishments that we're proud of

We built an end-to-end pipeline where scraping → MIDI/MusicXML conversion → quantization → AI arrangement → critic scoring → OSMD rendering → user measure editing → AI revisions → playback all worked together. Editing sheet music in the browser with measure selection plus AI revision is something we have rarely seen elsewhere. The ML critic + RL-like improvement loop improved arrangements automatically. Firestore + Elasticsearch made revisions searchable, and the playback felt realistic thanks to Tone.js + SoundFontPlayer + Magenta models.
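
The shape of a critic-driven improvement loop can be shown with a toy example. This is a stand-in, not the project's ML critic: the score here is a single hand-written signal (rhythm density, in notes per beat) compared to an assumed target, and the revision step simply drops one note, where the real system would call an arrangement model. All names and the target value are hypothetical.

```python
def rhythm_density(events, ticks_per_beat=480):
    """Notes per beat, measured from tick 0 to the end of the last note."""
    if not events:
        return 0.0
    end = max(e["start"] + e["duration"] for e in events)
    beats = max(end / ticks_per_beat, 1e-9)
    return len(events) / beats


def critic_score(events, target_density=2.0):
    """Higher is better; 0 means the density matches the assumed target."""
    return -abs(rhythm_density(events) - target_density)


def thin_out(events):
    """Toy revision: remove the shortest note (first one on ties)."""
    if len(events) <= 1:
        return events
    shortest = min(events, key=lambda e: e["duration"])
    return [e for e in events if e is not shortest]


def improve(events, revise, score, max_rounds=10):
    """Keep applying revisions while the critic score strictly improves."""
    best, best_score = events, score(events)
    for _ in range(max_rounds):
        candidate = revise(best)
        candidate_score = score(candidate)
        if candidate_score <= best_score:
            break  # the critic no longer prefers the revision
        best, best_score = candidate, candidate_score
    return best, best_score


# Eight 16th notes packed into two beats: too busy for a beginner.
dense = [{"pitch": 60 + i, "start": i * 120, "duration": 120} for i in range(8)]
best, best_score = improve(dense, thin_out, critic_score)
print(len(best), rhythm_density(best))
```

The loop stops as soon as a revision fails to raise the score, so a weak revision step cannot make the arrangement worse than the best version seen so far.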

What we learned

We learned how messy real music data is, and how useful quantization + normalization pipelines are. We learned that LLMs are good at high-level reasoning but not at low-level MIDI fixes, so Magenta + mido were more reliable for structure. Elasticsearch was great for token search across MusicXML. Firebase custom claims made secure roles simple. And containerizing scraping + ML pipelines with Docker saved a lot of debugging time.

What’s next for Nikhertz

Future steps include GPU-backed arrangement servers for faster, higher-quality generations, more ML critic metrics (like harmonic tension), user-shared arrangement libraries, better human-in-the-loop editors, fully cloud-hosted pipelines, DMCA/takedown workflows, and production-grade observability using Elastic + OpenTelemetry. We may also support more instrument sets, score-style presets, API access for educational tools, and better dataset imports.
