Inspiration

My inspiration comes from my experience as a piano teacher. I often want my students to practice improvisation so that they understand music theory deeply, rather than just reading notes.

However, I encountered a paradox:

"Traditional textbooks have fixed scores. Once a student practices them enough to memorize the pattern, 'improvisation' becomes 'recitation'."

The essence of improvisation is reacting to the unknown. I realized that to truly teach improvisation, I needed an infinite source of new, structured, yet unpredictable musical exercises. That's when I thought: Why not use AI to generate fresh staff notation on the fly?

By combining Generative AI with music theory, we can create a "sight-reading engine" that never runs out of material, keeping the creative spark alive.

What it does

AIMPRO is an AI-powered music education platform that generates unique piano exercises and renders them into playable sheet music instantly.

  1. Text-to-Music Generation: Users can describe a specific musical goal (e.g., "A C Major chord progression with a jazz rhythm").
  2. Instant Rendering: The AI outputs ABC notation, which is immediately compiled into visual sheet music (Staff Notation) in the browser.
  3. Audio Playback: High-quality SoundFonts play the generated music, allowing students to hear the goal before playing.
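
For illustration, a generated exercise might look like the following ABC notation (a hand-written sample in the same format, not actual model output), which abcjs compiles directly into staff notation:

```
X:1
T:C Major Jazz Etude
M:4/4
L:1/8
K:C
V:1
"C" C2 E2 G2 E2 | "F" A2 c2 A2 F2 | "G7" B,2 D2 F2 D2 | "C" C8 |]
```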

How we built it

We built a modern web application leveraging Google's AI capabilities:

  • AI Model: We used Google Gemini 3 Flash (called from geminiABC.js through a Django backend) as our core composition engine, engineering prompts to output strict ABC notation syntax.
  • Frontend: Built with Vanilla JavaScript and modern CSS (Tailwind-style utility classes) for a lightweight, fast user experience.
  • Music Rendering: We integrated abcjs to parse the AI's text output and render it as standard SVG sheet music.
  • Audio Synthesis: We used Web Audio API with SoundFonts (located in statics/soundfont/MusyngKite) to provide realistic instrument sounds, far superior to standard MIDI beeps.
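
As a sketch of the glue between the backend and abcjs (the helper name below is illustrative; the real logic lives in geminiABC.js and abcParser.js), the model's reply is sanitized before rendering so that only raw ABC text reaches the renderer:

```javascript
// Strip stray Markdown fences and surrounding chatter so only raw
// ABC notation reaches the renderer. Illustrative helper, not the
// project's actual code.
function extractAbc(modelReply) {
  // Remove ```abc ... ``` fences if the model added them anyway.
  const fenced = modelReply.match(/`{3}(?:abc)?\s*([\s\S]*?)`{3}/);
  const body = fenced ? fenced[1] : modelReply;
  // ABC tunes start at the X: reference-number header.
  const start = body.indexOf("X:");
  return start >= 0 ? body.slice(start).trim() : body.trim();
}

// In the browser, the sanitized string is then rendered, e.g.:
//   ABCJS.renderAbc("paper", extractAbc(reply));
```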

Challenges we ran into

1. Prompt Engineering for Strict Musical Syntax

Getting an LLM to "be creative" is easy; getting it to "be creative within strict syntax rules" is a hard engineering problem.

  • The Challenge: Large Language Models are probabilistic. Early in development, the model would often "hallucinate" invalid ABC headers, mix natural language with code, or lose track of time signatures (e.g., generating 5 beats in a 4/4 bar).
  • The Solution: We moved beyond simple instructions to a Structured XML Prompting strategy. We designed a rigorous system prompt using tags like <role>, <critical_rules>, and <composition_logic> to enforce a "Strict Composer Mode." This approach explicitly defines:
    • Syntax Constraints: Mandating ABC v2.1 standards (e.g., V:1 for melody, V:2 for accompaniment, strictly aligned measures).
    • Output Purity: Enforcing a "Code-Only" policy (no Markdown, no natural language) via <output_constraint> tags.
    • Logic Chain: Guiding the AI through specific composition steps, from style analysis to MIDI instrument assignment (%%MIDI program), ensuring the output is not just musically valid but also machine-parsable.
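
The prompt skeleton can be sketched roughly like this (tag names follow the ones above; the wording is a simplified stand-in for the real system prompt):

```javascript
// Simplified sketch of the structured XML system prompt described
// above; the production prompt is more detailed.
function buildSystemPrompt(userRequest) {
  return [
    "<role>You are a strict composer. Output ABC notation only.</role>",
    "<critical_rules>",
    "  - Follow the ABC v2.1 standard.",
    "  - Use V:1 for melody and V:2 for accompaniment.",
    "  - Every measure must sum exactly to the time signature.",
    "</critical_rules>",
    "<composition_logic>",
    "  1. Analyze the requested style.",
    "  2. Choose key, meter, and chord progression.",
    "  3. Assign instruments with %%MIDI program directives.",
    "</composition_logic>",
    "<output_constraint>No Markdown, no natural language.</output_constraint>",
    `<task>${userRequest}</task>`,
  ].join("\n");
}
```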

2. Strict ABC Validation & Formatting

Raw ABC notation generated by LLMs often contains subtle timing errors or syntax ambiguities that cause abcjs (the rendering library) to fail or render incorrectly.

  • The Solution: We built a robust ABC Parser middleware (abcParser.js) that sanitizes the output before rendering:
    • Beat Alignment Engine: A custom fixAbcAlignment algorithm calculates the exact duration of every measure. If a measure is mathematically incomplete (e.g., missing a 1/8th note in 4/4 time), it intelligently pads it with invisible rests to prevent layout shifts.
    • Tuplet Math: It handles complex tuplet logic, such as the (3:2:3 tuplet marker, to ensure note spacing is mathematically precise.
    • MIDI Channel Assignment: To support polyphony playback, we implemented assignMidiChannels to automatically route voices to distinct MIDI channels (skipping the percussion channel 10), ensuring rich, multi-layered audio.
    • Accidental Fix: The fixAbcAccidentals function explicitly marks accidentals for every affected note within a measure, solving edge cases where the audio engine might "forget" a key signature change.
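
The core idea of the Beat Alignment Engine can be sketched as follows (a heavily simplified illustration; the real fixAbcAlignment handles chords, ties, tuplets, and grace notes as well):

```javascript
// Stripped-down illustration of measure padding: count note lengths
// in units of L: (the unit note length) and pad short measures with
// invisible rests ("x" in ABC) so the layout stays aligned.
// (A full implementation would also emit fractional rests such as x/2;
// this sketch assumes whole-unit gaps.)
function padMeasure(measure, unitsPerBar) {
  // Match a pitch (optional accidental and octave marks) followed by
  // an optional length like 2, /2, or 3/2. Rests (z, x) count too.
  const tokens = measure.match(/[_^=]?[A-Ga-gzx][,']*\d*(?:\/\d*)?/g) || [];
  let total = 0;
  for (const t of tokens) {
    const len = t.match(/(\d+)?(\/(\d*))?$/);
    const num = len[1] ? parseInt(len[1], 10) : 1;
    const den = len[2] ? (len[3] ? parseInt(len[3], 10) : 2) : 1;
    total += num / den;
  }
  const missing = unitsPerBar - total;
  // Pad a mathematically incomplete bar with an invisible rest.
  return missing > 0 ? `${measure} x${missing}` : measure;
}
```

For example, with L:1/8 and M:4/4 a bar holds 8 units, so a 6-unit measure gets an invisible 2-unit rest appended.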

What we learned

  • Music as Code: We learned the intricacies of ABC notation syntax, which bridges the gap between human-readable music and machine-parsable text.
  • Prompt Engineering for Structure: Constraining a probabilistic model to rigid syntax rules proved to be the real engineering problem, far harder than simply asking it to be creative.
  • Full-Stack Integration: Connecting a stateless AI API with a stateful, interactive music player taught us a lot about asynchronous JavaScript and state management.

Built With

  • JavaScript (Vanilla)
  • Django
  • Google Gemini
  • abcjs
  • Web Audio API
