About Vibe Tracker
What Inspired Me
I've always been drawn to music creation, particularly the raw, chaotic energy of chiptune and primitive electronic sounds. A few years ago, I was experimenting with MilkyTracker and learning DAW basics, but I kept hitting the same wall: by the time I figured out how to translate a musical idea into actual sound, the inspiration had already faded away.
This frustration stayed with me as life got busier and music creation took a backseat. I wanted something more immediate - a way to capture musical ideas before they slipped away, without getting bogged down in technical complexity.
When powerful AI models like GPT-OSS became more accessible with generous free tiers, I saw an opportunity. I'd been experimenting with these models for small tasks and wondered: could AI handle the creative challenge of generating tracker-style music from simple text descriptions?
What I Learned
Building Vibe Tracker taught me several key lessons:
- Real-time audio programming is incredibly demanding - every millisecond matters when processing audio buffers
- AI prompt engineering for structured output requires a careful balance between creativity and consistency
- Vectorized audio processing can achieve 5-8x performance improvements over naive implementations
- Multiple AI provider integration provides crucial reliability through automatic fallbacks
- Terminal UIs can be surprisingly powerful for creative applications when designed thoughtfully
How I Built It
The project evolved through several key phases:
1. Core Audio Engine
- Built a real-time audio sequencer using Python's sounddevice library
- Implemented vectorized synthesis for sine, square, sawtooth, and triangle waveforms
- Created an ADSR envelope system for natural-sounding notes
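The ADSR idea can be sketched in a few lines of NumPy: four piecewise segments (attack, decay, sustain, release) concatenated into one amplitude array and multiplied against the raw waveform. Function name and default parameters here are illustrative, not the project's actual API:

```python
import numpy as np

def adsr_envelope(n_samples, sr, attack=0.01, decay=0.05, sustain=0.7, release=0.1):
    """Build an ADSR amplitude envelope as a single NumPy array."""
    a = int(attack * sr)
    d = int(decay * sr)
    r = int(release * sr)
    s = max(n_samples - a - d - r, 0)
    env = np.concatenate([
        np.linspace(0.0, 1.0, a, endpoint=False),      # attack ramp up
        np.linspace(1.0, sustain, d, endpoint=False),  # decay to sustain level
        np.full(s, sustain),                           # sustain plateau
        np.linspace(sustain, 0.0, r),                  # release ramp down
    ])
    return env[:n_samples]

# Shape a one-second 440 Hz sine into a natural-sounding note
sr = 44100
t = np.arange(sr) / sr
note = np.sin(2 * np.pi * 440.0 * t) * adsr_envelope(len(t), sr)
```

Because the envelope is built as one array, applying it stays a single vectorized multiply, which fits the real-time constraints described later.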
2. AI Integration
- Developed a multi-provider system supporting both Hugging Face GPT-OSS and Google Gemini
- Engineered prompts to generate structured JSON for instruments, patterns, and effects
- Implemented automatic provider fallback for reliability
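The fallback logic amounts to trying providers in order until one returns usable JSON. A minimal sketch, assuming a hypothetical provider interface of (name, callable) pairs; the schema check shown is illustrative:

```python
import json

class ProviderError(Exception):
    pass

def generate_pattern(prompt, providers):
    """Try each provider in order; return the first valid JSON response.

    `providers` is a list of (name, call_fn) pairs, where call_fn takes
    the prompt text and returns a raw string response.
    """
    errors = []
    for name, call in providers:
        try:
            raw = call(prompt)
            data = json.loads(raw)          # must be valid JSON
            if "patterns" not in data:      # minimal schema sanity check
                raise ValueError("missing 'patterns' key")
            return data
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))

# If the first provider returns garbage, the second one is tried automatically
providers = [
    ("gpt-oss", lambda p: "not json at all"),
    ("gemini", lambda p: '{"patterns": []}'),
]
result = generate_pattern("calm chiptune loop", providers)
```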
3. Effects System
- Built instrument-level reverb effects with configurable parameters
- Optimized processing to under 2ms per audio buffer for real-time performance
- Integrated effects generation into the AI workflow
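The writeup doesn't detail the reverb algorithm, but a single feedback comb filter (the classic building block of Schroeder-style reverbs) gives the flavor. Parameter names and defaults are illustrative, and a real-time implementation would replace the Python loop with a vectorized or C-level filter:

```python
import numpy as np

def comb_reverb(signal, sr, delay_ms=50.0, feedback=0.5, mix=0.3):
    """Apply a single feedback comb filter and blend it with the dry signal.

    Each output sample feeds a decayed copy of itself back `delay` samples
    later, producing a train of fading echoes.
    """
    delay = int(sr * delay_ms / 1000.0)
    wet = np.copy(signal).astype(float)
    for i in range(delay, len(wet)):
        wet[i] += feedback * wet[i - delay]
    return (1.0 - mix) * signal + mix * wet
```

A full reverb would sum several comb filters with mutually prime delays and pass the result through allpass stages, but the feedback/mix parameters above already map naturally onto AI-configurable knobs.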
4. User Interface
- Created a terminal-based interface using Textual framework
- Implemented real-time pattern visualization and instrument management
- Added keyboard shortcuts for common operations (play/pause, save, export)
Challenges I Faced
Performance Optimization
The biggest challenge was achieving real-time audio performance. Initial implementations had severe lag due to inefficient sample generation and debug logging in audio callbacks. I solved this through:
- Replacing per-sample loops with numpy vectorized operations
- Eliminating all logging from real-time audio paths
- Implementing efficient memory management with buffer reuse
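The loop-versus-vectorized difference looks like this in miniature (a toy comparison, not the project's actual synthesis code): both functions produce identical buffers, but the second does it in one NumPy call instead of one Python-level iteration per sample.

```python
import numpy as np

SR = 44100

def sine_slow(freq, n):
    # Naive: one interpreted Python iteration per sample.
    out = np.empty(n)
    for i in range(n):
        out[i] = np.sin(2 * np.pi * freq * i / SR)
    return out

def sine_fast(freq, n):
    # Vectorized: the whole buffer is computed in compiled C code.
    t = np.arange(n) / SR
    return np.sin(2 * np.pi * freq * t)
```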
AI Consistency
Getting AI models to generate valid, musically coherent JSON was tricky. Providers also had different failure modes: GPT-OSS, for example, produced bland, uncreative results when the system prompt was overloaded and the query too vague. I addressed this with:
- Carefully crafted system prompts with examples
- Robust JSON parsing with error recovery
- Provider-specific optimizations
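"Robust JSON parsing with error recovery" here means assuming the model may wrap its JSON in prose or markdown fences. A sketch of one recovery strategy (the function name and the exact heuristics are illustrative, not the project's actual parser):

```python
import json
import re

def parse_model_json(raw):
    """Recover a JSON object from model output that may include extra text.

    Strategy: try a direct parse, then strip markdown code fences,
    then fall back to the widest {...} span found in the text.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Strip ```json ... ``` fences if present
    fenced = re.search(r"```(?:json)?\s*(.*?)```", raw, re.DOTALL)
    if fenced:
        try:
            return json.loads(fenced.group(1))
        except json.JSONDecodeError:
            pass
    # Last resort: widest brace-delimited span
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        return json.loads(raw[start:end + 1])
    raise ValueError("no JSON object found in model output")
```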
Multi-track Synchronization
Handling multiple instruments playing simultaneously without timing issues required:
- Precise event scheduling using frame-accurate timing
- Careful management of note-on/off events to prevent accumulation
- Thread-safe communication between UI and audio threads
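Frame-accurate timing boils down to converting beat positions into sample-frame offsets relative to the current audio buffer, so every track lines up on the same sample grid. A simplified sketch, where the event format and names are assumptions rather than the project's actual data model:

```python
SR = 44100

def schedule_events(events, bpm, buffer_start_frame, buffer_len):
    """Map beat-positioned note events to frame offsets within one buffer.

    `events` is a list of (beat, note) pairs. Returns (offset, note) for
    events whose absolute frame lands inside this buffer; everything else
    is ignored until its buffer comes around.
    """
    frames_per_beat = SR * 60.0 / bpm
    scheduled = []
    for beat, note in events:
        frame = int(round(beat * frames_per_beat))  # frame-accurate position
        if buffer_start_frame <= frame < buffer_start_frame + buffer_len:
            scheduled.append((frame - buffer_start_frame, note))
    return scheduled
```

Because every instrument is scheduled against the same frame counter, multiple tracks stay sample-locked no matter how buffers are sized.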
Audio Quality
Achieving clean, professional-sounding output meant:
- Implementing anti-aliasing for waveform generation using PolyBLEP
- Adding proper filtering and effects processing
- Balancing multiple audio sources without clipping
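PolyBLEP works by adding a small polynomial correction around each phase wrap, cancelling most of the aliasing that a naive discontinuity produces. A scalar sketch of the standard formulation applied to a sawtooth; the project's version would be vectorized for real-time use:

```python
import numpy as np

def polyblep(t, dt):
    """Polynomial band-limited step correction near a discontinuity.

    t is the normalized phase in [0, 1); dt is the per-sample phase increment.
    Returns a nonzero correction only within one sample of the phase wrap.
    """
    if t < dt:                  # just after the wrap
        t /= dt
        return t + t - t * t - 1.0
    if t > 1.0 - dt:            # just before the wrap
        t = (t - 1.0) / dt
        return t * t + t + t + 1.0
    return 0.0

def saw_polyblep(freq, n, sr=44100):
    """Sawtooth with the PolyBLEP correction subtracted at each phase wrap."""
    dt = freq / sr
    out = np.empty(n)
    phase = 0.0
    for i in range(n):
        out[i] = 2.0 * phase - 1.0 - polyblep(phase, dt)
        phase += dt
        if phase >= 1.0:
            phase -= 1.0
    return out
```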
The result is a tool that solves my original problem: capturing musical inspiration immediately through natural language, without technical barriers getting in the way.