DartBeats - Project Story

Inspiration

DartBeats originated from a gap in the current AI music tooling landscape.

One of our teammates, Jason, produces music and found that most AI music products operate as end-to-end generative systems, outputting fully rendered audio with minimal post-hoc control. This paradigm conflicts with how producers actually work: music production is inherently iterative, non-linear, and granular.

We identified a missing abstraction layer:

AI should not be a black-box generator—it should be a first-class operator within the DAW pipeline.

DartBeats was built to embed AI directly into the production graph, enabling fine-grained manipulation rather than full-track replacement.


What it does

DartBeats is a unified AI-native DAW that co-locates audio processing, MIDI editing, and AI-driven transformations within a single execution environment.

Core capabilities include:

  • Audio clip manipulation on a timeline-based sequencer
  • MIDI note editing via a piano roll interface
  • Deterministic transformations:
    • Quantization
    • Velocity scaling
    • Temporal shifting
    • Pitch transposition
  • Natural language → tool execution via an AI agent layer

Instead of manual operations, users can issue high-level commands:

“Quantize drums to 1/16 and boost velocity by 20%”
“Transpose melody up a perfect fifth”

These commands are compiled into structured operations over the underlying representations.
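As a rough illustration, the example commands above could compile into a plan of structured operations. This is a minimal sketch; the field names (`tool`, `target`, `params`) are hypothetical and not taken from the actual DartBeats schema.

```python
from dataclasses import dataclass, field

# Hypothetical compiled form of
# "Quantize drums to 1/16 and boost velocity by 20%".
@dataclass
class ToolCall:
    tool: str                      # which deterministic operation to run
    target: str                    # which track/clip it applies to
    params: dict = field(default_factory=dict)

plan = [
    ToolCall(tool="quantize", target="drums", params={"grid": "1/16"}),
    ToolCall(tool="adjust_velocity", target="drums", params={"scale": 1.2}),
]
```

Compiling to a plan like this keeps the AI's output auditable: each step can be validated and executed deterministically.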

Internally, DartBeats models music as:

  • MIDI: discrete event sequences $\{(t_i, p_i, v_i)\}$
  • Audio: sampled signals $x(t)$
  • Transformations: composable operators $T$

Such that: $$ x_{\text{out}}(t) = T_n \circ T_{n-1} \circ \dots \circ T_1 (x_{\text{in}}(t)) $$
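The operator composition above can be sketched over MIDI-style note tuples. This is an illustrative toy, assuming each transformation is a pure function over $(t, p, v)$ events; the real operators run in the DSP/MIDI backend.

```python
from functools import reduce

# Each note is (onset_beats, pitch, velocity).
def quantize(grid):
    """Snap onsets to the nearest multiple of `grid` (in beats)."""
    def op(notes):
        return [(round(t / grid) * grid, p, v) for t, p, v in notes]
    return op

def transpose(semitones):
    """Shift every pitch by a fixed number of semitones."""
    def op(notes):
        return [(t, p + semitones, v) for t, p, v in notes]
    return op

def compose(*ops):
    # T_n ∘ ... ∘ T_1 applied to the input: the first-listed op runs first.
    return lambda notes: reduce(lambda acc, op: op(acc), ops, notes)

pipeline = compose(quantize(0.25), transpose(7))   # 1/16 grid, up a fifth
notes = [(0.23, 60, 90), (0.52, 64, 80)]
result = pipeline(notes)
# → [(0.25, 67, 90), (0.5, 71, 80)]
```

Because each operator is a plain function, chains of edits remain order-explicit and easy to replay or undo.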

Additionally, DartBeats integrates:

  • External generative APIs (Beatoven, Mubert)
  • Cloud persistence
  • Full project state serialization

This creates a closed-loop system from ideation → generation → editing → storage.


How we built it

We designed DartBeats as a vertically integrated system, rather than loosely coupled services.

Frontend

  • Next.js + React + TypeScript
  • Dual-pane architecture:
    • Timeline (audio domain)
    • Piano roll (symbolic/MIDI domain)
  • Tailwind CSS + shadcn/ui for composable UI primitives
  • Web Audio API for low-latency playback and scheduling

Backend

  • FastAPI (Python) as the orchestration layer
  • DSP + analysis stack:
    • librosa (feature extraction, slicing)
    • pedalboard (effects chain)
    • soundfile (I/O)
  • Supabase:
    • Auth (JWT-based)
    • Object storage (audio/MIDI assets)
    • Relational persistence (project state)

AI Layer

  • Vercel AI SDK + OpenRouter
  • Tool-calling architecture:
    • LLM → structured tool invocation → deterministic execution
  • Function registry for operations:

    • quantize()
    • transpose()
    • adjust_velocity()
    • etc.
  • Integrated generative providers:

    • Beatoven
    • Mubert

Key principle:

AI is not in the loop—it is the control plane.
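The function registry described above might look roughly like this. It is a minimal sketch, assuming LLM tool calls arrive as `{"name": ..., "arguments": {...}}` dicts; the decorator name and signatures are illustrative, not the actual DartBeats API.

```python
# Registry mapping tool names (as emitted by the LLM layer)
# to deterministic operations over (onset, pitch, velocity) notes.
TOOLS = {}

def tool(fn):
    """Register a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def transpose(notes, semitones):
    return [(t, p + semitones, v) for t, p, v in notes]

@tool
def adjust_velocity(notes, scale):
    # Clamp to the MIDI velocity range after scaling.
    return [(t, p, min(127, round(v * scale))) for t, p, v in notes]

def execute(call, notes):
    """Dispatch one structured tool call to its registered implementation."""
    return TOOLS[call["name"]](notes, **call["arguments"])

notes = [(0.0, 60, 100)]
out = execute({"name": "adjust_velocity", "arguments": {"scale": 1.2}}, notes)
# → [(0.0, 60, 120)]
```

The LLM never touches note data directly; it only selects registered, deterministic operations, which is what makes it a control plane rather than a generator.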


Challenges we ran into

Teaching AI music

LLMs are not inherently aware of DAW semantics.

We had to:

  • Define a formal tool schema
  • Constrain outputs into valid operations
  • Map natural language → structured transformations

This effectively required building a domain-specific interface layer between language and music operations.
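To make the idea of constraining outputs concrete, here is a stdlib-only sketch of validating a tool call against an allowed schema. The tool names, parameter names, and grid values are illustrative assumptions, not the production schema.

```python
from enum import Enum

# Only grids the sequencer actually supports are valid.
class Grid(str, Enum):
    EIGHTH = "1/8"
    SIXTEENTH = "1/16"

# Per-tool parameter schema: name -> expected type (or Enum of valid values).
ALLOWED = {
    "quantize": {"grid": Grid},
    "transpose": {"semitones": int},
}

def validate(call):
    """Reject any LLM output that is not a well-formed, known operation."""
    spec = ALLOWED.get(call["name"])
    if spec is None:
        raise ValueError(f"unknown tool: {call['name']}")
    for key, expected in spec.items():
        value = call["arguments"][key]
        if issubclass(expected, Enum):
            expected(value)          # raises ValueError on invalid member
        elif not isinstance(value, expected):
            raise TypeError(f"{key} must be {expected.__name__}")
    return call

validate({"name": "quantize", "arguments": {"grid": "1/16"}})   # passes
```

Validation at this boundary is what turns free-form language output into a safe, executable operation.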

Designing a DAW from scratch

Modern DAWs like Ableton Live and FL Studio are deeply optimized systems with decades of iteration.

We had to:

  • Identify minimal viable abstractions
  • Reconstruct core primitives (tracks, clips, transport)
  • Avoid overbuilding while preserving usability

UI/UX balance

We needed to maintain:

  • High information density (for power users)
  • Low cognitive overhead (for usability)
  • Seamless AI integration

Embedding AI without disrupting workflow required careful interface design.

Systems complexity

  • Synchronizing multi-domain data (audio + MIDI + AI ops)
  • Managing playback scheduling under browser constraints
  • Ensuring consistency across persisted and in-memory state

Accomplishments we’re proud of

  • Building a functional AI-powered DAW from scratch
    We implemented a multi-domain system that unifies audio DSP, symbolic MIDI editing, and AI-driven transformations. Achieving coherent interaction between these layers required designing our own internal abstractions for timing, state, and transformations.

  • Creating something we would genuinely use ourselves
    We optimized for real workflows rather than demo features. This led to a system that supports iterative editing, rapid prototyping, and meaningful interaction with musical structure.

  • Successfully collaborating as a team for the first time
    We operated in a distributed development model using Git, managing concurrent feature development, resolving merge conflicts, and aligning on shared architecture under tight time constraints.

  • Pushing ourselves as non-CS majors
    We rapidly onboarded new technologies across the stack (frontend frameworks, backend APIs, AI tooling, cloud infra) and applied them in a cohesive system, demonstrating strong adaptability and systems-level thinking.

For Michael, this was his first collaborative coding project, making the transition from individual work to coordinated system-building particularly impactful.


What we learned

Product & Design

  • Audio and MIDI should exist within a shared interaction model
    Decoupling them introduces unnecessary friction; unification enables fluid workflows.

  • AI is most effective when bound to deterministic operators
    Constraining AI to executable actions improves reliability and user trust.

  • UX is a first-order concern
    System power is irrelevant if not accessible through intuitive interfaces.

Engineering

  • A unified timeline abstraction is critical: $$ t_{\text{global}} = t_{\text{audio}} = t_{\text{MIDI}} $$ This ensures consistency across all domains and prevents desynchronization.

  • Persistence is core infrastructure
    Without reliable state storage, the system cannot support real workflows.

  • Browser audio introduces hard constraints
    Latency, scheduling, and user gesture requirements must be explicitly handled.
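The unified timeline abstraction above reduces, at its simplest, to one tempo-aware mapping between beats and seconds that both the audio and MIDI domains share. A minimal sketch (function names are illustrative):

```python
# One global clock: audio sample positions and MIDI beat positions
# are both derived from the same beat <-> seconds mapping.
def beats_to_seconds(beats: float, bpm: float) -> float:
    return beats * 60.0 / bpm

def seconds_to_beats(seconds: float, bpm: float) -> float:
    return seconds * bpm / 60.0

# At 120 BPM, one beat lasts 0.5 s, so beat 4 falls at 2.0 s.
assert beats_to_seconds(4, 120) == 2.0
```

Routing every domain through one conversion point is what prevents audio and MIDI from drifting apart when the tempo changes.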

Systems & Dev

  • Supabase provides strong primitives but requires careful configuration
    RLS policies, JWT validation, and storage rules are non-trivial to implement correctly.

  • Developer experience matters
    Clean environment configs and reproducible setups reduce onboarding friction.

  • Decoupling processing from persistence improves iteration speed
    Lightweight execution paths enable faster testing and debugging.

Teamwork

  • Version control is foundational for parallel development
  • Clear ownership and delegation accelerate progress
  • Collaborative system design requires constant alignment and communication

What’s next for DartBeats

We plan to evolve DartBeats from a prototype into a production-grade system.

Key directions:

  • Expand DAW feature set (automation, effects chains, advanced sequencing)
  • Improve AI reliability via tighter tool constraints and better prompt engineering
  • Validate with real users (producers) and iterate based on feedback
  • Explore commercialization and accelerator pathways
