OpenBeats

Generative beats with OpenUI and Tone.
Daniel Bank


Same trick, twice

In OpenBeats, an LLM composes control surfaces (OpenUI / OpenUI Lang) over a react-music-style audio engine (Tone.js). It is the same reconciler story told twice: once for sound, once for widgets the model invents.


Two layers, one store

Two decoupled layers, one shared store:

  1. Audio engine (src/audio/) — JSX describes the graph (<Song>, <Sequencer>, <Track>, <Synth>, <Sampler>). Components wire Web Audio / Tone; they are not DOM layout.
  2. Generative UI — a library of controls the model emits as streamed OpenUI Lang; <Renderer> turns spec into live knobs, grids, and transport.

The store sits between them. Neither layer imports the other.
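
As a rough illustration of that decoupling, here is a minimal sketch of a shared store with one writer (a generated control) and one reader (the audio side). The project uses Zustand; this hand-rolled store is only a dependency-free stand-in, and the state fields and function names (BeatState, createBeatStore, onTempoKnobChange) are assumptions for illustration, not names from the repo.

```typescript
// Hypothetical store shape; field names are illustrative, not taken from the repo.
type BeatState = {
  playing: boolean;
  tempo: number;
  patterns: Record<string, number[]>; // track name -> active step indices
};

type Listener = (state: BeatState) => void;

// Tiny hand-rolled store standing in for Zustand (getState / setState / subscribe).
function createBeatStore(initial: BeatState) {
  let state = initial;
  const listeners = new Set<Listener>();
  return {
    getState: (): BeatState => state,
    setState: (patch: Partial<BeatState>): void => {
      state = { ...state, ...patch };
      listeners.forEach((listener) => listener(state));
    },
    subscribe: (listener: Listener): (() => void) => {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

const store = createBeatStore({
  playing: false,
  tempo: 120,
  patterns: { Kick: [0, 4, 8, 12] },
});

// "Audio layer": reacts to state changes and knows nothing about widgets.
const heardTempos: number[] = [];
store.subscribe((s) => heardTempos.push(s.tempo));

// "UI layer": a generated knob only writes to the store.
function onTempoKnobChange(bpm: number): void {
  store.setState({ tempo: bpm });
}

onTempoKnobChange(132);
```

Neither side references the other; both depend only on the store's interface, which is the property the two-layer split relies on.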


Motivation

OpenUI lets us generate React UI from natural language, quickly and cheaply. But React was never really about the DOM. So here's the fun question: what if we pointed all that generative power at experiences that aren't UI at all?

We already know React can do this. Ken Wheeler and react-music (Formidable) showed the core idea: use React's component model not to paint pixels, but to declaratively describe audio. JSX like <Song>, <Sequencer>, <Synth>, <Note> wires an audio graph (Web Audio / Tone) that actually plays. React is a reconciler for a tree of things — and those things do not have to be HTML.


Prior art

Ken Wheeler made this concrete in several talks:

Talk | Why it matters here
Making Beats with React | Live beatmaking; hooks + Tone.js drum machine in the browser.
Mixed Mode React (React Amsterdam '18) | Components can target non-DOM APIs (audio, hardware) and mix into normal apps.
react-amsterdam-demos | The on-stage source for the experiments above.

React is renderer-agnostic

It is the same insight behind React Native, react-three-fiber, and Ink (React for terminals): a stable way to manage a tree. The host can be the screen, the GPU, the CLI, or air pressure over a speaker.

OpenBeats asks: if the audio tree is already "non-DOM React," what if the control panel is also generated UI — so the model and the human share one language (components), split across sound and controls?


Architecture: audio

Components own Tone nodes via effects/refs; many render no DOM. The UI reads step position and patterns from the same model the engine plays.

<Song tempo={120}>
  <Sequencer resolution={16} bars={1}>
    <Track name="Kick" pattern={[0, 4, 8, 12]}>
      <Synth type="kick" />
    </Track>
    <Track name="Snare" pattern={[4, 12]}>
      <Synth type="snare" />
    </Track>
    <Track name="Hat" pattern={[0, 2, 4, 6, 8, 10, 12, 14]}>
      <Synth type="hat" />
    </Track>
  </Sequencer>
</Song>
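
To make the pattern arrays concrete, the sketch below converts step indices into start times within one bar. It assumes 4/4 time and that resolution means steps per bar (so 16 gives sixteenth notes); neither assumption is stated in the source, and the helper names are hypothetical.

```typescript
// Assumption: 4/4 time, and `resolution` is the number of steps in one bar.
function stepDurationSeconds(tempo: number, resolution: number): number {
  const secondsPerBeat = 60 / tempo;
  const beatsPerBar = 4;
  return (secondsPerBeat * beatsPerBar) / resolution;
}

// Map a pattern of step indices to start times (in seconds) within the bar.
function patternToTimes(pattern: number[], tempo: number, resolution: number): number[] {
  const step = stepDurationSeconds(tempo, resolution);
  return pattern.map((index) => index * step);
}

// Kick pattern from the example above, at 120 BPM with 16 steps per bar.
const kickTimes = patternToTimes([0, 4, 8, 12], 120, 16);
console.log(kickTimes); // [ 0, 0.5, 1, 1.5 ]
```

Under these assumptions each step lasts 0.125 s, so the kick lands on every quarter note.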

Architecture: audio — the components

  • <Song tempo>: Tone.Transport, AudioContext unlock (user gesture).
  • <Sequencer>: one sequence, fan-out (time, step) to tracks.
  • <Track>: fires children on pattern steps; velocity, samples, etc.
  • <Synth type>: synthesized drums.
  • <Sampler url>: Tone.Player when sampleUrl is set.
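
The fan-out from one sequence to many tracks could look roughly like the sketch below. It is not the engine's code: TrackSpec, createFanOut, and the trigger callbacks are hypothetical, and in the app the (time, step) callback would presumably be driven by a Tone.js sequence rather than called by hand.

```typescript
type StepHandler = (time: number, step: number) => void;

// Hypothetical track description: a name, its active steps, and what to fire.
type TrackSpec = {
  name: string;
  pattern: number[];
  trigger: (time: number) => void;
};

// Build a single (time, step) callback that forwards each step to every
// track whose pattern contains that step.
function createFanOut(tracks: TrackSpec[]): StepHandler {
  // Precompute a Set per track so each step is a constant-time lookup.
  const lookups = tracks.map((track) => ({ track, steps: new Set(track.pattern) }));
  return (time, step) => {
    for (const { track, steps } of lookups) {
      if (steps.has(step)) track.trigger(time);
    }
  };
}

const fired: string[] = [];
const onStep = createFanOut([
  { name: "Kick", pattern: [0, 4, 8, 12], trigger: (t) => fired.push(`Kick@${t}`) },
  { name: "Snare", pattern: [4, 12], trigger: (t) => fired.push(`Snare@${t}`) },
  { name: "Hat", pattern: [0, 2, 4, 6, 8, 10, 12, 14], trigger: (t) => fired.push(`Hat@${t}`) },
]);

// Simulate two ticks of the sequence: step 0 at 0 s, step 4 at 0.5 s.
onStep(0, 0);
onStep(0.5, 4);
```

Passing the scheduled time through to each trigger, rather than reading the clock inside it, keeps all tracks aligned to the same tick.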

TypeScript throughout; no any on public engine/store APIs.


Built on OpenUI

The generative layer is OpenUI all the way down — the actual APIs we import:

  • @openuidev/react-lang: defineComponent (register what the model may emit), createLibrary (assemble the set), <Renderer> (stream OpenUI Lang → live React), useIsStreaming.
  • @openuidev/react-ui (/genui-lib): openuiLibrary + openuiPromptOptions, a prebuilt control library and prompt config.
  • @openuidev/react-headless: the headless primitives react-ui is built on.
  • @openuidev/lang-core: PromptOptions and OpenUI Lang core types.

Our own controls (Rack, StepGrid, Knob, …) plug into this pipeline — they ride OpenUI's renderer, not a bespoke one.
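
The idea of registering what the model may emit can be pictured as an allowlist. The sketch below is a conceptual stand-in only: it does not use or reproduce the @openuidev/react-lang API, and every name in it (SpecNode, ControlDef, assembleLibrary, acceptNode) is hypothetical. It also assumes unregistered components are simply rejected, which the source does not confirm.

```typescript
// Hypothetical shapes; this is NOT the @openuidev/react-lang API.
type Props = Record<string, unknown>;
type SpecNode = { component: string; props: Props };
type ControlDef = { name: string; validate: (props: Props) => boolean };

// Assemble the set of controls the model is allowed to emit.
function assembleLibrary(defs: ControlDef[]): Map<string, ControlDef> {
  return new Map(defs.map((def) => [def.name, def] as const));
}

// Accept a node only if its component is registered and its props validate.
function acceptNode(library: Map<string, ControlDef>, node: SpecNode): boolean {
  const def = library.get(node.component);
  return def !== undefined && def.validate(node.props);
}

const library = assembleLibrary([
  {
    name: "Knob",
    validate: (p) => typeof p["min"] === "number" && typeof p["max"] === "number",
  },
  { name: "StepGrid", validate: (p) => typeof p["track"] === "string" },
]);

const knobAccepted = acceptNode(library, { component: "Knob", props: { min: 0, max: 1 } });
const unknownAccepted = acceptNode(library, { component: "IFrame", props: {} });
console.log(knobAccepted, unknownAccepted); // true false
```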


Architecture: generative UI

Generated controls use the same store as the hand-built sequencer: transport, patterns, and parameters (e.g. filters) all drive the live engine. The ⚡ demo layout (offline) option exercises the same renderer path without calling the model.
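
One way to picture "same renderer path, different source" is to treat both the model response and the canned layout as a stream of text chunks. This is a sketch under assumptions: the chunk contents are placeholders rather than real OpenUI Lang, and cannedStream, collectSpec, and pickSource are hypothetical names, not functions from the repo or from OpenUI.

```typescript
// Placeholder chunks; real OpenUI Lang would go here.
const CANNED_SPEC: string[] = ["<canned layout, part 1> ", "<canned layout, part 2>"];

// Offline source: replays a stored spec chunk by chunk.
async function* cannedStream(chunks: string[]): AsyncGenerator<string> {
  for (const chunk of chunks) yield chunk;
}

// Shared consumer: accumulates chunks and reports each partial spec,
// which is where a streaming renderer would re-render.
async function collectSpec(
  source: AsyncIterable<string>,
  onPartial: (partial: string) => void,
): Promise<string> {
  let spec = "";
  for await (const chunk of source) {
    spec += chunk;
    onPartial(spec);
  }
  return spec;
}

// Only the source differs between online and offline; the consumer does not.
function pickSource(
  offline: boolean,
  fetchFromModel: () => AsyncIterable<string>,
): AsyncIterable<string> {
  return offline ? cannedStream(CANNED_SPEC) : fetchFromModel();
}
```

Because the offline branch never calls fetchFromModel, the demo layout works without keys or quota while still exercising the consumer end to end.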


Demo script

  1. Open the app, click Play (required browser audio gesture).
  2. Show the step grid + transport — human-edited pattern drives the same store the engine hears.
  3. Prompt: describe the panel you want or tap an example; watch OpenUI Lang stream into live controls.
  4. Toggle offline demo if keys/quota are unavailable — same pipeline, canned spec.
  5. Optional: Harmonic Scope — offline-rendered loop, Fourier partials, honest Gibbs ringing on drum mixes.

Stack

  • Next.js 16, React 19, Tailwind 4, TypeScript.
  • OpenUI (@openuidev/*) — OpenUI Lang, headless primitives, renderer; bring your own model key.
  • Tone.js for audio.
  • Zustand for the shared store.
  • Model: OpenAI-compatible SDK in src/app/api/generate/route.ts; optional OPENAI_BASE_URL for proxies or other OpenAI-compatible endpoints.
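
The environment variables for the model route can be read with a small helper like the one below. This is a sketch, not the contents of src/app/api/generate/route.ts: readModelConfig and ModelConfig are hypothetical names, and the fallback model is the default mentioned in the troubleshooting notes.

```typescript
type ModelConfig = {
  apiKey: string;
  baseURL: string | undefined; // unset means the SDK's default endpoint
  model: string;
};

// Read model settings from an env-like record (e.g. process.env).
function readModelConfig(env: Record<string, string | undefined>): ModelConfig {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY is not set");
  }
  return {
    apiKey,
    // Treat an empty string the same as unset.
    baseURL: env["OPENAI_BASE_URL"] || undefined,
    model: env["OPENAI_MODEL"] || "gpt-5.2",
  };
}

const config = readModelConfig({ OPENAI_API_KEY: "sk-example" });
console.log(config.model); // gpt-5.2
```

Failing fast on a missing key gives a clearer error than a rejected request later in the route.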

Setup (for judges / cloning)

cp .env.example .env   # add OPENAI_API_KEY; optional OPENAI_BASE_URL, OPENAI_MODEL
npm install
npm run dev

Open http://localhost:3000.

.env is gitignored — never commit secrets.


Appendix: Troubleshooting

“AudioContext was not allowed to start” / silence
Playback must start from a real user click (e.g. Play). The engine calls Tone.start() from that path (src/audio/Song.tsx).
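
A minimal sketch of that click path, with the audio calls injected so it stays dependency-free: in the app, unlock would presumably be Tone.start (which returns a promise) and startTransport would start the Tone transport. The names AudioDeps and createPlayHandler are hypothetical and not taken from src/audio/Song.tsx.

```typescript
type AudioDeps = {
  unlock: () => Promise<void>; // assumption: Tone.start in the real app
  startTransport: () => void; // assumption: starts the Tone transport
};

// Returns a handler meant to be called directly from a click event, so the
// browser treats the audio unlock as user-initiated.
function createPlayHandler(deps: AudioDeps): () => Promise<void> {
  let unlocked = false;
  return async () => {
    if (!unlocked) {
      await deps.unlock();
      unlocked = true;
    }
    deps.startTransport();
  };
}

// Usage sketch (browser): button.addEventListener("click", createPlayHandler(deps));
```

Calling the handler from a timer or on page load instead of a click is the usual cause of the warning and of silent playback.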

param must be an AudioParam / SSR
Never construct Tone nodes in render or module scope; use useEffect on the client.
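
The same rule can be expressed outside React as deferred, client-only construction. The helper below is a hypothetical sketch (clientOnly is not part of Tone.js or the repo): it builds the node on first use and refuses to run on the server. Inside a component, the call to the returned getter would live in useEffect.

```typescript
// Wrap a node factory so nothing is constructed at import or render time.
function clientOnly<T>(
  factory: () => T,
  isClient: () => boolean = () => "window" in globalThis,
): () => T {
  let instance: T | undefined;
  let created = false;
  return () => {
    if (!isClient()) {
      throw new Error("audio nodes can only be created in the browser");
    }
    if (!created) {
      instance = factory(); // runs once, on first client-side call
      created = true;
    }
    return instance as T;
  };
}

// Usage sketch: the factory stands in for something like `new Tone.Synth()`.
let constructed = 0;
const getNode = clientOnly(
  () => {
    constructed += 1;
    return { id: constructed };
  },
  () => true, // pretend we are in the browser for this example
);
getNode();
getNode();
console.log(constructed); // 1
```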

Generation 429 / quota
Use ⚡ demo layout (offline), or fix billing, the key, or the proxy. To switch models, set OPENAI_MODEL in .env (default gpt-5.2).

Built With

  • openui