Inspiration

I'm often in the terminal, from ls to nvim and everything in between. But that's not all I use: I also need communication apps (think Slack, Teams, Zoom, email), browsers, music players, etc.

Now, listening to music is something I do often. When I'm working on something, I might play it from my phone or my laptop. But don't like the song? Want another playlist? Want to replay a favorite? Press skip, double-tap the earphone. Look away, break the flow.

Now, will a terminal music player help me lock in? I hope so! So I built one. With an AI DJ, because why not.

What it does

Well, the simplest way to describe it is: it plays music! Just execute uvx sigplay in the terminal and check it out.

There are two main views: Default and Floppy mix.

The Default view is where you see your library, the currently playing song, and some visualizations and metadata. The visualizations are really cool. I love those!

Floppy mix is a bit different - it lets you select tracks from your library and tell the agent how you want them mixed together. Add a crossfade (fade out / fade in so the music isn't cut off when switching songs), boost the bass, lower the bass, add some reverb, and more!
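To give you an idea of what that involves under the hood, here's a minimal sketch of a crossfade plus a bass/reverb chain, using Pedalboard and NumPy. It assumes mono float32 audio, and the parameter values are illustrative - this is the general technique, not SIGPLAY's actual internals:

    import numpy as np
    from pedalboard import Pedalboard, LowShelfFilter, Reverb

    def crossfade(a: np.ndarray, b: np.ndarray, sr: int, seconds: float = 3.0) -> np.ndarray:
        """Fade track a out while track b fades in, overlapping the two."""
        n = int(sr * seconds)
        fade = np.linspace(0.0, 1.0, n, dtype=np.float32)
        tail = a[-n:] * (1.0 - fade)   # fade out the end of track a
        head = b[:n] * fade            # fade in the start of track b
        # Overlap-add so the music is never cut off between songs
        return np.concatenate([a[:-n], tail + head, b[n:]])

    # "Boost bass, add some reverb" as a Pedalboard effect chain
    board = Pedalboard([
        LowShelfFilter(cutoff_frequency_hz=200, gain_db=6.0),  # bass boost
        Reverb(room_size=0.3, wet_level=0.15),                 # subtle reverb
    ])
    # mixed = board(crossfade(track_a, track_b, 44100), 44100)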

The agent then takes your instructions (don't worry, there are presets), figures out which tools to call, and creates a mix you can save or discard.
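The tool-calling part looks roughly like this with Strands Agents: each effect is a plain Python function exposed via the @tool decorator, and the agent decides which ones to invoke based on your instructions. A hypothetical sketch (the tool name and in-memory mix buffer are mine, not SIGPLAY's, and running it assumes default model credentials are configured):

    import numpy as np
    from pedalboard import Pedalboard, LowShelfFilter
    from strands import Agent, tool

    SR = 44100
    mix = np.zeros(SR * 2, dtype=np.float32)  # stand-in for the selected tracks

    @tool
    def boost_bass(gain_db: float) -> str:
        """Boost the low end of the mix by gain_db decibels."""
        global mix
        mix = Pedalboard([LowShelfFilter(cutoff_frequency_hz=200, gain_db=gain_db)])(mix, SR)
        return f"bass boosted by {gain_db} dB"

    dj = Agent(tools=[boost_bass])
    dj("Make the mix a bit warmer")  # the agent picks boost_bass and a gain value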

Want to try it? Just execute:

uvx sigplay

That's it, that's all. Just make sure you have uv installed.

How I built it

There are two modes in Kiro: spec mode and vibe mode. Spec mode was very helpful when I didn't know where to start. I had no previous experience working with audio, and I learned a lot while building SIGPLAY.

I used spec mode to map out the major features; I'd say it was crucial for building this app this quickly. First, define requirements to make sure you're going to build what you want. Then design: architecture, component structure, a Mermaid data-flow diagram, application structure.

Then, once I had requirements (the what) and design (the how), Kiro produced clear and precise implementation details in the form of tasks. If this structure helped me understand the project, it makes sense that LLMs (which are already very capable) perform a lot better with the spec context.

On the other hand, vibe mode helps in other scenarios - think bug fixes, writing tests, docs, etc. I also used it to build things on top of the spec implementation.

So, with spec mode, the LLM had additional context from the requirements, design, and tasks files. What about vibe mode? That's where steering files came in (they were also consumed by the spec-mode LLM runs. Win-win!). I had a few of them:

  • tech.md - Textual patterns (reactive variables, call_from_thread etc.), Strands Agents integration, Pedalboard audio processing
  • structure.md - General structure of the project, where what goes
  • product.md - UX, keybindings, color palette
  • uv-steering.md - Letting the LLM know I'm using the uv package manager

Those steering files provided additional context to the LLM about the project.
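For a taste of what tech.md captures, here's a hypothetical minimal example of the Textual pattern it describes (reactive variables, plus call_from_thread for thread-safe UI updates) - my illustration, not the actual steering file contents:

    from textual.app import App, ComposeResult
    from textual.reactive import reactive
    from textual.widgets import Static

    class NowPlaying(Static):
        # Reactive variable: assigning to it triggers watch_title() and a re-render
        title: reactive[str] = reactive("--")

        def watch_title(self, title: str) -> None:
            self.update(f"Now playing: {title}")

    class PlayerApp(App):
        def compose(self) -> ComposeResult:
            yield NowPlaying()
        # A background audio thread must never touch the UI directly; instead:
        # self.call_from_thread(setattr, self.query_one(NowPlaying), "title", "track.mp3")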

One more thing that helped was MCP servers. I used three:

  • Context7
  • strands-agents
  • awslabs.aws-documentation-mcp-server

Since LLMs have different knowledge cutoff dates, these MCP servers provided up-to-date info about the tools and libraries I was using for SIGPLAY.

Challenges I ran into

  • AI mix. Choosing the right tools. Why is there a crackling noise? Why is the audio distorted? It's too loud! What guardrails to put in place? Did I just waste 80K tokens on this? Optimize this, optimize that. LOL, building this part was quite something. Funny, because again, I had little to no experience with audio, but I wanted to at least try. Needless to say, Kiro did some heavy lifting here, and did a great job. (See the loudness sketch after this list.)
  • Header, footer, spacing, colors, UI/UX in general. So, how do you design an app like this? I wanted it to look good, and I wanted the controls to be simple. That's why: f for Floppy mix, d for Default view, + for volume up, - for volume down, etc. Still, getting the padding just right, surfacing the info you need, deciding where to put a button, all while working in a TUI - harder than it sounds.
  • Keeping the steering files up-to-date. Remember, the steering files described the project: its structure, the how, what, and where. All of that changed with new tasks, fixes, refactoring, etc. For example: I remove one folder and swap one library for another, then open vibe mode and ask for something to be implemented. If I didn't update the steering files, the LLM has the old context but meets different code when it actually tries to make the change. Not great, and a high chance of worse performance. That's where hooks helped. Create a hook, explain the intent, and that's it. I just need to run it periodically, and my steering files stay up-to-date with the codebase.
  • Unused code, unhandled errors, simplifications in order? That's expected. But as your codebase grows, it becomes harder to spot these things. Thinking what I thought? Why not have a hook that analyzes the codebase and suggests improvements? That's exactly what I did! A really helpful one.
  • Sometimes, while doing an update, Kiro's agent runs an error check - but not always. That's where my third hook came in: making sure a file is error-checked after it's saved.
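On the "too loud / distorted" front: most of those bugs trace back to samples clipping past +/-1.0, and the guardrail is usually some form of peak limiting before playback or export. A minimal sketch, assuming mono float arrays (illustrative, not SIGPLAY's exact code):

    import numpy as np

    def limit_peak(audio: np.ndarray, ceiling: float = 0.95) -> np.ndarray:
        """Scale the whole mix down if any sample would clip.
        Samples beyond +/-1.0 get hard-clipped on output,
        which is what you hear as crackle and distortion."""
        peak = float(np.max(np.abs(audio)))
        return audio * (ceiling / peak) if peak > ceiling else audio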

Accomplishments that I'm proud of

  • uvx sigplay - I love this. Simple. All that's needed to run the app. Awesome.
  • AI mix agent. BPM detection, time-stretching, crossfades, reverb, bass, treble, etc. All working well (see the sketch after this list).
  • Mixing presets
  • Clean architecture
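For context, tempo-matching two tracks with librosa looks roughly like this. A hedged sketch of the general technique, not SIGPLAY's exact code:

    import librosa

    def match_tempo(path_a: str, path_b: str):
        """Estimate each track's BPM, then time-stretch b to match a."""
        y_a, sr_a = librosa.load(path_a)
        y_b, sr_b = librosa.load(path_b)
        bpm_a, _ = librosa.beat.beat_track(y=y_a, sr=sr_a)
        bpm_b, _ = librosa.beat.beat_track(y=y_b, sr=sr_b)
        # rate > 1 speeds b up, rate < 1 slows it down
        return y_a, librosa.effects.time_stretch(y_b, rate=float(bpm_a) / float(bpm_b))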

What I learned

  • I learned a lot about audio processing! It was also my first time using the Textual framework - very flexible and enjoyable to work with.
  • specs > vibe coding for complex stuff. A bit of preparation through requirements and design documents pays off. It really does.
  • Steering docs pay dividends. Create them. Make Kiro create them. With steering files, Kiro's output quality improves noticeably.
  • Hooks need to be fast or manual. Think about what you want a hook to do. Updating steering files on every file save? That would take too much time, so a manual trigger fits better. But error checks on .py files? Those make sense on a file-save trigger: quick, with instant feedback.

What's next for SIGPLAY

I'm planning to add more storage-scanning options, the ability to save and share mixing instructions, playlist creation, and queues. I also experimented with a lyrics view, but that one needs a bit more planning.

Built With

  • kiro
  • librosa
  • pedalboard
  • python
  • strands-agents
  • textual
  • uv