Inspiration
We have always been fascinated by synesthesia, the phenomenon where experiencing one sense involuntarily triggers another. We wanted to build an experience that lets anyone "see" their music and instantly visualize the mood of their environment, and that led us to ask how the same idea could help deaf and hard-of-hearing people. The project also lowers the barrier to entry for music creation: a user can describe a song concept and create it without being able to hear it, then examine the generated snippets to understand how the song is structured and recognize its tempo.
How we built it
Our project is a full-stack application built with React and TypeScript, centered on two primary engines and a supporting backend:
- Audio Spectrogram Visualizer: The flagship feature runs entirely on the client side using the Web Audio API. We process audio in real time from three input modes: file upload, speaker/tab capture, and microphone input. The engine extracts 25 distinct audio features, such as energy, brightness, tempo, Zero-Crossing Rate (ZCR), and Root Mean Square (RMS), and plots each one in its own color so the different aspects of the audio are easy to tell apart (a minimal sketch of the feature extraction appears after this list).
- AI-Driven Music Generation Pipeline: We built a multi-step, automated backend pipeline. When a user describes a mood, we use the Vercel AI SDK to generate a song blueprint and per-section plans, and that blueprint is sent to the ElevenLabs API to synthesize the actual audio (a simplified sketch of the pipeline, including the storage step, follows this list).
- Backend & Storage: Once generated, the audio file is securely uploaded via UploadThing, and the track's metadata is persisted in a Neon PostgreSQL database managed by Drizzle ORM.
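To give a sense of how the visualizer's feature extraction works, here is a minimal sketch of computing two of the 25 features (RMS and zero-crossing rate) from a Web Audio `AnalyserNode` on each animation frame. The names and structure are illustrative, not copied from our actual engine.

```typescript
// Minimal sketch: derive RMS and zero-crossing rate from an AnalyserNode's
// time-domain buffer on every animation frame.
const audioCtx = new AudioContext();
const analyser = audioCtx.createAnalyser();
analyser.fftSize = 2048;
// A source (microphone, <audio> element, or captured tab audio) would be
// connected here, e.g. source.connect(analyser);

const timeDomain = new Float32Array(analyser.fftSize);

function extractFeatures(): { rms: number; zcr: number } {
  analyser.getFloatTimeDomainData(timeDomain);

  let sumSquares = 0;
  let crossings = 0;
  for (let i = 0; i < timeDomain.length; i++) {
    sumSquares += timeDomain[i] * timeDomain[i];
    // Count sign changes between consecutive samples.
    if (i > 0 && Math.sign(timeDomain[i]) !== Math.sign(timeDomain[i - 1])) {
      crossings++;
    }
  }

  const rms = Math.sqrt(sumSquares / timeDomain.length);
  const zcr = crossings / timeDomain.length;
  return { rms, zcr };
}

function draw() {
  const { rms, zcr } = extractFeatures();
  // ...plot each feature in its own color on a <canvas>...
  requestAnimationFrame(draw);
}
```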
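And here is a rough sketch of the generation-and-storage pipeline, under some assumptions: the blueprint schema, the `tracks` table, and the `synthesizeWithElevenLabs` helper are illustrative placeholders (the concrete ElevenLabs call is omitted), and the UploadThing response fields may differ between versions.

```typescript
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";
import { UTApi } from "uploadthing/server";
import { neon } from "@neondatabase/serverless";
import { drizzle } from "drizzle-orm/neon-http";
import { pgTable, serial, text } from "drizzle-orm/pg-core";

// Illustrative blueprint: a song title plus per-section plans.
const blueprintSchema = z.object({
  title: z.string(),
  sections: z.array(
    z.object({ name: z.string(), mood: z.string(), plan: z.string() })
  ),
});

// Illustrative table; the real schema lives in our Drizzle schema file.
const tracks = pgTable("tracks", {
  id: serial("id").primaryKey(),
  title: text("title").notNull(),
  fileUrl: text("file_url").notNull(),
});

const db = drizzle(neon(process.env.DATABASE_URL!));
const utapi = new UTApi();

// Placeholder: synthesize audio from the blueprint via the ElevenLabs API.
// The concrete endpoint/SDK call is intentionally omitted in this sketch.
async function synthesizeWithElevenLabs(
  blueprint: z.infer<typeof blueprintSchema>
): Promise<Blob> {
  throw new Error("not implemented in this sketch");
}

export async function generateTrack(moodDescription: string) {
  // 1. The LLM "directs" the audio model by writing a structured blueprint.
  const { object: blueprint } = await generateObject({
    model: openai("gpt-4o"),
    schema: blueprintSchema,
    prompt: `Plan a short song for this mood: ${moodDescription}`,
  });

  // 2. Synthesize the audio for the blueprint.
  const audio = await synthesizeWithElevenLabs(blueprint);

  // 3. Upload the file via UploadThing, then persist metadata in Neon.
  const uploaded = await utapi.uploadFiles(
    new File([audio], `${blueprint.title}.mp3`, { type: "audio/mpeg" })
  );
  if (uploaded.error || !uploaded.data) throw new Error("Upload failed");

  await db.insert(tracks).values({
    title: blueprint.title,
    fileUrl: uploaded.data.url,
  });
}
```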
The challenges we faced
- Real-Time Digital Signal Processing (DSP): Extracting 25 features concurrently at 60 FPS without dropping frames was computationally intensive. We had to heavily optimize our Web Audio API audio worklets (a simplified worklet sketch follows this list).
- Pipeline Orchestration: Connecting multiple asynchronous services required robust error handling to ensure users wouldn't end up with orphaned audio files or broken database records.
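To illustrate the kind of optimization involved, here is a simplified `AudioWorkletProcessor` sketch that keeps per-sample work off the main thread and throttles messages to roughly 60 per second. It assumes the AudioWorklet global types are available (e.g. via `@types/audioworklet`); the class and message shape are illustrative, not our actual worklet.

```typescript
// feature-processor.ts, loaded with audioCtx.audioWorklet.addModule(...).
// Computes RMS per 128-sample render quantum inside the audio thread and
// posts a throttled stream of results back to the main thread for drawing.
class FeatureProcessor extends AudioWorkletProcessor {
  private lastPost = 0;

  process(inputs: Float32Array[][]): boolean {
    const channel = inputs[0]?.[0];
    if (channel) {
      let sumSquares = 0;
      for (let i = 0; i < channel.length; i++) {
        sumSquares += channel[i] * channel[i];
      }
      const rms = Math.sqrt(sumSquares / channel.length);

      // Throttle messages so the main thread only redraws ~60 times/second.
      if (currentTime - this.lastPost > 1 / 60) {
        this.port.postMessage({ rms });
        this.lastPost = currentTime;
      }
    }
    return true; // keep the processor alive
  }
}

registerProcessor("feature-processor", FeatureProcessor);
```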
What we learned
- Advanced Audio Analysis: We learned how to analyze frequency bins and time-domain data in depth using the Web Audio API, gaining a practical understanding of concepts like spectral flatness, tonnetz, and dynamic range (a spectral-flatness example follows this list).
- AI Tool Chaining: We discovered how to effectively chain different AI models together—using an LLM to "direct" an audio-generation model yields much more structured and coherent musical results than prompting the audio model directly.
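As an example of one of these concepts, spectral flatness is the geometric mean of the magnitude spectrum divided by its arithmetic mean: values near 1 indicate noise-like audio, values near 0 indicate tonal audio. Below is an illustrative helper computed from an `AnalyserNode`'s frequency bins; our engine's actual implementation differs.

```typescript
// Spectral flatness = geometric mean / arithmetic mean of the magnitude
// spectrum. The AnalyserNode reports bins in dB, so convert to linear first.
function spectralFlatness(analyser: AnalyserNode): number {
  const bins = new Float32Array(analyser.frequencyBinCount);
  analyser.getFloatFrequencyData(bins); // values in dBFS

  let logSum = 0;
  let sum = 0;
  for (let i = 0; i < bins.length; i++) {
    const magnitude = Math.pow(10, bins[i] / 20); // dB -> linear magnitude
    logSum += Math.log(magnitude + 1e-12); // epsilon avoids log(0)
    sum += magnitude;
  }

  const geometricMean = Math.exp(logSum / bins.length);
  const arithmeticMean = sum / bins.length;
  return arithmeticMean > 0 ? geometricMean / arithmeticMean : 0;
}
```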
Built With
- canvas-api
- drizzle-orm
- elevenlabs
- neon-postgresql
- openai
- playwright
- react
- scss-(openprops)
- typescript
- uploadthing
- vercel
- vercel-ai-sdk
- web-audio-api
- webspatial