Devpost Submission — Slotify

🧐 Inspiration

Sponsorships are one of the biggest ways podcasts make money, but creators don’t have time to edit sponsor reads by hand, and manual ad insertion doesn’t scale. A creator has to guess the best moment, record the read, cut the audio, and stitch everything back together without breaking the flow. Slotify automates this: it finds the best insertion points, generates a natural sponsor read, and exports a seamless final episode in one click.

🤯 What it does

Slotify is an AI-powered ad insertion editor that helps podcasters and other creators add sponsor reads seamlessly in one click.

Users can:

Generate a voice-consistent sponsor read
Get the top 3 insertion moments, each with pros and cons (based on natural pauses, topic transitions, and pacing)
Preview each option (3 seconds before → sponsor → 3 seconds after)
Choose the sponsor tone and language
Export a final MP3 with the ad blended in naturally

It turns ad monetization from tedious manual editing into a fast, retention-safe workflow.
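
To make the preview format concrete (3 seconds before, the sponsor read, 3 seconds after), here is a minimal Python sketch of how the preview segments could be computed. The names `plan_preview`, `PreviewPlan`, and `PREVIEW_PAD_S` are hypothetical and chosen for illustration; the clamping at episode boundaries is an assumption, not a description of Slotify's actual code.

```python
from __future__ import annotations

from dataclasses import dataclass

PREVIEW_PAD_S = 3.0  # seconds of episode audio kept on each side of the ad


@dataclass
class PreviewPlan:
    lead_in: tuple[float, float]   # (start, end) of episode audio before the ad
    lead_out: tuple[float, float]  # (start, end) of episode audio after the ad
    total_s: float                 # preview length, sponsor read included


def plan_preview(slot_s: float, episode_s: float, sponsor_s: float,
                 pad_s: float = PREVIEW_PAD_S) -> PreviewPlan:
    """Clamp the padding to the episode bounds and return the segments to render."""
    if not 0.0 <= slot_s <= episode_s:
        raise ValueError("slot must fall inside the episode")
    start = max(0.0, slot_s - pad_s)      # do not read before the episode starts
    end = min(episode_s, slot_s + pad_s)  # do not read past the episode end
    total = (slot_s - start) + sponsor_s + (end - slot_s)
    return PreviewPlan((start, slot_s), (slot_s, end), total)
```

For a 20-second read inserted at the 10-minute mark, the preview is 26 seconds long, so only a few seconds of episode audio need to be decoded no matter how long the episode is.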

👷‍♂️ How we built it

Frontend:

React + TypeScript
Vite for development and builds
ElevenLabs JS SDK for sponsor voice generation

Backend:

Node.js + Express API
Multer for audio uploads
OpenAI SDK for ad copy + slot decisioning
FFmpeg + ffprobe for audio slicing, stitching, and final MP3 renders
Python audio pipeline with:
- pydub for audio processing
- librosa + numpy for beat/energy analysis
- pyloudnorm for loudness matching
- soundfile for waveform I/O
Optional Whisper for transcript-based slot selection
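
As a rough illustration of the pause-based part of slot selection, the sketch below ranks quiet stretches in a list of per-frame RMS energies (the kind of values an energy analysis step would produce). It is a simplified assumption, not the pipeline's real scoring: it ignores topic transitions, pacing, and the model-based slot decisioning. The function names, the `threshold` of 0.02, and the 0.3-second minimum pause are all hypothetical.

```python
def find_pauses(rms, hop_s, threshold=0.02, min_len_s=0.3):
    """Return (midpoint_s, duration_s) for each quiet run long enough to count."""
    pauses = []
    run_start = None
    # The infinite sentinel closes a quiet run that reaches the end of the audio.
    for i, value in enumerate(list(rms) + [float("inf")]):
        if value < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            duration = (i - run_start) * hop_s
            if duration >= min_len_s:
                midpoint = (run_start + i) / 2 * hop_s
                pauses.append((midpoint, duration))
            run_start = None
    return pauses


def top_slots(rms, hop_s, k=3, **kwargs):
    """Rank pauses by duration (longest first) and keep the k best."""
    pauses = find_pauses(rms, hop_s, **kwargs)
    return sorted(pauses, key=lambda p: p[1], reverse=True)[:k]
```

Here `hop_s` is the time between consecutive energy frames in seconds. Ranking by pause length alone is the simplest possible choice; a fuller version would combine it with the transcript and pacing signals.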

ElevenLabs:

Used ElevenLabs as a voice-consistent sponsor production layer, not just basic TTS
Generated sponsor reads in the creator’s authorized voice to preserve host identity and avoid the “new voice” interruption
Supported style-controlled delivery (calm / energetic / premium) to match the episode vibe
Designed for scalable variants like multi-tone versions, multi-language reads, and music-aware inserts (auto-ducking + beat-safe transitions)
Integrated directly into our pipeline so creators can generate → preview → export seamlessly
Added ethical guardrails with voice-rights certification

🤦‍♂️ Challenges we ran into

Getting truly seamless stitching, with no audible cut or “audio restart” at the insertion point
Keeping consistent loudness and pacing across different audio sources
Making previews instant and demo-friendly without generating full exports
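
The loudness challenge comes down to measuring both sources on the same scale and applying a gain to the sponsor read. The sketch below shows that arithmetic using plain RMS level in dBFS. This is a deliberate simplification: the pipeline itself uses pyloudnorm, which measures perceptual loudness (LUFS) rather than raw RMS. The function names are hypothetical.

```python
import math


def rms_dbfs(samples):
    """RMS level of float samples in [-1, 1], expressed in dBFS."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def match_gain(episode, sponsor):
    """Linear gain that brings the sponsor read to the episode's RMS level."""
    episode_db = rms_dbfs(episode)
    sponsor_db = rms_dbfs(sponsor)
    if math.isinf(episode_db) or math.isinf(sponsor_db):
        raise ValueError("cannot match loudness against silence")
    gain_db = episode_db - sponsor_db
    return 10.0 ** (gain_db / 20.0)  # convert dB to an amplitude multiplier
```

A sponsor read that sits about 6 dB below the episode gets a gain of roughly 2, meaning each sample is doubled before stitching.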

😊 Accomplishments that we’re proud of

A full end-to-end pipeline from upload to final export
Top-3 recommended insertion slots with instant preview playback
Seamless FFmpeg trimming + stitching for clean audio transitions
A polished B2B dashboard with a fast, guided workflow
Built-in voice rights checks for ethical sponsor generation

🧠 What we learned

How to design an end-to-end audio workflow that stays fast and reliable: analyze → generate sponsor → preview → export
How to use FFmpeg trim + concat correctly (timing, formats, loudness) to avoid “audio restart” and mismatched transitions
How preview snippets (3 seconds before/after) make heavy processing feel instant and demo-friendly
Why voice products need trust features like consent checks and consistent output quality
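
The trim + concat lesson above can be sketched as a small Python helper that builds an FFmpeg command. It uses the standard `atrim`, `asetpts`, `aformat`, and `concat` filters; the function name, the 44.1 kHz stereo target, and the MP3 quality setting are assumptions for illustration, not Slotify's actual settings.

```python
def build_insert_command(episode, sponsor, slot_s, output):
    """Return an ffmpeg argv that splices `sponsor` into `episode` at slot_s seconds."""
    # Assumed common format so all three segments match before concatenation.
    fmt = "aformat=sample_rates=44100:channel_layouts=stereo"
    graph = ";".join([
        # Episode audio before the slot.
        f"[0:a]atrim=end={slot_s:.3f},asetpts=PTS-STARTPTS,{fmt}[head]",
        # Episode audio after the slot; timestamps are reset so it starts at 0.
        f"[0:a]atrim=start={slot_s:.3f},asetpts=PTS-STARTPTS,{fmt}[tail]",
        # Sponsor read, converted to the same format.
        f"[1:a]asetpts=PTS-STARTPTS,{fmt}[ad]",
        # Join head, ad, and tail into one audio stream.
        "[head][ad][tail]concat=n=3:v=0:a=1[out]",
    ])
    return [
        "ffmpeg", "-y",
        "-i", episode,
        "-i", sponsor,
        "-filter_complex", graph,
        "-map", "[out]",
        "-c:a", "libmp3lame", "-q:a", "2",
        output,
    ]
```

The helper only builds the argument list; it could be run with `subprocess.run(cmd, check=True)`. Resetting timestamps with `asetpts=PTS-STARTPTS` after each trim is what keeps the segments from carrying their original offsets into the joined output.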

🤑 What’s next for Slotify

Smarter slot selection using transcript-based topic boundaries and pacing signals
Even smoother blending with ducking + crossfades
A scalable Slotify API for creators, networks, and platforms