Devpost Submission — Slotify

🧐 Inspiration

Creators don’t have time to manually edit sponsor reads. Sponsorships are one of the biggest ways podcasts make money, but manual ad insertion is time-consuming and doesn’t scale: the creator has to guess the best moment, record the read, cut the audio, and stitch everything together without breaking the flow. Slotify automates this. It finds the best insertion points, generates a natural sponsor read, and exports a seamless final episode in one click.

🤯 What it does

Slotify is an AI-powered ad insertion editor that helps podcasters and other creators add sponsor reads seamlessly in one click.

Users can:

  • Generate a voice-consistent sponsor read
  • Get the top 3 insertion moments, each with pros and cons (based on natural pauses, topic transitions, and pacing)
  • Preview each option (3 seconds before → sponsor → 3 seconds after)
  • Choose the sponsor tone and language
  • Export a final MP3 with the ad blended in naturally

It turns ad monetization from tedious manual editing into a fast, retention-safe workflow.
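
The preview option above (3 seconds before → sponsor → 3 seconds after) comes down to a little timestamp arithmetic. Here is a minimal sketch, assuming slots are expressed in seconds from the start of the episode. `plan_preview` and `PreviewPlan` are hypothetical names used for illustration, not Slotify's actual code.

```python
from dataclasses import dataclass
from typing import Tuple

PREVIEW_PAD_S = 3.0  # seconds of episode context on each side of the sponsor read


@dataclass(frozen=True)
class PreviewPlan:
    """Hypothetical description of one preview clip (all times in seconds)."""
    before: Tuple[float, float]   # (start, end) of the episode audio played first
    sponsor_duration: float       # length of the generated sponsor read
    after: Tuple[float, float]    # (start, end) of the episode audio played last

    @property
    def total_duration(self) -> float:
        return ((self.before[1] - self.before[0])
                + self.sponsor_duration
                + (self.after[1] - self.after[0]))


def plan_preview(slot_s: float, sponsor_s: float, episode_s: float,
                 pad_s: float = PREVIEW_PAD_S) -> PreviewPlan:
    """Clamp the context windows so they never run past the episode edges."""
    if not 0.0 <= slot_s <= episode_s:
        raise ValueError("slot must fall inside the episode")
    before = (max(0.0, slot_s - pad_s), slot_s)
    after = (slot_s, min(episode_s, slot_s + pad_s))
    return PreviewPlan(before, sponsor_s, after)


plan = plan_preview(slot_s=120.0, sponsor_s=15.0, episode_s=600.0)
print(plan.before, plan.after, plan.total_duration)
```

Because only these short windows need to be rendered, a preview of this shape stays cheap no matter how long the episode is.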

👷‍♂️ How we built it

Frontend:

  • React + TypeScript
  • Vite for development and builds
  • ElevenLabs JS SDK for sponsor voice generation

Backend:

  • Node.js + Express API
  • Multer for audio uploads
  • OpenAI SDK for ad copy + slot decisioning
  • FFmpeg + ffprobe for audio slicing, stitching, and final MP3 renders
  • Python audio pipeline with:
    • pydub for audio processing
    • librosa + numpy for beat/energy analysis
    • pyloudnorm for loudness matching
    • soundfile for waveform I/O
  • Optional Whisper for transcript-based slot selection
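
To give a feel for the energy analysis step, here is a minimal sketch of finding natural pauses from frame energy. It uses only the Python standard library as a stand-in for the librosa + numpy analysis listed above. The function names, the 50 ms frame size, the silence threshold, and the minimum pause length are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(samples: Sequence[float], frame_len: int) -> List[float]:
    """RMS energy of consecutive, non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def find_pauses(samples: Sequence[float], sample_rate: int,
                frame_ms: int = 50, silence_rms: float = 0.01,
                min_pause_s: float = 0.4) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of quiet runs long enough to hold a cut."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    energies = frame_rms(samples, frame_len)
    pauses = []
    run_start = None
    # The infinite sentinel closes a quiet run that reaches the end of the audio.
    for i, energy in enumerate(energies + [float("inf")]):
        if energy < silence_rms:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            start_s = run_start * frame_len / sample_rate
            end_s = i * frame_len / sample_rate
            if end_s - start_s >= min_pause_s:
                pauses.append((start_s, end_s))
            run_start = None
    return pauses
```

Candidate pauses found this way could then be combined with transcript and pacing signals to rank the final slots.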

ElevenLabs:

  • Used ElevenLabs as a voice-consistent sponsor production layer, not just basic TTS
  • Generated sponsor reads in the creator’s authorized voice to preserve host identity and avoid the “new voice” interruption
  • Supported style-controlled delivery (calm / energetic / premium) to match the episode vibe
  • Designed for scalable variants like multi-tone versions, multi-language reads, and music-aware inserts (auto-ducking + beat-safe transitions)
  • Integrated directly into our pipeline so creators can generate → preview → export seamlessly
  • Added ethical guardrails with voice-rights certification

🤦‍♂️ Challenges we ran into

  • Getting truly seamless stitching, with no audible “audio restart” or mismatched transitions at the cut points
  • Keeping consistent loudness and pacing across different audio sources
  • Making previews instant and demo-friendly without generating full exports
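
The loudness challenge above can be illustrated with a simplified gain-matching sketch. Note the swap: this uses plain RMS level rather than the perceptual loudness (LUFS) measurement that a library like pyloudnorm provides, so it shows the idea only. `rms_dbfs`, `match_gain`, and the 12 dB boost cap are assumptions, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math
from typing import List, Sequence


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level in dB relative to full scale (1.0). Silence returns -inf."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def match_gain(reference: Sequence[float], insert: Sequence[float],
               max_boost_db: float = 12.0) -> float:
    """Linear gain that brings `insert` to the RMS level of `reference`."""
    ref_db, ins_db = rms_dbfs(reference), rms_dbfs(insert)
    if math.isinf(ref_db) or math.isinf(ins_db):
        return 1.0  # nothing sensible to match against silence
    gain_db = min(ref_db - ins_db, max_boost_db)  # cap boosts to limit noise and clipping
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples: Sequence[float], gain: float) -> List[float]:
    """Scale samples and hard-clip to [-1.0, 1.0] (crude; a limiter would be gentler)."""
    return [max(-1.0, min(1.0, x * gain)) for x in samples]


host = [0.5, -0.5] * 100    # stand-in for episode audio near the slot
ad = [0.25, -0.25] * 100    # stand-in for a quieter sponsor read
matched = apply_gain(ad, match_gain(host, ad))
print(round(rms_dbfs(host), 2), round(rms_dbfs(matched), 2))
```

Measuring the reference over the audio near the slot, rather than the whole episode, is one way to keep the read in line with the local level.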

😊 Accomplishments that we’re proud of

  • A full end-to-end pipeline from upload to final export
  • Top-3 recommended insertion slots with instant preview playback
  • Seamless FFmpeg trimming + stitching for clean audio transitions
  • A polished B2B dashboard with a fast, guided workflow
  • Built-in voice rights checks for ethical sponsor generation

🧠 What we learned

  • How to design an end-to-end audio workflow that stays fast and reliable: analyze → generate sponsor → preview → export
  • How to use FFmpeg trim + concat correctly (timing, formats, loudness) to avoid “audio restart” and mismatched transitions
  • How preview snippets (3 seconds before/after) make heavy processing feel instant and demo-friendly
  • Why voice products need trust features like consent checks and consistent output quality
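
As an illustration of the trim + concat lesson above, here is one way an insertion command could be assembled. This is a sketch and not the exact filter graph Slotify uses. The helper name `build_insert_command`, the 44.1 kHz stereo target, and the 192k bitrate are assumptions. The filters themselves (`asplit`, `atrim`, `asetpts`, `aformat`, `concat`) are standard FFmpeg audio filters.

```python
from typing import List


def build_insert_command(episode: str, sponsor: str, slot_s: float,
                         output: str, sample_rate: int = 44100) -> List[str]:
    """Build an ffmpeg command that splices `sponsor` into `episode` at `slot_s` seconds."""
    if slot_s < 0:
        raise ValueError("slot_s must be non-negative")
    # Force one sample rate and channel layout so concat sees identical segments.
    fmt = f"aformat=sample_rates={sample_rate}:channel_layouts=stereo"
    t = f"{slot_s:.3f}"
    graph = ";".join([
        # Split the episode so the head and tail can be trimmed independently.
        "[0:a]asplit=2[a][b]",
        # asetpts resets timestamps so every segment starts at zero after trimming.
        f"[a]atrim=end={t},asetpts=PTS-STARTPTS,{fmt}[head]",
        f"[b]atrim=start={t},asetpts=PTS-STARTPTS,{fmt}[tail]",
        f"[1:a]asetpts=PTS-STARTPTS,{fmt}[ad]",
        # Join head, sponsor read, and tail into a single audio stream.
        "[head][ad][tail]concat=n=3:v=0:a=1[out]",
    ])
    return ["ffmpeg", "-y", "-i", episode, "-i", sponsor,
            "-filter_complex", graph, "-map", "[out]",
            "-c:a", "libmp3lame", "-b:a", "192k", output]


cmd = build_insert_command("episode.mp3", "sponsor.mp3", 120.0, "final.mp3")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine with FFmpeg installed. Passing a list rather than a shell string avoids quoting problems with the filter graph.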

🤑 What’s next for Slotify

  • Smarter slot selection using transcript-based topic boundaries and pacing signals
  • Even smoother blending with ducking + crossfades
  • A scalable Slotify API for creators, networks, and platforms
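
For the crossfade idea above, a common technique is an equal-power crossfade, where the outgoing and incoming gains follow cosine and sine curves so their squared sum stays at 1. The sketch below is a generic illustration of that technique, not a committed design. `equal_power_crossfade` is a hypothetical helper that operates on two equal-length overlap regions of float samples.

```python
import math
from typing import List, Sequence


def equal_power_crossfade(outgoing: Sequence[float],
                          incoming: Sequence[float]) -> List[float]:
    """Mix two equal-length overlap regions with cosine/sine gain curves."""
    n = len(outgoing)
    if n == 0 or n != len(incoming):
        raise ValueError("overlap regions must be non-empty and the same length")
    mixed = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 1.0       # position in the fade, 0.0 to 1.0
        fade_out = math.cos(t * math.pi / 2.0)  # gain 1 at the start, 0 at the end
        fade_in = math.sin(t * math.pi / 2.0)   # gain 0 at the start, 1 at the end
        mixed.append(outgoing[i] * fade_out + incoming[i] * fade_in)
    return mixed
```

Equal-power curves hold perceived level roughly steady when the two sources are unrelated, such as host speech fading into a sponsor read. For highly similar material, a linear fade may avoid a slight level bump in the middle.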
