Let's Settle This Podcast

Turn ridiculous debates into hilarious AI-generated podcast episodes

Team Information

Team Name: Team Placeholders

Team Members:

David Liang (dvd99@seas.upenn.edu)
Michelle Ma (qianranm@seas.upenn.edu)
Rachel Wu (wurachel@seas.upenn.edu)

Project Description

Let's Settle This Podcast is a web application that transforms absurd topics into comedic "serious debate" podcast episodes. Users submit a ridiculous question (like "Is a hot dog a sandwich?" or "Toilet paper: over or under?"), select a host and four guests from iconic characters (SpongeBob, Trump, Bugs Bunny, Homer Simpson, and more), and generate a 60-90 second AI-powered audio episode complete with character avatars, subtitles, and waveform visualization.

The app creates multi-round debates where characters maintain context, build on each other's arguments, and deliver punchlines - all while taking the topic completely seriously.

What Inspired Us

Amid the deadlines and stress of school life, we hope you can still hold on to a little sense of humor.

We built this project to bring joy through absurdity. The concept is simple: take the most ridiculous topics imaginable and have iconic characters debate them with complete seriousness. There's something inherently funny about hearing SpongeBob argue passionately about whether cereal is soup, or Trump declaring victory in the great "toilet paper orientation" debate.

Our goal was to make people smile, laugh, or feel immersed - turning boredom into fun, one ridiculous debate at a time.

How We Built It

Architecture Overview

The application follows a modern full-stack architecture:

Frontend (Next.js) - Handles user input, character selection, and audio playback with real-time speaker highlighting
Backend (FastAPI) - Orchestrates AI services and manages episode generation
AI Services - OpenAI for script generation; ElevenLabs and TopMediAI for voice cloning and synthesis

Script Generation

We use OpenAI's API with structured output to generate debate scripts. The prompt engineering ensures:

Characters stay in their iconic personalities
Characters can reference the worlds and shows that they came from
Each speaker builds on previous arguments (context-aware)
Forced disagreement creates comedic tension
Episodes stay within the 60-90 second target
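
As a rough illustration of what a structured script might look like once the model's JSON has been decoded, here is a minimal sketch. The project lists Pydantic for validation; this version uses standard-library dataclasses so that it runs without dependencies. The field names (topic, host, guests, lines, speaker, text) and the helper parse_script are assumptions made for the example, not the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    speaker: str  # character name, e.g. "SpongeBob"
    text: str     # what the character says


@dataclass
class Script:
    topic: str
    host: str
    guests: list[str]
    lines: list[Line] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return human-readable schema violations (empty list if valid)."""
        issues = []
        if len(self.guests) != 4:
            issues.append(f"expected 4 guests, got {len(self.guests)}")
        cast = {self.host, *self.guests}
        for i, line in enumerate(self.lines):
            if line.speaker not in cast:
                issues.append(f"line {i}: unknown speaker {line.speaker!r}")
            if not line.text.strip():
                issues.append(f"line {i}: empty text")
        return issues


def parse_script(data: dict) -> Script:
    """Build a Script from the decoded JSON returned by the model."""
    return Script(
        topic=data["topic"],
        host=data["host"],
        guests=list(data["guests"]),
        lines=[Line(**item) for item in data["lines"]],
    )
```

Checking the cast before synthesis means a line attributed to a character who was never selected is caught before any voice API call is made.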

Frontend Experience

The playback interface features:

Character avatars with active speaker highlighting
Real-time synchronized subtitles
Audio waveform visualization
Smooth animations using the Motion library
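
Speaker highlighting and subtitles both depend on knowing which line is playing at a given playback time. The sketch below shows one way that lookup could work, assuming each line's audio duration is known. It is written in Python to match the backend examples; the real player lives in the Next.js frontend and may work differently. The names build_cues and active_cue are hypothetical.

```python
from bisect import bisect_right


def build_cues(lines):
    """lines: iterable of (speaker, text, duration_seconds) tuples.

    Returns a list of cues with cumulative start and end times.
    """
    cues, start = [], 0.0
    for speaker, text, duration in lines:
        cues.append({
            "start": start,
            "end": start + duration,
            "speaker": speaker,
            "text": text,
        })
        start += duration
    return cues


def active_cue(cues, t):
    """Return the cue playing at time t, or None outside the episode."""
    starts = [c["start"] for c in cues]
    i = bisect_right(starts, t) - 1  # last cue that starts at or before t
    if i < 0 or t >= cues[i]["end"]:
        return None
    return cues[i]
```

A player would call active_cue with the current playback time on every tick, highlight the returned speaker's avatar, and show the returned text as the subtitle.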

Tech Stack

Frontend:

Next.js 14+ (App Router)
TypeScript
Tailwind CSS
Shadcn/ui components
Motion (animations)

Backend:

FastAPI (Python 3.11+)
Pydantic (data validation)
OpenAI API (text generation)
ElevenLabs API (text-to-speech & voice cloning)
TopMediAI API (text-to-speech & voice cloning)

How to Run

Prerequisites

Node.js 18+
Python 3.11+
OpenAI API key
ElevenLabs API key
TopMediAI API key

Frontend Setup

cd frontend
npm install
npm run dev

The frontend runs at http://localhost:3000

Backend Setup

cd backend
conda activate [your-env]
pip install .
uvicorn app.main:app --reload

The backend runs at http://localhost:8000

Environment Variables

Create a .env file in the backend directory:

OPENAI_API_KEY=sk-your-openai-key
ELEVENLABS_API_KEY=your-elevenlabs-key
TOPMEDIAI_API_KEY=your-topmediai-key
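
A small startup check can make a missing key obvious before the first request fails. This is a sketch only: it assumes the variables above have already been loaded into the process environment (how the backend loads its .env file is not shown here), and missing_keys is a hypothetical helper name.

```python
import os

# The three keys listed in the backend .env file above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY", "TOPMEDIAI_API_KEY")


def missing_keys(env=os.environ):
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All API keys are set.")
```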

For the frontend, create a .env.local file in the frontend directory:

NEXT_PUBLIC_API_URL=http://localhost:8000

Challenges We Faced

Multi-Round Context-Aware Debate

Ensuring characters maintain context and build on each other's arguments across the episode was tricky. We solved this with careful prompt engineering that includes conversation history in each generation step.
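
To make the idea concrete, here is a minimal sketch of how conversation history could be folded into the prompt for each turn. The function name build_turn_prompt, the prompt wording, and the max_history cutoff are all assumptions made for illustration, not the project's actual prompts.

```python
def build_turn_prompt(topic, speaker, history, max_history=8):
    """Assemble the prompt for one speaker's turn.

    history is a list of (speaker_name, text) tuples in spoken order.
    Only the most recent max_history lines (a positive integer) are kept
    so the prompt stays short.
    """
    recent = history[-max_history:]
    transcript = "\n".join(f"{name}: {text}" for name, text in recent)
    return (
        f"Debate topic: {topic}\n"
        f"Conversation so far:\n{transcript or '(no one has spoken yet)'}\n"
        f"You are {speaker}. Stay in character, respond to the previous "
        f"speaker's argument, and disagree with at least one point."
    )


history = [
    ("SpongeBob", "Cereal is clearly soup. It has broth!"),
    ("Homer Simpson", "Milk is not broth. Mmm... broth."),
]
print(build_turn_prompt("Is cereal soup?", "Bugs Bunny", history))
```

Because every turn sees the earlier lines, each character can rebut a specific claim instead of restating a generic position.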

API Error Handling

Calls to the OpenAI and ElevenLabs APIs can fail because of network issues, rate limits, or timeouts. Each of these failure modes needed retries and graceful handling so that a single failed call did not spoil the user experience.
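
One common pattern for this kind of failure is retrying with exponential backoff and jitter. The sketch below illustrates that pattern under stated assumptions. TransientAPIError is a hypothetical stand-in for whatever rate-limit, timeout, or network exceptions the real clients raise, and the attempt count and delays are illustrative values, not the project's settings.

```python
import random
import time


class TransientAPIError(Exception):
    """Hypothetical stand-in for rate-limit, timeout, or network errors."""


def call_with_retry(fn, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying transient failures with exponential backoff.

    The wait doubles after each failed attempt, plus random jitter of up
    to half the delay so that parallel requests do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientAPIError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

Errors that are not transient, such as an invalid API key, are deliberately left uncaught so that they fail fast instead of being retried.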

Episode Timing Control

Keeping generated content within the 60-90 second target required balancing script length with speech rate. We iterated on prompt constraints and added validation to ensure episodes hit the sweet spot.
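
A simple validation of this kind can estimate spoken duration from word count before any audio is synthesized. The sketch below assumes an average rate of 150 words per minute and a 0.3 second pause between lines. Both numbers are illustrative guesses, not measured values from the project, and the function names are hypothetical.

```python
WORDS_PER_MINUTE = 150  # assumed average speech rate, not a project value


def estimate_seconds(lines, wpm=WORDS_PER_MINUTE, pause=0.3):
    """Estimate spoken duration of a list of line texts.

    Duration is the word count at wpm words per minute, plus a fixed
    pause between consecutive lines.
    """
    words = sum(len(text.split()) for text in lines)
    pauses = pause * max(len(lines) - 1, 0)
    return words / wpm * 60 + pauses


def within_target(lines, low=60.0, high=90.0):
    """True if the estimated duration falls inside the 60-90 second target."""
    return low <= estimate_seconds(lines) <= high
```

A script that falls outside the window can be regenerated or trimmed before spending text-to-speech credits on it.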

What We Learned

AI Integration - Working with the OpenAI API for structured content generation and ElevenLabs for voice synthesis taught us about prompt engineering, API orchestration, and handling async workflows
Full-Stack Development - Building a complete Next.js + FastAPI application with real-time audio playback deepened our understanding of modern web architecture
Prompt Engineering - Crafting prompts to generate structured, comedic debate content with consistent character voices was both an art and a science