🎙️ PodGen - AI Podcast Generator

Winter 30 Hackathon Submission

Transform any text content into engaging, natural-sounding podcast conversations with AI-powered script generation and neural text-to-speech.

Demo Video: https://youtu.be/nE_eR3pYIU8

🌟 Features

Content Input Options

📄 File Upload: PDF, TXT, DOCX, HTML, EPUB
🔗 URL Paste: Articles, blogs, Wikipedia pages
🔍 Online Search: AI-powered research using Groq
✏️ Direct Paste: Copy-paste text content

Script Generation

🎭 Two Distinct Characters: Priya (Female Host) & Arjun (Male Co-host)
🗣️ Natural Hinglish: 74% English, 20% Hindi, 3% conversational fillers, 3% formal pauses
💬 Conversational Elements: "Hmmm...", "Accha...", "Is that so?", "Ohh, I see!"
✨ Professional Tone: Dignified language, no slang
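
The style rules above can be stated directly in the LLM prompt. PodGen's actual prompt is not included in this README, so the following is only a minimal sketch: build_script_prompt, LANGUAGE_MIX and FILLERS are hypothetical names, and the "Name:" line format is an assumption.

```python
# Hypothetical sketch: PodGen's real prompt is not published in this README.
LANGUAGE_MIX = {
    "English": 74,
    "Hindi": 20,
    "conversational fillers": 3,
    "formal pauses": 3,
}
FILLERS = ["Hmmm...", "Accha...", "Is that so?", "Ohh, I see!"]


def build_script_prompt(content: str, host: str = "Priya", cohost: str = "Arjun") -> str:
    """Assemble an LLM prompt that spells out the style rules explicitly."""
    if sum(LANGUAGE_MIX.values()) != 100:
        raise ValueError("language mix must add up to 100%")
    mix = ", ".join(f"{pct}% {name}" for name, pct in LANGUAGE_MIX.items())
    fillers = ", ".join(f'"{f}"' for f in FILLERS)
    return (
        f"Write a podcast conversation between {host} (female host) "
        f"and {cohost} (male co-host).\n"
        f"Language mix: {mix}.\n"
        f"Use fillers such as {fillers} sparingly.\n"
        "Keep the tone professional and dignified; avoid slang.\n"
        # Assumed output format: one speaker-prefixed line per turn.
        f"Start every line with '{host}:' or '{cohost}:'.\n\n"
        f"Source content:\n{content}"
    )
```

Keeping the percentages in one dictionary makes it easy to check that they still total 100 if the mix is ever tuned.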

Audio Generation

🔊 Edge TTS Neural Voices: High-quality Microsoft text-to-speech
🇮🇳 Multi-Language Support: English, Hindi, Tamil, Telugu, Bengali, Kannada, Malayalam, Marathi, Gujarati
🎧 Voice Preview: Test voices before generating
📥 Downloadable MP3: Save combined podcast file

Modern UI/UX

🎨 Beautiful Audio Player: Canvas waveform visualization
📝 Editable Names: Rename projects and audio files
💾 Auto-Save: Local storage persistence
📱 Responsive Design: Works on all devices

🚀 Quick Start

Option 1: One-Click Start (Recommended)

Mac/Linux:

chmod +x start.sh
./start.sh

Windows:

start.bat

This will automatically:

Install all dependencies
Start the backend server (port 8000)
Start the frontend server (port 5173)

Option 2: Manual Setup

Backend:

cd backend
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Create .env file with your Groq API key
echo "GROQ_API_KEY=your_api_key_here" > .env

python main.py

Frontend (run from the project root, in a second terminal):

npm install
npm run dev

Option 3: Google Colab (No Setup Required)

Open Podcast_Creator_Colab.ipynb in Google Colab
Run all cells sequentially
Follow interactive prompts

📍 Access Points

Frontend UI: http://localhost:5173
Backend API: http://localhost:8000
API Documentation: http://localhost:8000/docs

🛠️ Technology Stack

Component	Technology
Frontend	React 18, TypeScript, Tailwind CSS, Vite
Backend	FastAPI, Python 3.9+
AI/LLM	Groq (Llama 3.1-8B Instant)
TTS	Edge TTS (Microsoft Neural Voices)
Audio	HTML5 Canvas Visualization

📁 Project Structure

podgen-ai-podcast-generator/
├── src/                          # React frontend source
│   ├── app/
│   │   ├── App.tsx              # Main application
│   │   └── components/
│   │       ├── UploadStep.tsx   # Step 1: Content input
│   │       ├── ScriptStep.tsx   # Step 2: Script generation
│   │       └── AudioStep.tsx    # Step 3: Audio synthesis
│   └── styles/                  # CSS and themes
├── backend/
│   ├── main.py                  # FastAPI server
│   ├── requirements.txt         # Python dependencies
│   └── .env                     # API keys (create this)
├── Winter 30 Hackathon deliverables/
│   ├── backend/                 # Standalone backend
│   ├── frontend/                # Standalone frontend
│   └── docs/                    # Documentation
├── Podcast_Creator_Colab.ipynb  # Google Colab notebook
├── start.sh                     # Mac/Linux startup
├── start.bat                    # Windows startup
├── package.json                 # Node.js config
└── README.md                    # This file

🔧 API Endpoints

Content Processing

POST /api/content/wikipedia - Fetch Wikipedia article
POST /api/content/perplexity - Search and summarize topics
POST /api/content/url - Extract content from URL
POST /api/content/upload - Process uploaded files

Script Generation

POST /api/script/generate - Generate conversational script
POST /api/script/summarize - Summarize long content

Audio Generation

POST /api/audio/generate - Generate podcast audio from script
GET /audio/{filename} - Serve generated audio files
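
The POST endpoints above can also be called from a script without the frontend. The request and response schemas are not documented in this README, so the payload field name used here ("content") is a guess; check http://localhost:8000/docs for the real schema. build_post and call are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend port listed in this README


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a backend endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path: str, payload: dict, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_post(path, payload), timeout=timeout) as resp:
        return json.load(resp)


# Example (requires the backend to be running; "content" is an assumed field name):
# result = call("/api/script/generate", {"content": "Text to turn into a podcast"})
# print(result)
```

The long default timeout is deliberate, since script and audio generation may take a while to respond.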

📖 Usage Guide

Step 1: Content Input

Choose your content source:
- Wikipedia: Search by topic
- URL: Paste article link
- Upload: Select file (PDF, TXT, DOCX, etc.)
- Search: AI-powered topic research
Click "Upload content from source"
Preview content in the right panel

Step 2: Script Generation

Content is automatically displayed
Click "Generate script"
Review the generated conversation between Priya and Arjun
Click "Next" to proceed

Step 3: Audio Generation

Select voice for Priya (P1)
Select voice for Arjun (P2)
Click "Generate audio"
Listen to the podcast in the player
Download the MP3 file
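
For readers curious how a two-voice script can become a single MP3, here is a rough sketch using the edge-tts Python package. It is not PodGen's actual backend code: the "Priya:" / "Arjun:" line format, the voice choices and the function names are all assumptions, and appending MP3 chunks back to back is a simplification.

```python
import re

# These are real Edge TTS voice names, but which voices PodGen uses is an assumption.
VOICES = {"Priya": "en-IN-NeerjaNeural", "Arjun": "en-IN-PrabhatNeural"}

# Assumed script format: one "Speaker: text" turn per line.
LINE_RE = re.compile(r"^\s*(Priya|Arjun)\s*:\s*(.+)$")


def parse_script(script: str) -> list[tuple[str, str]]:
    """Split a script into (speaker, text) turns, skipping unrecognized lines."""
    turns = []
    for line in script.splitlines():
        match = LINE_RE.match(line)
        if match:
            turns.append((match.group(1), match.group(2).strip()))
    return turns


async def synthesize(script: str, out_path: str = "podcast.mp3") -> None:
    """Stream each turn through Edge TTS and append the audio to one file."""
    import edge_tts  # pip install edge-tts; imported lazily so parsing works without it

    with open(out_path, "wb") as out:
        for speaker, text in parse_script(script):
            communicate = edge_tts.Communicate(text, VOICES[speaker])
            async for chunk in communicate.stream():
                if chunk["type"] == "audio":
                    out.write(chunk["data"])


# Example (needs an internet connection):
# import asyncio
# asyncio.run(synthesize("Priya: Welcome!\nArjun: Accha, what is today's topic?"))
```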

🎯 Key Innovations

Hinglish Prompting: Carefully engineered prompts for natural Hindi-English code-switching
Character Consistency: Fixed roles (Priya/Arjun) with gender-appropriate voices
Canvas Waveform: Smooth 60fps audio visualization
Smart Naming: Auto-extracts 1-2 keywords from content for project titles
Multi-Source Content: Flexible input from various sources
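
As an illustration of the Smart Naming idea, a simple frequency count can produce a one- or two-word title. How PodGen actually picks keywords is not described here, so treat this as a sketch under that assumption; suggest_title and STOPWORDS are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"that", "this", "with", "from", "have", "were", "will", "they", "their", "about"}


def suggest_title(content: str, max_keywords: int = 2) -> str:
    """Return the most frequent non-stopword terms as a short project title."""
    words = re.findall(r"[A-Za-z]{4,}", content.lower())  # ignore very short words
    counts = Counter(word for word in words if word not in STOPWORDS)
    top = [word for word, _ in counts.most_common(max_keywords)]
    return " ".join(word.capitalize() for word in top) or "Untitled Podcast"
```

Empty or very short content falls back to a placeholder title instead of returning an empty string.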

🔑 API Keys Setup

Get Groq API Key (Required):
- Visit: https://console.groq.com/keys
- Sign up for free account
- Create new API key

Create .env file:

cd backend
echo "GROQ_API_KEY=your_actual_key_here" > .env

Restart backend if already running
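
A quick way to confirm the key was saved correctly is to parse the .env text and look for the common mistakes. This is a standalone sketch, not part of the backend; parse_env, groq_key_problem and PLACEHOLDERS are hypothetical names, and the placeholder values are the ones used in this README.

```python
# Placeholder strings used in this README's example commands.
PLACEHOLDERS = {"your_api_key_here", "your_actual_key_here"}


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def groq_key_problem(env: dict):
    """Return a description of what is wrong with the key, or None if it looks set."""
    key = env.get("GROQ_API_KEY", "")
    if not key:
        return "GROQ_API_KEY is missing or empty"
    if key in PLACEHOLDERS:
        return "GROQ_API_KEY is still the placeholder value"
    return None


# Example:
# from pathlib import Path
# print(groq_key_problem(parse_env(Path("backend/.env").read_text())))
```

This only checks that a value is present; it does not verify that Groq accepts the key.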

🐛 Troubleshooting

Backend won't start

✅ Ensure Python 3.9+ installed: python3 --version
✅ Check port 8000 available: lsof -i :8000
✅ Verify GROQ_API_KEY in .env file
✅ Install dependencies: pip install -r backend/requirements.txt

Frontend won't start

✅ Ensure Node.js 16+ installed: node --version
✅ Check port 5173 available: lsof -i :5173
✅ Install dependencies: npm install
✅ Clear cache: rm -rf node_modules package-lock.json && npm install

Content fetch fails

✅ Backend must be running on port 8000
✅ Check browser console for errors
✅ Verify GROQ_API_KEY is valid
✅ Check internet connection

Audio generation issues

✅ Edge TTS requires internet connection
✅ Check backend console for errors
✅ Verify script was generated successfully
✅ Ensure audio_output directory exists in backend/
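
The lsof commands above are only available on Mac/Linux. As a cross-platform alternative, the small sketch below checks whether something is already listening on the backend and frontend ports; port_in_use is a hypothetical helper, not part of the project.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


# Example: ports taken from this README (8000 = backend, 5173 = frontend).
# for name, port in (("backend", 8000), ("frontend", 5173)):
#     print(name, port, "in use" if port_in_use(port) else "free")
```

A port reported as "in use" before you start PodGen means another process holds it; after startup it means the server is up.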

📦 Deliverables

This project includes complete deliverables for the Winter 30 Hackathon:

✅ Full-stack web application (React + FastAPI)
✅ Google Colab notebook for standalone use
✅ Complete documentation and setup instructions
✅ Demo video and usage guide
✅ Source code with detailed comments
✅ Attribution and license information

See Winter 30 Hackathon deliverables/ folder for organized submission files.

📝 Notes

The Groq API offers a generous free tier (14,400 requests/day)
Edge TTS is free and requires no API key
Generated audio is saved locally in backend/audio_output/
Project state persists in browser local storage
Hinglish output works best when the source content is in English

📄 License

This project is for personal and educational use.

🤝 Contributing

Feel free to submit issues or pull requests!

Built With

cursor
javascript
jupyter-notebook
ml
python
vibecoding

Updates

Shrikrithika N started this project — Apr 03, 2026 12:01 AM EDT
