VidVoice

VidVoice is an AI-powered Streamlit application that generates YouTube scripts and voiceovers based on video titles. It helps content creators automate video production with minimal effort.

Live Demo

VidVoice Streamlit App

Features

AI-Powered Script Generation: Generates high-quality YouTube scripts based on the video title.
AI Voiceovers: Converts scripts into natural-sounding voiceovers using TTS models.
Multiple AI Models: Supports Google Gemini, Groq, and ElevenLabs APIs for text and speech generation.
Editable Scripts: Modify generated scripts before converting them to voiceovers.
Easy Download: Listen to and download generated voiceovers.

Requirements

Google Gemini API Key
Groq API Key
ElevenLabs API Key (for advanced TTS)

Installation

Clone the repository:
git clone https://github.com/your-username/VidVoice.git
cd VidVoice
Install dependencies:
pip install -r requirements.txt
Run the Streamlit app:
streamlit run app.py

Usage

1. Generate YouTube Scripts

Enter the video title.
Set the desired video length.
Click "Generate Script" to create the script.
Edit the script if needed.
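
The script step turns a title and a target length into a request for a language model. The following is a minimal sketch of how such a prompt could be built, not the code VidVoice actually uses. The function name `build_script_prompt`, the prompt wording, and the 150 words-per-minute narration rate are all assumptions made for illustration.

```python
# Assumed average narration speed; not a value taken from VidVoice.
WORDS_PER_MINUTE = 150


def build_script_prompt(title: str, minutes: float) -> str:
    """Build an LLM prompt for a script of roughly the requested length."""
    title = title.strip()
    if not title:
        raise ValueError("video title must not be empty")
    if minutes <= 0:
        raise ValueError("video length must be positive")
    # Convert the requested duration into an approximate word budget.
    target_words = round(minutes * WORDS_PER_MINUTE)
    return (
        f'Write a YouTube voiceover script for a video titled "{title}". '
        f"Aim for about {target_words} words "
        f"(roughly {minutes:g} minutes of narration). "
        "Return only the spoken text, without stage directions."
    )


if __name__ == "__main__":
    print(build_script_prompt("what is phenomenon beyond northern lights", 3))
```

Giving the model a word budget rather than a duration tends to be easier to check afterwards, since the length of the returned script can be counted directly.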

2. Generate Voiceovers

Select a TTS model from the sidebar.
Click "Generate Audio" to convert the script into a voiceover.
Download the generated audio.

3. Regenerate & Edit

Click "Regenerate Script" for a new version.
Edit the script and generate audio again if needed.

Future Improvements

✅ Local AI Models – Reduce API dependency by integrating local text and voice generation models.
✅ Voice Cloning – Allow users to choose from custom and cloned voices.
✅ Multi-Language Support – Expand voice generation capabilities to multiple languages.

Inspiration

Creating YouTube content can be time-consuming, requiring both scripting and voiceovers. We wanted to simplify this process using AI, making video creation effortless for everyone, from beginners to professionals.

What it does

VidVoice generates high-quality YouTube scripts based on video titles and converts them into natural-sounding voiceovers using AI-powered TTS models. It streamlines content creation by automating scriptwriting and narration in a few clicks.

How we built it

Frontend: Streamlit for an interactive and user-friendly interface.
AI Models: Google Gemini and Groq for script generation, ElevenLabs for TTS.
Backend: Python for API integration and text processing.
Hosting: Deployed using Streamlit Cloud for easy access.

Challenges we ran into

Ensuring high-quality script generation across different video topics.
Optimizing voiceover output for natural and engaging narration.
Managing API rate limits and response times for seamless performance.
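
One common way to cope with rate limits is to retry a failed call with exponential backoff. The sketch below illustrates that general technique under stated assumptions; it is not taken from the VidVoice code. `RateLimitError` is a hypothetical stand-in for whatever exception a given provider SDK raises, and `call_with_backoff` is an illustrative name.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's 'too many requests' error."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of retries, surface the error to the caller
            # Delay doubles on each attempt; a little jitter spreads out
            # retries from concurrent users.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.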

Accomplishments that we're proud of

Automating the scripting and voiceover steps of the YouTube content creation workflow.
Integrating multiple AI models for both text and speech generation.
Providing a free and accessible tool for content creators.

What we learned

How to optimize AI-generated content for better readability and engagement.
The importance of voiceover quality in enhancing video production.
Balancing API dependencies while maintaining performance and affordability.

What's next for VidVoice

✅ Local AI models to reduce dependency on external APIs.
✅ Voice cloning for custom and personalized narrations.
✅ Multi-language support for a global reach.
✅ Background music & sound effects to enhance audio production.

VidVoice

Generating voiceover audio for YouTube videos based on the YouTube title

This project is a Streamlit application that leverages various AI models to generate YouTube scripts and voiceovers. It aims to assist content creators in producing high-quality YouTube videos with minimal effort.

Note: the ElevenLabs feature is not available at this link.

Audio Demo

Title: what is phenomenon beyond northern lights --- Demo Sample Link

Workflow

[Figure: workflow diagram]

Requirements

Installation

Clone the repository:

git clone https://github.com/your-username/AI-YouTube-Voice-Over-Generator.git
cd AI-YouTube-Voice-Over-Generator

Install the required Python packages:
```
pip install -r requirements.txt
```
Run the Streamlit application:
```
streamlit run app.py
```

Usage

API Keys

Gemini Models: Enter your Gemini API key in the sidebar if you are using Gemini-based text generation models.
Groq Models: Enter your Groq API key in the sidebar if you are using Groq-based text generation models.
ElevenLabs TTS: Enter your ElevenLabs API key in the sidebar if you select the ElevenLabs text-to-speech model.
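
Because each key is needed only when the matching model is selected, the app can check the current selection before making any request. The sketch below shows one way to do that check. The provider identifiers, the key names in `PROVIDER_KEY_NAMES`, and the function `missing_keys` are hypothetical and are not taken from the VidVoice source; any provider absent from the mapping (for example a free hosted TTS model) is assumed to need no key.

```python
# Hypothetical mapping from provider name to the API key it requires.
PROVIDER_KEY_NAMES = {
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
}


def missing_keys(text_provider: str, tts_provider: str, keys: dict) -> list:
    """Return the names of keys the current selection needs but lacks."""
    needed = []
    for provider in (text_provider, tts_provider):
        name = PROVIDER_KEY_NAMES.get(provider.lower())
        if name and name not in needed:
            needed.append(name)
    # A key counts as missing if it is absent or only whitespace.
    return [name for name in needed if not keys.get(name, "").strip()]
```

An empty result means generation can proceed; otherwise the returned names can be shown next to the corresponding sidebar fields.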

Generating Scripts

Enter Video Title: Input the title for your YouTube video.
Set Video Length: Specify the desired video length in minutes.
Generate Script: Click the "Generate Script" button.
Edit Script: Modify the generated script if needed.

Generating Voiceovers

Select TTS Model: Choose a TTS model in the sidebar.
Generate Audio: Click the "Generate Audio" button after the script is ready.
Download Audio: Listen to and download the generated audio.
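
When offering the audio for download, it helps to derive a safe file name from the video title. The helper below is an illustrative sketch; `audio_filename` and the `mp3` default are assumptions, not part of the VidVoice source. In a Streamlit app, a name like this could be passed as the `file_name` argument of `st.download_button`.

```python
import re


def audio_filename(title: str, extension: str = "mp3") -> str:
    """Turn a video title into a lowercase, hyphen-separated file name."""
    # Collapse every run of characters other than a-z and 0-9 into a hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Fall back to a generic name if nothing usable remains.
    return f"{slug or 'voiceover'}.{extension}"
```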

Regenerating and Editing

Regenerate Script: Click to create a new script if desired.
Edit Script: Update the script and convert it to audio.

Contributing

Contributions are welcome! Please open an issue or submit a pull request if you have suggestions for improvements or new features.

Acknowledgments

Thanks to mrfakename for freely hosting the MeloTTS model. Without their contribution, it would not have been possible to provide a free TTS service to end users.

Improvements

Local Text Generation: Implement the use of local text generation models to enhance performance and reduce dependency on external APIs.
Local TTS Models: Integrate local TTS models for better audio generation and faster processing times.
Voice Cloning: Allow users to choose from a variety of voices for TTS, including options for voice cloning based on user preferences.

Built With

ai
elevenlabs-api
google
googleai

Updates

Private user started this project — Feb 23, 2025 01:36 PM EST
