VidVoice
VidVoice is an AI-powered Streamlit application that generates YouTube scripts and voiceovers based on video titles. It helps content creators automate video production with minimal effort.
Live Demo
Features
- AI-Powered Script Generation: Generates high-quality YouTube scripts based on the video title.
- AI Voiceovers: Converts scripts into natural-sounding voiceovers using TTS models.
- Multiple AI Models: Supports Google Gemini, Groq, and ElevenLabs APIs for text and speech generation.
- Editable Scripts: Modify generated scripts before converting them to voiceovers.
- Easy Download: Listen to and download generated voiceovers.
Requirements
- Google Gemini API Key
- Groq API Key
- ElevenLabs API Key (for advanced TTS)
Installation
- Clone the repository:
  git clone https://github.com/your-username/VidVoice.git
  cd VidVoice
- Install dependencies:
  pip install -r requirements.txt
- Run the Streamlit app:
  streamlit run app.py
Usage
1. Generate YouTube Scripts
- Enter the video title.
- Set the desired video length.
- Click "Generate Script" to create the script.
- Edit the script if needed.
2. Generate Voiceovers
- Select a TTS model from the sidebar.
- Click "Generate Audio" to convert the script into a voiceover.
- Download the generated audio.
3. Regenerate & Edit
- Click "Regenerate Script" for a new version.
- Edit the script and generate audio again if needed.
Future Improvements
✅ Local AI Models – Reduce API dependency by integrating local text and voice generation models.
✅ Voice Cloning – Allow users to choose from custom and cloned voices.
✅ Multi-Language Support – Expand voice generation capabilities to multiple languages.
Inspiration
Creating YouTube content can be time-consuming, requiring both scripting and voiceovers. We wanted to simplify this process using AI, making video creation effortless for everyone, from beginners to professionals.
What it does
VidVoice generates high-quality YouTube scripts based on video titles and converts them into natural-sounding voiceovers using AI-powered TTS models. It streamlines content creation by automating scriptwriting and narration in a few clicks.
How we built it
- Frontend: Streamlit for an interactive and user-friendly interface.
- AI Models: Google Gemini and Groq for script generation, ElevenLabs for TTS.
- Backend: Python for API integration and text processing.
- Hosting: Deployed using Streamlit Cloud for easy access.
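The two-stage flow described above (a text model writes the script, then a TTS model narrates it) can be sketched roughly as follows. This is only an illustration, not the app's actual code: `make_voiceover`, `TextBackend`, `SpeechBackend`, and the two stand-in backends are hypothetical names, and the real Gemini, Groq, and ElevenLabs calls would replace the stand-ins.

```python
from typing import Callable, Dict

# Hypothetical backend signatures; the real app's functions may differ.
TextBackend = Callable[[str], str]      # prompt -> script text
SpeechBackend = Callable[[str], bytes]  # script text -> audio bytes


def make_voiceover(
    title: str,
    text_backend: TextBackend,
    speech_backend: SpeechBackend,
) -> Dict[str, object]:
    """Run the two stages in order: script generation, then TTS."""
    script = text_backend(f"Write a YouTube script for a video titled: {title}")
    audio = speech_backend(script)
    return {"script": script, "audio": audio}


# Stand-in backends so the sketch runs without any API key.
def fake_text_backend(prompt: str) -> str:
    return f"[script for prompt: {prompt}]"


def fake_speech_backend(script: str) -> bytes:
    return script.encode("utf-8")


result = make_voiceover("Beyond the northern lights", fake_text_backend, fake_speech_backend)
print(len(result["audio"]), "bytes of placeholder audio")
```

Passing the backends in as plain callables is one way to keep the provider choice (Gemini or Groq for text, ElevenLabs or another model for speech) separate from the pipeline itself.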
Challenges we ran into
- Ensuring high-quality script generation across different video topics.
- Optimizing voiceover output for natural and engaging narration.
- Managing API rate limits and response times for seamless performance.
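One common way to cope with rate limits is to retry with exponential backoff. The sketch below is an assumption about how that could look, not a description of what VidVoice does: `RateLimitError` is a hypothetical stand-in for whatever exception a given provider's client raises, and `call_with_backoff` is an illustrative helper.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimitError, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Exponential delay plus a little jitter to avoid synchronized retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the helper can be exercised in tests without actually waiting.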
Accomplishments that we're proud of
- Successfully automating the entire YouTube content creation workflow.
- Integrating multiple AI models for both text and speech generation.
- Providing a free and accessible tool for content creators.
What we learned
- How to optimize AI-generated content for better readability and engagement.
- The importance of voiceover quality in enhancing video production.
- Balancing API dependencies while maintaining performance and affordability.
What's next for VidVoice
✅ Local AI models to reduce dependency on external APIs.
✅ Voice cloning for custom and personalized narrations.
✅ Multi-language support for a global reach.
✅ Background music & sound effects to enhance audio production.
VidVoice
Generates voiceover audio for YouTube videos based on the video title
This project is a Streamlit application that leverages various AI models to generate YouTube scripts and voiceovers. It aims to assist content creators in producing high-quality YouTube videos with minimal effort.
- The ElevenLabs feature is not available at this link
Audio Demo
- Title: what is phenomenon beyond northern lights --- Demo Sample Link
Workflow
Requirements
Installation
Clone the repository:
  git clone https://github.com/your-username/AI-YouTube-Voice-Over-Generator.git
  cd AI-YouTube-Voice-Over-Generator
Install the required Python packages:
  pip install -r requirements.txt
Run the Streamlit application:
  streamlit run app.py
Usage
API Keys
- Gemini Models: Enter your Gemini API key in the sidebar if you are using Gemini-based text generation models.
- Groq Models: Enter your Groq API key in the sidebar if you are using Groq-based text generation models.
- ElevenLabs TTS: Enter your ElevenLabs API key in the sidebar if you select the ElevenLabs text-to-speech model.
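Since each key is only needed when the matching provider is selected, a small check before calling any API can tell the user which key is still missing. The following is a minimal sketch under assumptions: the provider labels, the key names in `PROVIDER_KEY_NAMES`, and the `missing_keys` helper are all hypothetical and not taken from the app.

```python
from typing import Dict, List

# Hypothetical mapping from a provider label to the key it needs.
PROVIDER_KEY_NAMES: Dict[str, str] = {
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
}


def missing_keys(selected_providers: List[str], entered_keys: Dict[str, str]) -> List[str]:
    """Return the key names still needed for the selected providers."""
    missing = []
    for provider in selected_providers:
        key_name = PROVIDER_KEY_NAMES.get(provider)
        # Providers without an entry (e.g. a free hosted TTS) need no key.
        if key_name and not entered_keys.get(key_name, "").strip():
            missing.append(key_name)
    return missing


# Groq key entered, ElevenLabs key not entered yet.
print(missing_keys(["groq", "elevenlabs"], {"GROQ_API_KEY": "gsk-example"}))
# prints ['ELEVENLABS_API_KEY']
```

In a Streamlit app, the `entered_keys` dictionary would typically be filled from sidebar text inputs, and a non-empty result could be shown as a warning instead of sending the request.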
Generating Scripts
- Enter Video Title: Input the title for your YouTube video.
- Set Video Length: Specify the desired video length in minutes.
- Generate Script: Click the "Generate Script" button.
- Edit Script: Modify the generated script if needed.
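The title and the length in minutes have to be turned into a prompt for the text model at some point. A rough sketch of that step is shown below. The function name `build_script_prompt`, the prompt wording, and the pace of 150 words per minute are assumptions made for illustration; the app's real prompt is not shown in this document.

```python
def build_script_prompt(title: str, minutes: float, words_per_minute: int = 150) -> str:
    """Turn a title and target length into a prompt for a text model."""
    if not title.strip():
        raise ValueError("title must not be empty")
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    # Assumed narration pace; adjust for the voice actually used.
    target_words = round(minutes * words_per_minute)
    return (
        f"Write a YouTube voiceover script for a video titled '{title.strip()}'. "
        f"Aim for about {target_words} words, roughly {minutes:g} minutes when read aloud. "
        "Return only the narration text."
    )


print(build_script_prompt("what is phenomenon beyond northern lights", 2))
```

Converting minutes to a word count gives the model a concrete target, which tends to be easier to follow than a duration alone.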
Generating Voiceovers
- Select TTS Model: Choose a TTS model in the sidebar.
- Generate Audio: Click the "Generate Audio" button after the script is ready.
- Download Audio: Listen to and download the generated audio.
Regenerating and Editing
- Regenerate Script: Click to create a new script if desired.
- Edit Script: Update the script and convert it to audio.
Contributing
Contributions are welcome! Please open an issue or submit a pull request if you have suggestions for improvements or new features.
Acknowledgments
- mrfakename, for freely hosting the MeloTTS model. Without their contribution, it would not have been possible to provide a free TTS service to end users.
Improvements
- Local Text Generation: Implement the use of local text generation models to enhance performance and reduce dependency on external APIs.
- Local TTS Models: Integrate local TTS models for better audio generation and faster processing times.
- Voice Cloning: Allow users to choose from a variety of voices for TTS, including options for voice cloning based on user preferences.
Built With
- ai
- elevenlabs-api
- googleai