VidVoice

VidVoice is an AI-powered Streamlit application that generates YouTube scripts and voiceovers based on video titles. It helps content creators automate video production with minimal effort.

Live Demo

VidVoice Streamlit App

Features

  • AI-Powered Script Generation: Generates high-quality YouTube scripts based on the video title.
  • AI Voiceovers: Converts scripts into natural-sounding voiceovers using TTS models.
  • Multiple AI Models: Supports Google Gemini, Groq, and ElevenLabs APIs for text and speech generation.
  • Editable Scripts: Modify generated scripts before converting them to voiceovers.
  • Easy Download: Listen to and download generated voiceovers.

Requirements

  • Google Gemini API Key
  • Groq API Key
  • ElevenLabs API Key (for advanced TTS)

Installation

  1. Clone the repository:

    git clone https://github.com/your-username/VidVoice.git
    cd VidVoice

  2. Install dependencies:

    pip install -r requirements.txt

  3. Run the Streamlit app:

    streamlit run app.py

Usage

1. Generate YouTube Scripts

  • Enter the video title.
  • Set the desired video length.
  • Click "Generate Script" to create the script.
  • Edit the script if needed.
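
As a rough illustration of how a desired video length could translate into a script size, the sketch below converts minutes into a target word count. The 150 words-per-minute narration rate and the function name target_word_count are assumptions made for this example, not values taken from VidVoice.

```python
def target_word_count(minutes: float, words_per_minute: int = 150) -> int:
    """Estimate how many words a script needs to fill `minutes` of narration.

    The default rate of 150 words per minute is an assumed speaking pace,
    not a value taken from VidVoice.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return round(minutes * words_per_minute)


print(target_word_count(5))  # 5 minutes at the assumed rate
```

A slower or faster narrator can be modelled by passing a different words_per_minute value.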

2. Generate Voiceovers

  • Select a TTS model from the sidebar.
  • Click "Generate Audio" to convert the script into a voiceover.
  • Download the generated audio.
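
For the download step, one possible way to name the saved file is to derive it from the video title. This is a minimal sketch: the helper name audio_filename, the "voiceover" fallback, and the mp3 extension are all assumptions, since the README does not say which audio format the app produces.

```python
import re


def audio_filename(title: str, extension: str = "mp3") -> str:
    """Turn a video title into a filesystem-friendly audio file name."""
    # Lowercase, then collapse every run of non-alphanumeric characters to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Fall back to a generic name when the title has no usable characters.
    return f"{slug or 'voiceover'}.{extension}"


print(audio_filename("10 Tips for Better Sleep!"))
```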

3. Regenerate & Edit

  • Click "Regenerate Script" for a new version.
  • Edit the script and generate audio again if needed.

Future Improvements

  • Local AI Models: Reduce API dependency by integrating local text and voice generation models.
  • Voice Cloning: Allow users to choose from custom and cloned voices.
  • Multi-Language Support: Expand voice generation capabilities to multiple languages.

Inspiration

Creating YouTube content can be time-consuming, requiring both scripting and voiceovers. We wanted to simplify this process using AI, making video creation effortless for everyone, from beginners to professionals.

What it does

VidVoice generates high-quality YouTube scripts based on video titles and converts them into natural-sounding voiceovers using AI-powered TTS models. It streamlines content creation by automating scriptwriting and narration in a few clicks.

How we built it

  • Frontend: Streamlit for an interactive and user-friendly interface.
  • AI Models: Google Gemini and Groq for script generation, ElevenLabs for TTS.
  • Backend: Python for API integration and text processing.
  • Hosting: Deployed using Streamlit Cloud for easy access.
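
The two-stage design described above (a text model followed by a TTS model) can be sketched with interchangeable backends. This is not the project's actual code: the names make_voiceover, ScriptBackend, and TTSBackend are hypothetical, and the stub functions stand in for the real Gemini, Groq, and ElevenLabs calls.

```python
from typing import Callable

# Hypothetical backend signatures: a script generator takes (title, minutes)
# and returns text; a TTS backend takes text and returns audio bytes.
ScriptBackend = Callable[[str, int], str]
TTSBackend = Callable[[str], bytes]


def make_voiceover(title: str, minutes: int,
                   write_script: ScriptBackend,
                   synthesize: TTSBackend) -> tuple[str, bytes]:
    """Run the two stages in order and return (script, audio)."""
    script = write_script(title, minutes)
    audio = synthesize(script)
    return script, audio


# Stub backends used only for this demonstration.
def fake_script(title: str, minutes: int) -> str:
    return f"Welcome to {title}. This video runs {minutes} minutes."


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


script, audio = make_voiceover("My Video", 3, fake_script, fake_tts)
print(script)
print(len(audio), "bytes")
```

Passing the backends as plain callables keeps the pipeline independent of any one provider, so a different text or speech model can be swapped in without touching make_voiceover.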

Challenges we ran into

  • Ensuring high-quality script generation across different video topics.
  • Optimizing voiceover output for natural and engaging narration.
  • Managing API rate limits and response times for seamless performance.
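
One common way to cope with API rate limits is to retry with exponential backoff. The sketch below shows the general technique, not how VidVoice handles it: RateLimitError is a hypothetical exception that a wrapper around the real API client would raise, and call_with_backoff is an illustrative name.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised when an API reports a rate limit."""


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimitError and doubling the delay each time."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            # Give up once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")


# Demonstration: a call that is rate limited twice, then succeeds.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("slow down")
    return "script text"


delays: list[float] = []
# Recording the delays instead of sleeping keeps the demo instant.
print(call_with_backoff(flaky, sleep=delays.append))
print(delays)
```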

Accomplishments that we're proud of

  • Successfully automating the entire YouTube content creation workflow.
  • Integrating multiple AI models for both text and speech generation.
  • Providing a free and accessible tool for content creators.

What we learned

  • How to optimize AI-generated content for better readability and engagement.
  • The importance of voiceover quality in enhancing video production.
  • Balancing API dependencies while maintaining performance and affordability.

What's next for VidVoice

  • Local AI models to reduce dependency on external APIs.
  • Voice cloning for custom and personalized narrations.
  • Multi-language support for a global reach.
  • Background music & sound effects to enhance audio production.

VidVoice

Generate voiceover audio for YouTube videos from the video title

This project is a Streamlit application that leverages various AI models to generate YouTube scripts and voiceovers. It aims to assist content creators in producing high-quality YouTube videos with minimal effort.

  • Note: The ElevenLabs feature is not available at this link.

Audio Demo

Workflow

[Workflow diagram]

Requirements

Installation

  1. Clone the repository:

    git clone https://github.com/your-username/AI-YouTube-Voice-Over-Generator.git
    cd AI-YouTube-Voice-Over-Generator
    
  2. Install the required Python packages:

    pip install -r requirements.txt
    
  3. Run the Streamlit application:

    streamlit run app.py
    

Usage

API Keys

  • Gemini Models: Enter your Gemini API key in the sidebar if you are using Gemini-based text generation models.
  • Groq Models: Enter your Groq API key in the sidebar if you are using Groq-based text generation models.
  • ElevenLabs TTS: Enter your ElevenLabs API key in the sidebar if you select the ElevenLabs text-to-speech model.
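
The rule above, where a key is only needed for the providers actually selected, could be checked before any request is sent. The following is a minimal sketch under assumed names: the provider labels ("gemini", "groq", "elevenlabs") and the function missing_keys are hypothetical and may not match the app's sidebar options.

```python
# Hypothetical provider labels; the real app's sidebar options may differ.
KEY_FOR_PROVIDER = {
    "gemini": "Gemini API key",
    "groq": "Groq API key",
    "elevenlabs": "ElevenLabs API key",
}


def missing_keys(text_provider: str, tts_provider: str,
                 entered: dict[str, str]) -> list[str]:
    """Return the names of keys still needed for the selected models."""
    needed = []
    for provider in (text_provider, tts_provider):
        # Providers without an entry in the table need no key.
        if provider in KEY_FOR_PROVIDER and not entered.get(provider, "").strip():
            needed.append(KEY_FOR_PROVIDER[provider])
    return needed


print(missing_keys("groq", "elevenlabs", {"groq": "example-key"}))
```

An empty result means every selected provider has a key, so generation can proceed.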

Generating Scripts

  1. Enter Video Title: Input the title for your YouTube video.
  2. Set Video Length: Specify the desired video length in minutes.
  3. Generate Script: Click the "Generate Script" button.
  4. Edit Script: Modify the generated script if needed.
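
To make the first three steps concrete, here is one way the title and length might be combined into a prompt for a text model. The prompt wording and the function name build_script_prompt are assumptions for illustration; the prompts VidVoice actually sends are not documented here.

```python
def build_script_prompt(title: str, minutes: int) -> str:
    """Compose a text prompt asking a language model for a narration script."""
    title = title.strip()
    if not title:
        raise ValueError("title must not be empty")
    if minutes < 1:
        raise ValueError("minutes must be at least 1")
    return (
        f'Write a YouTube video script for a video titled "{title}". '
        f"The narration should last about {minutes} minute(s). "
        "Return only the spoken text, with no stage directions or headings."
    )


print(build_script_prompt("How Volcanoes Work", 4))
```

Asking for spoken text only is a design choice that keeps headings and stage directions out of the voiceover.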

Generating Voiceovers

  1. Select TTS Model: Choose a TTS model in the sidebar.
  2. Generate Audio: Click the "Generate Audio" button after the script is ready.
  3. Download Audio: Listen to and download the generated audio.
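
Long scripts may need to be sent to a TTS model in pieces. The sketch below groups whole sentences into chunks under a character limit. It is an illustration only: the 500-character default and the name split_for_tts are assumptions, and the README does not say whether VidVoice chunks its input at all.

```python
import re


def split_for_tts(script: str, max_chars: int = 500) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk
    rather than being cut mid-sentence.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_for_tts("Hello there. This is a test! Is it working? Yes.", max_chars=30))
```

Splitting on sentence boundaries avoids cutting a word or phrase in half, which would otherwise produce audible breaks when the audio pieces are joined.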

Regenerating and Editing

  • Regenerate Script: Click to create a new script if desired.
  • Edit Script: Update the script and convert it to audio.
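
Regenerating or editing a script makes previously generated audio stale. One way to track that is sketched below, using a plain dict in place of Streamlit's st.session_state. The keys "script" and "audio" and both function names are hypothetical, not taken from the app.

```python
def regenerate(state: dict, new_script: str) -> None:
    """Replace the script and discard audio made from the old one."""
    state["script"] = new_script
    state["audio"] = None


def apply_edit(state: dict, edited_script: str) -> None:
    """Store the user's edit; audio is cleared only if the text changed."""
    if edited_script != state.get("script"):
        state["script"] = edited_script
        state["audio"] = None


# A plain dict stands in for st.session_state in this demonstration.
state = {"script": "Draft one.", "audio": b"old-audio"}
apply_edit(state, "Draft one.")          # unchanged text keeps the audio
print(state["audio"])
apply_edit(state, "Draft one, edited.")  # changed text invalidates it
print(state["audio"])
```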

Contributing

Contributions are welcome! Please open an issue or submit a pull request if you have suggestions for improvements or new features.

Acknowledgments

  • mrfakename, for freely hosting the MeloTTS model. Without their contribution, it would not be possible to provide a free TTS service to end users.

Improvements

  • Local Text Generation: Implement the use of local text generation models to enhance performance and reduce dependency on external APIs.
  • Local TTS Models: Integrate local TTS models for better audio generation and faster processing times.
  • Voice Cloning: Allow users to choose from a variety of voices for TTS, including options for voice cloning based on user preferences.

Built With

  • ai
  • elevenlabs-api
  • google
  • googleai