YouTube Video Lecture Notes Generator

Inspiration

As a developer passionate about leveraging AI to enhance learning and productivity, I was inspired by the countless hours spent watching educational YouTube videos and manually taking notes. Many lectures are long and dense, making it challenging to capture key insights efficiently. I envisioned a tool that could automate the process of extracting transcripts, cleaning text, and generating structured, AI-powered summaries. This project aims to democratize access to high-quality educational content by transforming passive viewing into active, organized learning experiences.

What it does

The YouTube Video Lecture Notes Generator is a full-stack web application that automatically processes YouTube videos to create comprehensive, structured notes. Users can input a single video URL, a batch of up to 10 videos, or a playlist. The app extracts the video transcript, cleans and processes the text, and uses Google's Gemini AI to generate detailed notes including:

Video overview and metadata (title, channel, duration)
Section-by-section breakdowns with timestamps
Detailed notes and key takeaways
Quiz questions for reinforcement
References and suggested next steps
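The note structure listed above could be modeled roughly as follows. This is a minimal sketch, not the project's actual schema: the class names `Section` and `LectureNotes`, every field name, and the timestamp format are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Section:
    """One section of the lecture (hypothetical shape)."""
    timestamp: str  # e.g. "00:04:30"; the exact format is an assumption
    title: str
    notes: List[str] = field(default_factory=list)


@dataclass
class LectureNotes:
    """Top-level notes object mirroring the list above (hypothetical shape)."""
    title: str
    channel: str
    duration: str
    sections: List[Section] = field(default_factory=list)
    key_takeaways: List[str] = field(default_factory=list)
    quiz_questions: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses, so sections serialize too.
        return json.dumps(asdict(self), indent=2)
```

A frontend with tabbed displays could then map each top-level field to one tab.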

How we built it

The project was built using a full-stack architecture:

Backend (Flask/Python): Handles API endpoints for video processing. Integrates with the YouTube Transcript API for transcript extraction, pytube for metadata, and Google's Generative AI (Gemini) for summarization. Uses yt-dlp for playlist processing. Environment variables manage API keys securely.
Frontend (React/JavaScript): A responsive UI built with React, featuring forms for URL input, progress indicators, and tabbed displays for results. Styled with Tailwind CSS for a modern look.
AI Integration: Leverages Google's Gemini AI to parse transcripts and generate JSON-structured notes. Custom agents (video_agent, text_agent, summarizer_agent) modularize the logic for transcript retrieval, text cleaning, and summarization.
Deployment: Designed for local development with virtual environments and npm scripts. The app runs on Flask (port 5000) and React (port 3000).

Key technologies: Python 3.8+, Node.js 14+, Google Cloud APIs (Speech-to-Text, Generative AI), YouTube APIs, and libraries like yt-dlp and pytube.
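The agent-based split described above (transcript retrieval, text cleaning, summarization) can be sketched as a three-stage pipeline. The agent names come from the project description, but their signatures are assumptions: here `video_agent` and `summarizer_agent` are passed in as plain callables so the sketch runs without network access, and the cleaning rules in `text_agent` are illustrative rather than the project's actual logic. `run_pipeline` is a hypothetical helper.

```python
import re
from typing import Callable, Dict


def text_agent(raw_transcript: str) -> str:
    """Clean a transcript: drop bracketed cues such as [Music] and
    collapse whitespace. The cleaning rules are assumptions."""
    without_cues = re.sub(r"\[[^\]]*\]", " ", raw_transcript)
    return " ".join(without_cues.split())


def run_pipeline(
    url: str,
    video_agent: Callable[[str], str],
    summarizer_agent: Callable[[str], Dict],
) -> Dict:
    """Fetch, clean, then summarize. Both agents are injected so each
    stage can be swapped or tested on its own."""
    transcript = video_agent(url)
    cleaned = text_agent(transcript)
    if not cleaned:
        raise ValueError("Transcript is empty after cleaning")
    return summarizer_agent(cleaned)
```

In the real app, `video_agent` would presumably wrap the transcript library and `summarizer_agent` the Gemini call; injecting them keeps the pipeline itself free of external dependencies.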

Challenges we ran into

Several hurdles arose during development:

Google API Setup: Configuring GOOGLE_APPLICATION_CREDENTIALS and API keys was tricky, especially ensuring the service account JSON file path was correct and permissions were set. Staying within the Gemini API's rate limits and quotas also required careful error handling.
Transcript Extraction: Not all videos have transcripts, and some are blocked by YouTube's policies. Implementing fallbacks and error messages for rate limiting or IP blocking was essential.
AI Response Parsing: Gemini's responses needed cleaning (removing markdown code blocks) before they could be parsed into valid JSON. Edge cases such as malformed AI output required robust error handling.
Batch and Playlist Processing: Processing multiple videos asynchronously while capping batch sizes to prevent overload took careful management, and integrating yt-dlp for playlists added complexity.
Cross-Origin Issues: Flask needed CORS support so the React frontend could communicate with the backend.
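The AI response parsing step can be illustrated with a small helper that strips a surrounding markdown code fence and then parses the remainder as JSON. This is a sketch under assumptions: `parse_ai_json` is a hypothetical name, and it assumes the model wraps its whole reply in at most one fence, which may not cover every response the project saw.

```python
import json
import re
from typing import Any

# Build the fence marker programmatically to keep this example readable.
FENCE = "`" * 3

# Matches a reply that is entirely wrapped in one fence, with an optional
# language tag (e.g. "json") after the opening marker.
_FENCED = re.compile(
    r"^" + FENCE + r"[a-zA-Z0-9_-]*\s*\n(.*?)\n?" + FENCE + r"\s*$",
    re.DOTALL,
)


def parse_ai_json(raw: str) -> Any:
    """Strip an optional markdown fence, then parse the text as JSON.

    Raises ValueError with a readable message on malformed output so the
    caller can report the failure instead of crashing.
    """
    text = raw.strip()
    match = _FENCED.match(text)
    if match:
        text = match.group(1).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"AI response was not valid JSON: {exc}") from exc
```

Raising a single exception type for every malformed case lets the endpoint turn it into one consistent error message for the UI.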

Accomplishments that we're proud of

Fully Functional App: Successfully built and deployed a working full-stack application that processes YouTube content end-to-end.
AI-Powered Insights: Integrated advanced AI to generate not just summaries, but structured notes with quizzes and references, enhancing educational value.
Scalable Architecture: Modular agent-based design allows easy extension. Supports single, batch, and playlist processing with error tracking.
User-Friendly Interface: Clean, responsive UI that handles large outputs gracefully, with progress feedback and error displays.
Comprehensive Documentation: Detailed README with setup instructions, troubleshooting, and API docs.
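Batch processing with error tracking, as mentioned above, might look like the following. It is a minimal synchronous sketch, not the project's implementation: `process_batch` and the `process_video` callable are hypothetical names, and the result layout is an assumption. Only the limit of 10 videos comes from the project description.

```python
from typing import Callable, Dict, List

MAX_BATCH_SIZE = 10  # batch limit stated in the project description


def process_batch(
    urls: List[str],
    process_video: Callable[[str], Dict],
) -> Dict[str, List[Dict]]:
    """Process each URL independently, collecting successes and failures."""
    if not urls:
        raise ValueError("No URLs provided")
    if len(urls) > MAX_BATCH_SIZE:
        raise ValueError(
            f"Batch of {len(urls)} exceeds the limit of {MAX_BATCH_SIZE}"
        )

    results: List[Dict] = []
    errors: List[Dict] = []
    for url in urls:
        try:
            results.append({"url": url, "notes": process_video(url)})
        except Exception as exc:
            # One failed video (e.g. no transcript) should not abort the batch.
            errors.append({"url": url, "error": str(exc)})
    return {"results": results, "errors": errors}
```

Returning failures alongside successes lets the UI show partial results with per-video error messages.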

What we learned

This project deepened my understanding of AI integration in web apps, particularly within Google's ecosystem. I learned to handle API authentication securely, parse and clean AI-generated content, and manage asynchronous tasks in Flask. Handling errors from external APIs (YouTube, Google) taught me to design for failure. Full-stack development reinforced best practices in modular code, environment management, and user experience design. Working with video metadata and transcripts also highlighted the importance of data preprocessing for AI models.

What's next for YouTube Video Lecture Notes Generator

Future enhancements include:

Export Options: Add PDF/Word export for notes, with customizable templates.
User Accounts: Implement authentication to save and organize processed videos.
Advanced AI Features: Integrate more Gemini capabilities, like multi-language support or custom prompts.
Analytics Dashboard: Track usage, success rates, and user feedback.
Mobile App: Develop a React Native version for on-the-go learning.
Integration with LMS: API hooks for platforms like Moodle or Canvas.

The project could also explore open-source contributions, such as supporting more video platforms or improving AI accuracy through fine-tuning.

Built With

cors
flask
google-generative-ai
javascript
python
react
tailwind
transcript-api

Updates

Kasturi Shinde started this project — Nov 01, 2025 12:44 AM EDT
