🚀 About the Project – LearnLens AI
💡 Inspiration
Learning from YouTube is powerful—but often frustrating.
Imagine you’re watching a tutorial and suddenly get stuck. Maybe it’s a math concept, a coding step, or even something practical like building an Arduino car. The video keeps going, but your understanding stops. You rewind, search comments, or try another video—losing time and momentum.
We asked a simple question:
❓ What if you could just ask the video itself?
That idea became LearnLens AI — an intelligent system that turns passive video watching into an interactive learning experience.
🧠 What It Does
LearnLens AI acts like a real-time tutor for any YouTube video.
- Ask questions at any moment (e.g., “I didn’t understand at 8:37”)
- Get instant explanations using:
  - 📜 Transcripts or subtitles
  - 🎧 Audio understanding
  - 🖼️ Screenshot/frame analysis
- Learn actively with:
  - 🧩 Quizzes generated from the video
  - 📌 Key concepts and summaries
  - ⚡ Step-by-step breakdowns
For example:
If you're learning how to build an Arduino car and get confused about wiring in the middle of the video, LearnLens analyzes that exact moment—whether through subtitles, audio, or visual frames—and explains what to do next clearly and practically.
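To make the "exact moment" idea concrete, here is a minimal sketch of how a question like “I didn’t understand at 8:37” could be mapped to nearby transcript text. It is an illustration only, not the project's actual code: the `Cue` shape, the function names `parseTimestamp` and `contextAround`, and the 15-second default window are all assumptions.

```typescript
// Hypothetical shape of one transcript cue (an assumption, not the project's schema).
interface Cue {
  start: number; // seconds from the beginning of the video
  text: string;
}

// Find the first "m:ss" or "h:mm:ss" timestamp in a question and convert it to seconds.
// Returns null when the question contains no timestamp.
function parseTimestamp(question: string): number | null {
  const match = question.match(/\b(?:(\d{1,2}):)?(\d{1,2}):(\d{2})\b/);
  if (match === null) return null;
  const hours = match[1] !== undefined ? Number(match[1]) : 0;
  return hours * 3600 + Number(match[2]) * 60 + Number(match[3]);
}

// Join the text of every cue that starts within `windowSec` seconds of `at`.
function contextAround(cues: Cue[], at: number, windowSec = 15): string {
  return cues
    .filter((cue) => Math.abs(cue.start - at) <= windowSec)
    .map((cue) => cue.text)
    .join(" ");
}
```

The resulting text window is what would be handed to the model along with the user's question, so the answer stays tied to that point in the video rather than to the video as a whole.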
🛠️ How We Built It
We built LearnLens AI using a combination of modern AI and web technologies:
- Frontend: Clean, responsive UI for seamless interaction
- Backend: AI-powered processing pipeline
- AI Models: Multimodal analysis (text + audio + image)
- Video Processing:
  - Transcript & subtitle extraction
  - Audio-to-text fallback
  - Frame/screenshot analysis
- Chrome Extension: One-click “Analyze this moment” feature
The system intelligently switches between data sources:
$$ \text{Understanding} = f(\text{Transcript} + \text{Subtitles} + \text{Audio} + \text{Visual Frames}) $$
This lets the AI keep answering even when transcripts are unavailable.
📚 What We Learned
- Real-world AI must handle imperfect data
- Users want instant, contextual help, not just summaries
- Multimodal systems (video + audio + text) are far more powerful than text-only AI
- UX simplicity is just as important as AI capability
⚡ Challenges We Faced
- ❌ Many videos don’t have transcripts
- ❌ Auto-generated captions can be inaccurate
- ❌ Video restrictions block data access
- ❌ Understanding visuals (like wiring, diagrams) is complex
We solved this by building a fallback-first system:
- Subtitles → Audio → Screenshots → AI reasoning
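The chain above can be sketched as an ordered list of sources that are tried one after another until one returns usable text. This is a hedged illustration under stated assumptions, not the project's implementation: the `Source` and `Context` types and the `firstAvailable` function are hypothetical names, and each real source (subtitle lookup, speech-to-text, frame analysis) would be supplied by the caller.

```typescript
// A source tries to produce text context for one moment in a video.
// It resolves to null when it has nothing usable (for example, no captions exist).
// All names here are hypothetical and used for illustration only.
type Source = {
  name: string;
  load: (videoId: string, atSec: number) => Promise<string | null>;
};

interface Context {
  source: string; // which source produced the text
  text: string;
}

// Try each source in order and return the first non-empty result.
// A source that throws (e.g. access blocked) is treated like one that returned null.
async function firstAvailable(
  sources: Source[],
  videoId: string,
  atSec: number
): Promise<Context | null> {
  for (const source of sources) {
    try {
      const text = await source.load(videoId, atSec);
      if (text !== null && text.trim() !== "") {
        return { source: source.name, text };
      }
    } catch (_err) {
      // Ignore the failure and fall through to the next source.
    }
  }
  return null; // every source came back empty
}
```

Passing the sources as `[subtitles, audio, screenshots]` reproduces the order listed above, and returning the source name alongside the text lets the reasoning step know how much to trust the context it was given.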
🌍 Vision
LearnLens AI is not just a tool—it’s a shift in how people learn.
From:
🎥 Passive watching
To:
🤖 Interactive understanding
We believe this can revolutionize education globally, making every video an intelligent tutor and every learner empowered to ask, understand, and grow—instantly.
🚀 Final Thought
If you don’t understand something while learning, you shouldn’t have to stop.
With LearnLens AI, you just ask—and keep going.
Built With
- chrome-extension-apis
- cloud
- css
- deployment
- express.js
- firebase-(auth-&-db)
- gemini-api
- html
- javascript
- node.js
- react
- youtube-data-api