learnlens_ai

🚀 About the Project – LearnLens AI

💡 Inspiration

Learning from YouTube is powerful—but often frustrating.

Imagine you’re watching a tutorial and suddenly get stuck. Maybe it’s a math concept, a coding step, or even something practical like building an Arduino car. The video keeps going, but your understanding stops. You rewind, search comments, or try another video—losing time and momentum.

We asked a simple question:

❓ What if you could just ask the video itself?

That idea became LearnLens AI — an intelligent system that turns passive video watching into an interactive learning experience.

🧠 What It Does

LearnLens AI acts like a real-time tutor for any YouTube video.

- Ask questions at any moment (e.g., “I didn’t understand at 8:37”)
- Get instant explanations using:
  - 📜 Transcripts or subtitles
  - 🎧 Audio understanding
  - 🖼️ Screenshot/frame analysis
- Learn actively with:
  - 🧩 Quizzes generated from the video
  - 📌 Key concepts and summaries
  - ⚡ Step-by-step breakdowns
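A question like “I didn’t understand at 8:37” has to be mapped to a position in the video before anything else can happen. The snippet below is a minimal sketch of that step, not the project's actual code; `parseTimestamp` is a hypothetical helper name, and it assumes timestamps are written as `m:ss` or `h:mm:ss`.

```javascript
// Hypothetical helper (not from the project): find the first m:ss or h:mm:ss
// timestamp in a question and convert it to seconds.
// Returns null when the question contains no timestamp.
function parseTimestamp(question) {
  const match = question.match(/\b(?:(\d{1,2}):)?(\d{1,2}):([0-5]\d)\b/);
  if (!match) return null;
  const hours = match[1] ? Number(match[1]) : 0;
  const minutes = Number(match[2]);
  const seconds = Number(match[3]);
  return hours * 3600 + minutes * 60 + seconds;
}

console.log(parseTimestamp("I didn't understand at 8:37")); // 517
console.log(parseTimestamp("What is a motor driver?"));     // null
```

When no timestamp is found, a reasonable design is to fall back to the player's current position instead of asking the user to retype the question.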

For example:

If you're learning how to build an Arduino car and get confused about wiring in the middle of the video, LearnLens analyzes that exact moment—whether through subtitles, audio, or visual frames—and explains what to do next clearly and practically.
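One way to picture "analyzing that exact moment" from subtitles is to keep only the caption lines near the requested time. This is a sketch under assumptions: the caption shape `{ start, duration, text }` (in seconds), the function name `captionsAround`, and the 15-second window are all illustrative choices, not details from the project.

```javascript
// Assumed caption shape: { start: seconds, duration: seconds, text: string }.
// Returns the text of every caption that overlaps the window
// [atSeconds - windowSeconds, atSeconds + windowSeconds].
function captionsAround(captions, atSeconds, windowSeconds = 15) {
  const from = atSeconds - windowSeconds;
  const to = atSeconds + windowSeconds;
  return captions
    .filter((c) => c.start + c.duration >= from && c.start <= to)
    .map((c) => c.text)
    .join(" ");
}

// Made-up sample captions for an Arduino car tutorial.
const sampleCaptions = [
  { start: 500, duration: 5, text: "Connect the motor driver" },
  { start: 515, duration: 4, text: "to pins 9 and 10" },
  { start: 600, duration: 5, text: "Upload the sketch" },
];

console.log(captionsAround(sampleCaptions, 517));
// Connect the motor driver to pins 9 and 10
```

The resulting excerpt, rather than the whole transcript, would then be passed to the model along with the user's question.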

🛠️ How We Built It

We built LearnLens AI using a combination of modern AI and web technologies:

- Frontend: Clean, responsive UI for seamless interaction
- Backend: AI-powered processing pipeline
- AI Models: Multimodal analysis (text + audio + image)
- Video Processing:
  - Transcript & subtitle extraction
  - Audio-to-text fallback
  - Frame/screenshot analysis
- Chrome Extension: One-click “Analyze this moment” feature
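For the “Analyze this moment” feature, the extension needs to know which video is playing and where the playhead is. The sketch below shows one possible way to build that request; `buildMomentRequest` and the `{ videoId, atSeconds }` payload are hypothetical, and it assumes a standard watch-page URL with a `v` query parameter.

```javascript
// Hypothetical helper: turn a watch-page URL and a playback position into a
// small request object for the backend. Returns null for pages without a
// "v" query parameter.
function buildMomentRequest(pageUrl, currentTimeSeconds) {
  const videoId = new URL(pageUrl).searchParams.get("v");
  if (!videoId) return null;
  return { videoId, atSeconds: Math.floor(currentTimeSeconds) };
}

// In a content script this might be called as:
//   const video = document.querySelector("video");
//   const request = buildMomentRequest(location.href, video.currentTime);

console.log(buildMomentRequest("https://www.youtube.com/watch?v=abc123", 517.8));
// { videoId: 'abc123', atSeconds: 517 }
```

Keeping the function free of DOM access makes it easy to test outside the browser; only the two commented lines depend on the page.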

The system intelligently switches between data sources:

$$ \text{Understanding} = f(\text{Transcript} + \text{Subtitles} + \text{Audio} + \text{Visual Frames}) $$

This lets the AI still answer when transcripts are unavailable, by falling back to audio or visual frames.

📚 What We Learned

- Real-world AI must handle imperfect data
- Users want instant, contextual help, not just summaries
- Multimodal systems (video + audio + text) are far more powerful than text-only AI
- UX simplicity is just as important as AI capability

⚡ Challenges We Faced

❌ Many videos don’t have transcripts
❌ Auto-generated captions can be inaccurate
❌ Video restrictions block data access
❌ Understanding visuals (like wiring, diagrams) is complex

We solved this by building a fallback-first system:

Subtitles → Audio → Screenshots → AI reasoning
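The chain above can be sketched as a loop that tries each source in order and stops at the first one that yields usable text, which is then handed to the AI reasoning step. This is an illustrative sketch only: `firstAvailableContext` and the stub extractors are hypothetical stand-ins for the real subtitle, audio, and screenshot pipelines.

```javascript
// Try each source in order; return the first non-empty result.
// Each source is { name, extract }, where extract() returns (a promise of) text.
async function firstAvailableContext(sources) {
  for (const { name, extract } of sources) {
    try {
      const text = await extract();
      if (text && text.trim().length > 0) return { source: name, text };
    } catch (err) {
      // A failed source is treated like a missing one; move on to the next.
    }
  }
  return { source: "none", text: "" };
}

// Stub extractors standing in for the real pipelines (assumptions).
const demoSources = [
  { name: "subtitles", extract: async () => "" }, // no captions on this video
  { name: "audio", extract: async () => { throw new Error("access blocked"); } },
  { name: "screenshots", extract: async () => "Red wire goes to the 5V pin" },
];

firstAvailableContext(demoSources).then((result) => console.log(result));
// { source: 'screenshots', text: 'Red wire goes to the 5V pin' }
```

Returning the source name alongside the text lets the final answer say how confident it should be, since a frame description is usually weaker evidence than a transcript.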

🌍 Vision

LearnLens AI is not just a tool—it’s a shift in how people learn.

From:

🎥 Passive watching

To:

🤖 Interactive understanding

We believe this can revolutionize education globally, making every video an intelligent tutor and every learner empowered to ask, understand, and grow—instantly.

🚀 Final Thought

If you don’t understand something while learning, you shouldn’t have to stop.

With LearnLens AI, you just ask—and keep going.

Built With

chrome-extension-apis
cloud
css
deployment
express.js
firebase-(auth-&-db)
gemini-api
html
javascript
node.js
react
youtube-data-api

Updates

Pratham Revankar started this project — Mar 30, 2026 04:37 AM EDT
