Inspiration
The Problem
Every semester, millions of students sit through hours of lectures, frantically scribbling notes while trying to absorb complex concepts. By the time exams arrive, their notes are incomplete, disorganized, or worse—nonexistent.
Sound familiar? You've probably been there:
- Rewatching 2-hour lectures at 2x speed
- Pausing every 30 seconds to capture slide content
- Manually creating study materials that should have been generated automatically
- Spending 4-6 hours per lecture on taking notes, reviewing, and building study materials
Meanwhile, educators struggle to provide personalized study resources for diverse learning styles and difficulty levels.
Our Solution
We built LectureIQ to transform this broken learning workflow. By leveraging Gemini's advanced multimodal reasoning, we turn any lecture into an interactive mini-course—automatically generating structured notes, intelligent flashcards, and adaptive assessments in minutes.
What It Does
LectureIQ transforms passive lecture consumption into active mastery. Here's the complete user journey:
Upload & Process
Students upload lecture videos (MP4, AVI, MOV, WebM) and optional slide decks (PDF). The system extracts the audio track, generates a timestamped transcript, and pulls the content of each slide. Temporary files are cleaned up automatically after processing.
AI-Powered Analysis
Gemini analyzes the entire lecture context using multimodal long-context capabilities:
- Video transcript with speaker annotations and timestamps
- Slide images showing diagrams, equations, and visual content
- Multimodal understanding of relationships between spoken, visual, and written content
Unlike simple transcription tools, LectureIQ understands the full context of what's being taught.
Intelligent Content Generation
The system produces three interconnected study materials:
📝 Structured Notes
- Hierarchical outlines with clear headings and subheadings
- Key concepts, definitions, and examples from audio and visual content
- Visual descriptions of diagrams and equations
- Context-aware explanations referencing slide content
- Tagged key concepts for quick navigation
🎴 Active Recall Flashcards
- Concept-based cards following cognitive science principles
- Varying question types: definitions, applications, comparisons, processes
- Difficulty scaling: beginner, intermediate, advanced
- Linked to source sections for review
✅ Adaptive Multiple Choice Quizzes
- Understanding-focused questions, not memorization tests
- Four plausible options with detailed explanations
- Common misconceptions used as distractors
- Difficulty levels aligned with Bloom's taxonomy
How We Built It
Frontend Architecture
- Next.js 14 with TypeScript for modern, type-safe React
- Tailwind CSS for responsive, beautiful UI components
- Zustand for efficient state management
- React Query for optimized API calls and caching
- IndexedDB for browser-based data persistence
- Custom components: flashcard flip animations, quiz interface, note viewer
Backend Architecture
- FastAPI (Python 3.10+) for high-performance async API endpoints
- PyAV 16.1.0 for robust video/audio extraction with embedded FFmpeg
- OpenAI Whisper for 99%+ accurate speech-to-text transcription
- PyPDF2 & Pillow for slide extraction and image processing
- Python-dotenv for secure environment configuration
AI Integration (The Core Innovation)
We use the Gemini API as the reasoning engine, relying on its multimodal capabilities. The system pairs transcript segments with their slide images and generates structured educational content through a chain of specialized prompts (notes → flashcards → MCQs). A quality validation step then checks the output to reduce hallucinations.
Key Technical Decisions:
- Chunking strategy: Split lectures into 5-10 minute segments for optimal context
- Prompt chaining: Separate specialized prompts for each content type
- Quality validation: Post-processing filters ensure accuracy
- Temperature optimization: Set to 0.3 for consistency and reliability
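The chunking strategy above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Segment` type, the `chunk_segments` function, and the 600-second default (the upper end of the 5-10 minute range) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped transcript segment (hypothetical shape)."""
    start: float  # seconds from the beginning of the lecture
    end: float
    text: str


def chunk_segments(segments, max_seconds=600.0):
    """Group consecutive segments into chunks spanning at most max_seconds.

    A single segment longer than max_seconds still becomes its own chunk.
    """
    chunks = []
    current = []
    for seg in segments:
        # Close the current chunk if adding this segment would exceed the limit.
        if current and seg.end - current[0].start > max_seconds:
            chunks.append(current)
            current = []
        current.append(seg)
    if current:
        chunks.append(current)
    return chunks
```

Splitting on segment boundaries rather than at fixed offsets keeps each sentence intact within one chunk, which matters when each chunk is sent to the model separately.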
Challenges We Ran Into
1. Transcript-Slide Synchronization
Problem: Lectures often don't follow slides linearly. Speakers jump back, reference earlier slides, or display one slide while discussing another topic.
Solution: A hybrid approach that combines time-based initial alignment, visual similarity scoring to refine the match, a manual override for complex lectures, and contextual hints that help Gemini pair segments with the right slides.
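A rough sketch of the first two steps of that hybrid approach is shown below. It is an illustration under stated assumptions: slide display times (`slide_starts`) and extracted slide text (`slide_texts`) are assumed to be available, all function names are hypothetical, and a simple word-overlap score stands in for the visual similarity scoring described above.

```python
from bisect import bisect_right


def align_by_time(segment_start, slide_starts):
    """Index of the slide on screen at segment_start (slide_starts sorted ascending)."""
    return max(bisect_right(slide_starts, segment_start) - 1, 0)


def token_overlap(a, b):
    """Jaccard overlap between the lowercase word sets of two strings."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def align_segment(segment_start, segment_text, slide_starts, slide_texts, margin=0.2):
    """Start from the time-based guess and switch slides only on a clear win.

    Another slide replaces the guess only if its overlap score beats the
    guess's score by more than `margin` (an arbitrary illustrative threshold).
    """
    guess = align_by_time(segment_start, slide_starts)
    scores = [token_overlap(segment_text, text) for text in slide_texts]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] - scores[guess] > margin else guess
```

Requiring a margin keeps the time-based guess as the default, so a speaker who briefly mentions an earlier topic does not flip the alignment on weak evidence.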
2. Prompt Engineering for Quality
Problem: Early iterations produced flashcards with one-word answers or MCQs with obvious wrong answers.
Solution: Few-shot prompting with 3-4 examples of excellent content, explicit constraint reinforcement, testing on 50+ diverse lectures across subjects, and quality scoring with post-generation validation checks.
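The post-generation validation checks might look something like the sketch below. The dictionary keys (`question`, `answer`, `options`, `correct_index`, `explanation`) and the specific rules are assumptions chosen to match the failure modes described above, not the project's real schema.

```python
def flashcard_issues(card: dict) -> list[str]:
    """Return a list of problems found in one generated flashcard."""
    issues = []
    if not card.get("question", "").strip():
        issues.append("missing question")
    # One-word answers were a known failure mode of early prompts.
    if len(card.get("answer", "").split()) < 2:
        issues.append("answer is a single word or empty")
    return issues


def mcq_issues(mcq: dict) -> list[str]:
    """Return a list of problems found in one generated multiple-choice question."""
    issues = []
    options = [opt.strip().lower() for opt in mcq.get("options", [])]
    if len(options) != 4:
        issues.append("expected exactly four options")
    if len(set(options)) != len(options):
        issues.append("duplicate options")
    index = mcq.get("correct_index")
    if not isinstance(index, int) or not 0 <= index < len(options):
        issues.append("correct_index out of range")
    if not mcq.get("explanation", "").strip():
        issues.append("missing explanation")
    return issues
```

Items with a non-empty issue list could be dropped or sent back for regeneration. Checks like these catch structural problems only; judging whether a distractor is plausible still needs a model or a human.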
3. Hallucination Prevention
Problem: Gemini occasionally "filled in gaps" with reasonable-sounding but incorrect information.
Solution: Temperature lowered to 0.3 for consistency, explicit constraints ("Only use information directly stated in transcript or slides"), required source attribution so that every claim cites its origin, and a human-in-the-loop option that lets users flag and regenerate sections.
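One way to enforce source attribution is to check each cited quote against the transcript and slide text. The sketch below assumes the model returns claims as dictionaries with a `quote` field; that shape and the function names are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    without_punct = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", without_punct).strip()


def unsupported_claims(claims: list[dict], source_text: str) -> list[dict]:
    """Return the claims whose cited quote does not appear in the source.

    A claim with no quote, or a quote that is empty after normalization,
    is treated as unsupported.
    """
    source = normalize(source_text)
    flagged = []
    for claim in claims:
        quote = normalize(claim.get("quote") or "")
        if not quote or quote not in source:
            flagged.append(claim)
    return flagged
```

Exact substring matching is deliberately strict: it will flag paraphrased citations too, which fits a flag-and-regenerate workflow better than letting unverified claims through.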
4. Processing Time Optimization
Problem: Initial implementation took 12-15 minutes per hour of lecture.
Solution: Parallel processing of multiple segments, batched Gemini API calls, intermediate result caching. Result: Reduced to 3-5 minutes per lecture hour.
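The combination of parallel segment processing and intermediate caching can be sketched with `asyncio`. Everything here is illustrative: `generate` is a hypothetical coroutine standing in for the Gemini call, the cache is a plain dictionary, and the concurrency limit of 4 is an arbitrary choice.

```python
import asyncio
import hashlib


async def process_chunks(chunks, generate, cache, max_concurrency=4):
    """Process text chunks concurrently, reusing cached results.

    `generate` is any coroutine function that takes a chunk and returns a
    result. Results come back in the same order as `chunks`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def process_one(chunk):
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key in cache:
            return cache[key]
        # Limit how many model calls are in flight at once.
        async with semaphore:
            result = await generate(chunk)
        cache[key] = result
        return result

    return await asyncio.gather(*(process_one(chunk) for chunk in chunks))
```

Keying the cache on a hash of the chunk text means that reprocessing a lecture after a failure only pays for the chunks that never finished. Identical chunks submitted in the same batch may still each trigger a call, since the cache is only filled after a call completes.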
Accomplishments That We're Proud Of
Deep Gemini Integration
We didn't just call an API—we architected the entire application around Gemini's unique multimodal capabilities. Every feature showcases what makes Gemini special: understanding visual + textual context simultaneously, maintaining coherence over long documents, and generating structured educational content.
Cross-Domain Validation
LectureIQ successfully processes lectures across diverse subjects:
- Mathematics: Calculus with complex equations
- Computer Science: Algorithms with code examples
- Humanities: History with timeline slides
The system consistently generates high-quality, domain-appropriate study materials.
Educational Science Integration
Our content generation follows research-backed principles: active recall, desirable difficulty scaling, Bloom's taxonomy question types, and misconception-based distractors.
Complete Learning Pipeline
We built the entire journey from start to finish: Upload → Process → Notes → Flashcards → Quizzes → Export.
Zero Server Database - Pure Browser Storage
LectureIQ uses IndexedDB for client-side storage, eliminating server database complexity while maintaining full functionality. Generated study materials stay in the user's browser, which helps with data privacy, gives instant access, and means there is no database server to run or maintain.
What We Learned
Multimodal Prompt Engineering is an Art
Effective prompts need to set spatial context for images, balance different modalities (text, images, audio transcripts), and guide visual reasoning with clarity.
Long-Context Reasoning Requires Structure
Working with lengthy lectures taught us to provide clear section markers, build context incrementally, use structured output formats (JSON), and break extremely long lectures into semantic chapters.
Educational Content Generation is Hard
Creating good study materials requires domain awareness (math ≠ history), difficulty calibration (beginner to advanced), question diversity (definitions, applications, comparisons), and understanding pedagogical principles.
What's Next for LectureIQ
Spaced Repetition System
Integrate scientifically optimized review schedules using the SM-2 algorithm for better long-term retention.
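For reference, one common formulation of SM-2 is sketched below. This feature is planned rather than built, so the `CardState` type and `review` function are purely illustrative. Variants of SM-2 differ on whether the ease factor changes after a failed review; this sketch leaves it unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardState:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 "E-Factor", never allowed below 1.3


def review(state: CardState, quality: int) -> CardState:
    """Apply one review with a self-graded quality score from 0 to 5."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the repetition sequence, keep the ease factor.
        return CardState(0, 1, state.ease)
    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = round(state.interval_days * state.ease)
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    ease = max(1.3, state.ease + delta)
    return CardState(state.repetitions + 1, interval, ease)
```

Each flashcard would carry its own `CardState`, and the quiz or flashcard interface would supply the quality score after every review.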
Collaborative Study Packs
Allow students to share generated materials, combine multiple lectures into cohesive courses, and create class-wide knowledge bases.
Real-Time Lecture Assistance
Live processing during lectures, with real-time transcription display and instant key point extraction as the lecture progresses.
Multi-Language Support
Leverage Gemini's multilingual capabilities to transcribe and generate study materials in 100+ languages.
LMS Integration
Build plugins for Canvas, Moodle, Blackboard, and Google Classroom with automatic lecture import.
Learning Analytics Dashboard
For educators to track which concepts students find challenging, identify gaps in lecture explanations, and generate data-driven curriculum improvements.
Why LectureIQ Matters:
- For Students: Save 4+ hours per lecture, get structured AI-generated study aids immediately, study smarter with scientifically backed learning tools
- For Educators: Provide consistent study materials, support diverse learning styles, identify comprehension gaps
- For Education: Democratize access to quality study materials, make personalized learning scalable, reduce inequities in educational resources
Stats: End-to-end AI pipeline • Production-ready with security best practices • 3-5 minutes processing per lecture hour • 99%+ transcription accuracy with Whisper • Zero server database - pure browser storage • Automatic cleanup of temporary files
Built with care for the Gemini 3 Hackathon | Powered by PyAV · Whisper · Gemini API
Built With
- fastapi
- gemini-api
- indexeddb
- javascript
- next.js
- pyav
- python
- react
- tailwind-css
- typescript
- whisper
- zustand