Inspiration

The Problem

Every semester, millions of students sit through hours of lectures, frantically scribbling notes while trying to absorb complex concepts. By the time exams arrive, their notes are incomplete, disorganized, or worse—nonexistent.

Sound familiar? You've probably been there:

  • Rewatching 2-hour lectures at 2x speed
  • Pausing every 30 seconds to capture slide content
  • Manually creating study materials that should have been generated automatically
  • Spending 4-6 hours per lecture on note-taking, review, and creating study materials

Meanwhile, educators struggle to provide personalized study resources for diverse learning styles and difficulty levels.

Our Solution

We built LectureIQ to transform this broken learning workflow. By leveraging Gemini's advanced multimodal reasoning, we turn any lecture into an interactive mini-course—automatically generating structured notes, intelligent flashcards, and adaptive assessments in minutes.

What It Does

LectureIQ transforms passive lecture consumption into active mastery. Here's the complete user journey:

Upload & Process

Students upload lecture videos (MP4, AVI, MOV, WebM) and optional slide decks (PDF). The system extracts the audio track, generates a timestamped transcript, and pulls the content of each slide. Temporary files are cleaned up automatically after processing.
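As a rough illustration of the upload step, the sketch below sorts incoming files by extension into the formats listed above. The names `VIDEO_EXTENSIONS`, `SLIDE_EXTENSIONS`, and `classify_upload` are made up for this example and are not taken from the LectureIQ codebase. Checking the extension is only a first filter; a real service would also inspect the file contents.

```python
from pathlib import Path

# Formats named in this write-up. The constant and function names are
# illustrative assumptions, not the project's actual identifiers.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".webm"}
SLIDE_EXTENSIONS = {".pdf"}


def classify_upload(filename: str) -> str:
    """Return "video" or "slides", or raise ValueError for other files."""
    suffix = Path(filename).suffix.lower()
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    if suffix in SLIDE_EXTENSIONS:
        return "slides"
    raise ValueError(f"Unsupported upload type: {filename!r}")
```

For example, `classify_upload("Lecture01.MP4")` returns `"video"`, while `classify_upload("notes.txt")` raises `ValueError`.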

AI-Powered Analysis

Gemini analyzes the entire lecture context using multimodal long-context capabilities:

  • Video transcript with speaker annotations and timestamps
  • Slide images showing diagrams, equations, and visual content
  • Multimodal understanding of relationships between spoken, visual, and written content

Unlike simple transcription tools, LectureIQ understands the full context of what's being taught.

Intelligent Content Generation

The system produces three interconnected study materials:

📝 Structured Notes

  • Hierarchical outlines with clear headings and subheadings
  • Key concepts, definitions, and examples from audio and visual content
  • Visual descriptions of diagrams and equations
  • Context-aware explanations referencing slide content
  • Tagged key concepts for quick navigation

🎴 Active Recall Flashcards

  • Concept-based cards following cognitive science principles
  • Varying question types: definitions, applications, comparisons, processes
  • Difficulty scaling: beginner, intermediate, advanced
  • Linked to source sections for review

✅ Adaptive Multiple Choice Quizzes

  • Understanding-focused questions, not memorization tests
  • Four plausible options with detailed explanations
  • Covers common misconceptions as distractors
  • Difficulty levels aligned with Bloom's taxonomy

How We Built It

Frontend Architecture

  • Next.js 14 with TypeScript for modern, type-safe React
  • Tailwind CSS for responsive, beautiful UI components
  • Zustand for efficient state management
  • React Query for optimized API calls and caching
  • IndexedDB for browser-based data persistence
  • Custom components: flashcard flip animations, quiz interface, note viewer

Backend Architecture

  • FastAPI (Python 3.10+) for high-performance async API endpoints
  • PyAV 16.1.0 for robust video/audio extraction, using the FFmpeg libraries bundled with the package
  • OpenAI Whisper for 99%+ accurate speech-to-text transcription
  • PyPDF2 & Pillow for slide extraction and image processing
  • Python-dotenv for secure environment configuration

AI Integration (The Core Innovation)

We use the Gemini API as the reasoning engine with multimodal capabilities. The system processes transcript segments aligned with slide images and generates structured educational content through a chain of specialized prompts (notes → flashcards → MCQs), followed by quality validation to reduce hallucinations.
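The notes → flashcards → MCQs chain can be pictured roughly as follows. This is a minimal sketch, not the project's actual code: the three prompt templates are placeholders (the real prompts are not shown here), and `generate` stands in for whatever function sends a prompt to the Gemini API and returns its text reply.

```python
from typing import Callable

# Hypothetical prompt templates; the real prompts are not part of this write-up.
NOTES_PROMPT = "Write structured notes for this lecture segment:\n{source}"
FLASHCARD_PROMPT = "Write active-recall flashcards based only on these notes:\n{source}"
MCQ_PROMPT = "Write four-option multiple choice questions based only on these notes:\n{source}"


def run_chain(segment_text: str, generate: Callable[[str], str]) -> dict[str, str]:
    """Run notes -> flashcards -> MCQs, feeding the notes into the later steps."""
    notes = generate(NOTES_PROMPT.format(source=segment_text))
    flashcards = generate(FLASHCARD_PROMPT.format(source=notes))
    mcqs = generate(MCQ_PROMPT.format(source=notes))
    return {"notes": notes, "flashcards": flashcards, "mcqs": mcqs}
```

Passing the model call in as `generate` keeps the chain testable with a stub and independent of any particular client library.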

Key Technical Decisions:

  • Chunking strategy: Split lectures into 5-10 minute segments for optimal context
  • Prompt chaining: Separate specialized prompts for each content type
  • Quality validation: Post-processing checks filter out generated content that fails validation
  • Temperature: Set to 0.3 for more consistent, repeatable output
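The chunking decision above could be implemented along these lines. It is a sketch under assumptions: the `Segment` type and `chunk_segments` function are hypothetical, transcript segments are assumed to arrive in time order, and only the 10-minute upper bound is enforced (a greedy packer like this one tends to fill chunks close to that limit).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the beginning of the lecture
    end: float
    text: str


def chunk_segments(
    segments: list[Segment], max_seconds: float = 600.0
) -> list[list[Segment]]:
    """Greedily pack consecutive segments into chunks of at most max_seconds."""
    chunks: list[list[Segment]] = []
    current: list[Segment] = []
    for seg in segments:
        # Close the current chunk if adding this segment would exceed the limit.
        if current and seg.end - current[0].start > max_seconds:
            chunks.append(current)
            current = []
        current.append(seg)
    if current:
        chunks.append(current)
    return chunks
```

Splitting only on segment boundaries avoids cutting a sentence in half, at the cost of chunks that vary slightly in length.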

Challenges We Ran Into

1. Transcript-Slide Synchronization

Problem: Lectures often don't follow slides linearly. Speakers jump back, reference earlier slides, or display one slide while discussing another topic.

Solution: A hybrid approach that combines time-based initial alignment, visual similarity scoring to refine the matches, a manual override option for complex lectures, and contextual hints that help Gemini pair transcript segments with slides.
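The first and third parts of that approach, time-based alignment and manual override, might look like the sketch below. It assumes the time at which each slide first appears is already known and sorted; the function names and the `overrides` mapping are hypothetical. Visual similarity scoring and the Gemini hints are not shown.

```python
from bisect import bisect_right
from typing import Optional


def slide_for_time(slide_starts: list[float], t: float) -> int:
    """Index of the slide on screen at time t, given sorted slide start times."""
    return max(bisect_right(slide_starts, t) - 1, 0)


def align(
    segment_starts: list[float],
    slide_starts: list[float],
    overrides: Optional[dict[int, int]] = None,
) -> list[int]:
    """Map each transcript segment to a slide index.

    overrides maps a segment index to a manually chosen slide index and
    takes precedence over the time-based guess.
    """
    overrides = overrides or {}
    return [
        overrides.get(i, slide_for_time(slide_starts, t))
        for i, t in enumerate(segment_starts)
    ]
```

With slides starting at 0, 120, and 300 seconds, segments starting at 5, 119.9, 120, and 400 seconds map to slides `[0, 0, 1, 2]`; an override of `{3: 0}` sends the last segment back to the first slide.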

2. Prompt Engineering for Quality

Problem: Early iterations produced flashcards with one-word answers or MCQs with obvious wrong answers.

Solution: Few-shot prompting with 3-4 examples of excellent content, explicit constraint reinforcement, and quality scoring with post-generation validation checks. We tested the prompts on 50+ diverse lectures across subjects.
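Post-generation checks of this kind can be quite simple. The sketch below flags the two failure modes described above that are mechanically detectable; the field names (`question`, `answer`, `options`, `correct_index`, `explanation`) are assumed for illustration and may not match the project's schema. Whether a distractor is "obviously wrong" still needs a model or a human to judge.

```python
def flashcard_problems(card: dict) -> list[str]:
    """Return a list of problems found in a generated flashcard."""
    problems = []
    if not card.get("question", "").strip():
        problems.append("question is empty")
    if len(card.get("answer", "").split()) < 2:
        problems.append("answer is a single word or empty")
    return problems


def mcq_problems(mcq: dict) -> list[str]:
    """Return a list of structural problems found in a generated MCQ."""
    problems = []
    options = [o.strip().lower() for o in mcq.get("options", [])]
    if len(options) != 4:
        problems.append("expected exactly four options")
    if len(set(options)) != len(options):
        problems.append("duplicate options")
    if mcq.get("correct_index") not in range(len(options)):
        problems.append("correct_index out of range")
    if not mcq.get("explanation", "").strip():
        problems.append("missing explanation")
    return problems
```

An item with a non-empty problem list can be dropped or sent back for regeneration.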

3. Hallucination Prevention

Problem: Gemini occasionally "filled in gaps" with reasonable-sounding but incorrect information.

Solution: A temperature of 0.3 for consistency, explicit constraints ("Only use information directly stated in transcript or slides"), required source attribution so that generated content cites the lecture, and a human-in-the-loop option that lets users flag and regenerate sections.
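One way to make source attribution checkable is to ask the model to attach a verbatim quote to each generated item and then confirm the quote really occurs in the lecture material. The sketch below assumes a hypothetical `source_quote` field on each item; it is an illustration of the idea, not the project's validation code.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so minor formatting differences match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unsupported_items(items: list[dict], source_text: str) -> list[dict]:
    """Return items whose cited quote is missing or not found in the source."""
    haystack = normalize(source_text)
    flagged = []
    for item in items:
        quote = normalize(item.get("source_quote", ""))
        if not quote or quote not in haystack:
            flagged.append(item)
    return flagged
```

A substring match only proves the quote exists, not that the claim follows from it, so flagged items are best treated as candidates for regeneration or user review.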

4. Processing Time Optimization

Problem: Initial implementation took 12-15 minutes per hour of lecture.

Solution: Parallel processing of multiple segments, batched Gemini API calls, and intermediate result caching. Result: Reduced to 3-5 minutes per lecture hour.
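Parallel segment processing with result caching can be sketched with `asyncio`. Everything here is assumed for illustration: `call_model` stands in for an async function that sends one chunk to the model, the cache is a plain dictionary keyed by a hash of the chunk text, and the concurrency limit of 4 is arbitrary. Request batching is not shown.

```python
import asyncio
import hashlib
from typing import Awaitable, Callable


async def process_chunks(
    chunks: list[str],
    call_model: Callable[[str], Awaitable[str]],
    cache: dict[str, str],
    max_concurrency: int = 4,
) -> list[str]:
    """Process chunks concurrently, reusing cached results where available."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def process(chunk: str) -> str:
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key in cache:
            return cache[key]
        # Limit the number of model calls in flight at once.
        async with semaphore:
            result = await call_model(chunk)
        cache[key] = result
        return result

    # gather preserves the input order of the chunks in its results.
    return list(await asyncio.gather(*(process(c) for c in chunks)))
```

Calling `asyncio.run(process_chunks(chunks, call_model, cache))` a second time with the same chunks and cache returns the stored results without calling the model again.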

Accomplishments That We're Proud Of

Deep Gemini Integration

We didn't just call an API—we architected the entire application around Gemini's unique multimodal capabilities. Every feature showcases what makes Gemini special: understanding visual + textual context simultaneously, maintaining coherence over long documents, and generating structured educational content.

Cross-Domain Validation

LectureIQ successfully processes lectures across diverse subjects:

  • Mathematics: Calculus with complex equations
  • Computer Science: Algorithms with code examples
  • Humanities: History with timeline slides

The system consistently generates high-quality, domain-appropriate study materials.

Educational Science Integration

Our content generation follows research-backed principles: active recall, difficulty scaled to keep practice challenging, question types based on Bloom's taxonomy, and misconception-based distractors.

Complete Learning Pipeline

We built the entire journey from start to finish: Upload → Process → Notes → Flashcards → Quizzes → Export.

Zero Server Database - Pure Browser Storage

LectureIQ uses IndexedDB for client-side storage, eliminating server database complexity while maintaining full functionality. Because generated materials stay in the user's browser, this approach keeps data private, makes it available instantly, and leaves no database infrastructure to run.

What We Learned

Multimodal Prompt Engineering is an Art

Effective prompts need to set spatial context for images, balance different modalities (text, images, audio transcripts), and guide visual reasoning with clarity.

Long-Context Reasoning Requires Structure

Working with lengthy lectures taught us to provide clear section markers, build context incrementally, use structured output formats (JSON), and break extremely long lectures into semantic chapters.
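Structured JSON output is only useful if the reply is validated before it reaches the UI. The sketch below parses a flashcard reply and checks its shape. The required keys are an assumed schema, and the fence-stripping step is an assumption about how a model reply might be wrapped rather than documented behavior.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker
REQUIRED_KEYS = {"question", "answer", "difficulty"}  # assumed schema


def strip_code_fence(text: str) -> str:
    """Remove a surrounding Markdown code fence if the reply has one."""
    text = text.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        text = "\n".join(text.splitlines()[1:-1])
    return text


def parse_flashcards(reply: str) -> list[dict]:
    """Parse a model reply into flashcards, raising ValueError on bad shape."""
    data = json.loads(strip_code_fence(reply))
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of flashcards")
    for card in data:
        if not isinstance(card, dict) or not REQUIRED_KEYS.issubset(card):
            raise ValueError(f"flashcard missing required keys: {card!r}")
    return data
```

Since `json.JSONDecodeError` is a subclass of `ValueError`, callers can catch a single exception type and retry the generation.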

Educational Content Generation is Hard

Creating good study materials requires domain awareness (math ≠ history), difficulty calibration (beginner to advanced), question diversity (definitions, applications, comparisons), and understanding pedagogical principles.

What's Next for LectureIQ

Spaced Repetition System

Integrate scientifically optimized review schedules using the SM-2 algorithm for better long-term retention.
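For reference, one common formulation of the SM-2 update is sketched below. This feature is not built yet, so `CardState` and `sm2_review` are hypothetical names, and published variants of SM-2 differ in small details such as whether the interval uses the ease factor before or after it is updated.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 "easiness factor", never below 1.3


def sm2_review(state: CardState, quality: int) -> CardState:
    """Return the next state after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the repetition sequence, keep the ease factor.
        return CardState(repetitions=0, interval_days=1, ease=state.ease)
    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = round(state.interval_days * state.ease)
    # Standard SM-2 ease adjustment, clamped at 1.3.
    ease = state.ease + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return CardState(state.repetitions + 1, interval, max(ease, 1.3))
```

Three perfect reviews of a new card produce intervals of 1, 6, and 16 days; a failed review resets the interval to 1 day.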

Collaborative Study Packs

Allow students to share generated materials, combine multiple lectures into cohesive courses, and create class-wide knowledge bases.

Real-Time Lecture Assistance

Process lectures live, with real-time transcription display and instant key point extraction as the lecture progresses.

Multi-Language Support

Leverage Gemini's multilingual capabilities to transcribe and generate study materials in 100+ languages.

LMS Integration

Build plugins for Canvas, Moodle, Blackboard, and Google Classroom with automatic lecture import.

Learning Analytics Dashboard

For educators to track which concepts students find challenging, identify gaps in lecture explanations, and generate data-driven curriculum improvements.


Why LectureIQ Matters:

  • For Students: Save 4+ hours per lecture, get structured AI-generated study aids immediately, study smarter with scientifically backed learning tools
  • For Educators: Provide consistent study materials, support diverse learning styles, identify comprehension gaps
  • For Education: Democratize access to quality study materials, make personalized learning scalable, reduce inequities in educational resources

Stats: End-to-end AI pipeline • Production-ready with security best practices • 3-5 minutes processing per lecture hour • 99%+ transcription accuracy with Whisper • Zero server database - pure browser storage • Automatic cleanup of temporary files

Built with care for the Gemini 3 Hackathon | Powered by PyAV · Whisper · Gemini API
