Inspiration

Every student knows the feeling — it's 2AM, you're watching a recorded lecture, and you hit a concept you don't understand. You pause the video. You have no one to ask. You either give up, or spend an hour Googling answers that weren't written for your course. We built Course Assistant AI because that moment of confusion shouldn't have to wait until office hours.

What it does

Course Assistant AI is a voice-powered study companion that lives inside your lecture player. Upload your course materials — PDFs, slides, and lecture videos — and the system ingests everything into a unified knowledge base. When you hit a confusing moment, press the mic button and speak your question out loud. The system:

  • Extracts the transcript window around your exact timestamp (±5 minutes)
  • Retrieves the most relevant chunks from your PDFs and slides using semantic search
  • Sends everything to an LLM that answers only from your course content
  • Returns an answer with a confidence score $c \in [0, 1]$
  • If $c < 0.70$, automatically drafts a structured email to your TA and sends it

No more generic answers. No more unanswered doubts.

How we built it

We built a hybrid retrieval pipeline combining timestamp-anchored extraction and RAG (Retrieval Augmented Generation):

Ingestion layer

  • PyMuPDF extracts text page-by-page from PDFs
  • python-pptx extracts slide content from PowerPoint files
  • Whisper transcribes lecture videos with word-level timestamps

Retrieval layer

  • Course documents are chunked into 512-token pieces with 50-token overlap
  • Each chunk is embedded using sentence-transformers (all-MiniLM-L6-v2) running entirely locally — no API calls, no cost
  • Embeddings are stored in ChromaDB for fast similarity search
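
The chunking step above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `chunk_tokens` is hypothetical, and a plain list of strings stands in for the output of whichever tokenizer the pipeline really uses.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=50):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the document.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Assumption: synthetic tokens stand in for a tokenized course document.
tokens = [f"w{i}" for i in range(1200)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding a few tokens twice.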

Reasoning layer

  • At query time, the student's question and current video timestamp $T$ trigger two retrievals: the transcript slice covering $T \pm 300\,\text{s}$ and the top-$k$ semantic chunks from the course documents
  • The combined context is passed to llama-3.3-70b-versatile via the Groq API
  • The model is instructed to output a structured JSON response:

      {
        "answer": "...",
        "confidence": 0.87,
        "confidence_reason": "Directly covered in lecture at 44:30",
        "sources": ["week3.pdf", "lecture_transcript"]
      }
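
As a rough sketch of the timestamp-anchored half of this step, the code below selects every transcript segment that overlaps the ±300 s window and joins it with the retrieved document chunks. The `Segment` class, the function names, and the prompt layout are all assumptions made for illustration; the semantic search itself is left out and represented by a plain list of strings.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    text: str


def transcript_window(segments, t, radius=300.0):
    """Return the segments that overlap [t - radius, t + radius]."""
    lo, hi = max(0.0, t - radius), t + radius
    return [s for s in segments if s.end >= lo and s.start <= hi]


def build_context(window, doc_chunks):
    """Join the transcript slice and retrieved chunks into one context string."""
    transcript = " ".join(s.text for s in window)
    docs = "\n\n".join(doc_chunks)
    return f"LECTURE TRANSCRIPT:\n{transcript}\n\nCOURSE DOCUMENTS:\n{docs}"


# Assumption: a one-hour lecture split into one-minute segments.
segments = [Segment(i * 60.0, i * 60.0 + 60.0, f"minute {i}") for i in range(60)]
window = transcript_window(segments, t=44 * 60 + 30)  # student paused at 44:30
context = build_context(window, ["(retrieved chunk from week3.pdf)"])
print(window[0].text, "to", window[-1].text)
```

Selecting by overlap rather than by start time keeps a segment that began just before the window opened, so the slice does not start mid-explanation.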

Escalation layer

  • If confidence $c < 0.70$, a second LLM call drafts a professional TA email including the student's question, timestamp in MM:SS format, and the reason the system couldn't answer confidently
  • The email is sent via Gmail API
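
The gate itself is simple enough to sketch. The snippet below parses the model's JSON reply, checks it against the 0.70 threshold, and formats the timestamp as MM:SS for the email. The helper names and the payload shape are hypothetical, and the second LLM call and the Gmail API send are deliberately omitted.

```python
import json

THRESHOLD = 0.70  # escalation threshold from the write-up


def format_mmss(seconds):
    """Format a video position in seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def parse_response(raw):
    """Parse the model's JSON reply and check the confidence range."""
    data = json.loads(raw)
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    data["confidence"] = confidence
    return data


def needs_escalation(response, threshold=THRESHOLD):
    """True when the answer is not confident enough to stand alone."""
    return response["confidence"] < threshold


def escalation_payload(question, timestamp_s, response):
    """Collect the fields the TA email draft needs (hypothetical shape)."""
    return {
        "question": question,
        "timestamp": format_mmss(timestamp_s),
        "reason": response.get("confidence_reason", ""),
    }


raw = '{"answer": "...", "confidence": 0.42, "confidence_reason": "Not covered in course materials", "sources": []}'
response = parse_response(raw)
if needs_escalation(response):
    payload = escalation_payload("What is a Lagrangian?", 2670, response)
    print(payload["timestamp"])
```

Validating the range before comparing guards against a reply such as `"confidence": 87`, which would otherwise pass the gate silently.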

Backend: Flask REST API (POST /ask, POST /ingest, POST /upload)
Frontend: HTML5 video player with mic overlay, answer panel, confidence bar

Challenges we ran into

  • No OpenAI credits — we switched from OpenAI embeddings to a local sentence-transformers model mid-build. This actually made the system faster and free to run.
  • Timestamp alignment — mapping PPT slides to video timestamps required careful indexing so the context window always captured the right content.
  • Confidence calibration — getting the LLM to produce honest, conservative confidence scores (rather than inflated ones) required careful prompt engineering. We found that explicitly telling the model to default low when uncertain was the key instruction.
  • Context window management — a ±5 minute transcript window could be very large for a fast-talking lecturer. We had to balance context richness against token limits.
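
One way to handle the context-window trade-off is to keep the transcript segments closest to the paused timestamp until a token budget is used up. The sketch below is an assumed approach rather than the project's implementation: `trim_to_budget` is a hypothetical name, and whitespace word counts stand in for real token counts.

```python
def trim_to_budget(segments, t, max_tokens):
    """Keep the segments nearest to time t that fit within max_tokens.

    segments: list of (start_s, end_s, text) tuples.
    """
    def distance(seg):
        start, end, _ = seg
        if start <= t <= end:
            return 0.0  # the segment containing t always ranks first
        return min(abs(start - t), abs(end - t))

    kept, used = [], 0
    for seg in sorted(segments, key=distance):
        cost = len(seg[2].split())  # assumption: word count approximates tokens
        if used + cost > max_tokens:
            break  # stop at the first segment that does not fit
        kept.append(seg)
        used += cost
    # Restore chronological order so the explanation reads naturally.
    return sorted(kept, key=lambda s: s[0])


# Assumption: ten one-minute segments of ten words each.
segments = [(i * 60.0, i * 60.0 + 60.0, " ".join(["word"] * 10)) for i in range(10)]
trimmed = trim_to_budget(segments, t=270.0, max_tokens=30)
print([s[0] for s in trimmed])
```

Stopping at the first segment that does not fit, instead of skipping it and trying smaller ones further away, keeps the retained slice contiguous around the moment of confusion.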

Accomplishments that we're proud of

  • Built a fully working end-to-end pipeline in 12 hours
  • Zero external embedding costs — the entire embedding layer runs locally
  • The confidence gate actually works: out-of-syllabus questions consistently receive low confidence scores and trigger escalation
  • The TA email is genuinely useful — structured, timestamped, and professionally written by the model every time
  • Real course PDFs from an actual university course were used as the knowledge base — this wasn't a toy demo

What we learned

  • RAG is not always the right tool. For video content, timestamp-anchored extraction outperforms pure vector search because it preserves the natural flow of explanation around the moment of confusion.
  • Confidence scoring in LLMs requires explicit prompt design. Without instruction, models tend toward overconfidence. The formula we converged on:

$$c = \frac{\text{evidence strength} \times \text{source coverage}} {\text{question complexity}}$$

  • Local embedding models are production-ready. all-MiniLM-L6-v2 matched OpenAI embedding quality for this use case at zero cost.
  • Claude Code dramatically accelerated development — entire modules were scaffolded, debugged, and integrated through natural language prompts.
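
To make the confidence formula above concrete, here is a small worked example. The input scales are assumptions, since the write-up does not define them: evidence strength and source coverage are taken to lie in $[0, 1]$ and question complexity to be positive. The clamp to $[0, 1]$ is also an addition, included only so the result stays in the stated range for $c$.

```python
def confidence(evidence_strength, source_coverage, question_complexity):
    """Evaluate c = (evidence strength * source coverage) / question complexity."""
    if question_complexity <= 0:
        raise ValueError("question_complexity must be positive")
    raw = evidence_strength * source_coverage / question_complexity
    # Assumption: clamp so the score stays inside [0, 1].
    return max(0.0, min(1.0, raw))


# Strong evidence and good coverage, but a fairly complex question.
c = confidence(evidence_strength=0.9, source_coverage=0.8, question_complexity=1.2)
print(round(c, 2), "escalate" if c < 0.70 else "answer")
```

With these illustrative inputs the score comes out near 0.6, which falls below the 0.70 threshold and would trigger the TA email.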

What's next for Glean

  • Live lecture mode — real-time question answering during a live class using a running audio stream
  • Spaced repetition integration — automatically generate flashcards from the doubts a student asked, turning confusion into long-term memory
  • Multi-student aggregation — if 10 students ask similar questions at the same timestamp, surface that to the instructor as a weak point in the lecture
  • LMS integration — connect directly to Canvas, Moodle, or Blackboard so course materials are ingested automatically without any uploads
  • Confidence analytics dashboard — show instructors which topics consistently produce low-confidence answers across the class
