Lecture2quiz

Project Story

Inspiration

As students in Southeast Asia, we attend lectures delivered in mixed languages — English sprinkled with Vietnamese, Thai, Singlish, or Bahasa. Taking notes manually is exhausting and incomplete. Existing transcription tools (Whisper, Google STT) struggle badly with SEA accents and code-switching.

We asked: What if you could just record a lecture and instantly get a complete bilingual study pack — summary, quiz, and flashcards — in both English and your native language?

When we discovered Valsea — a speech AI platform purpose-built for Southeast Asian languages — the idea clicked.

What We Built

Lecture2Quiz SEA is a full-stack pipeline that transforms classroom audio into a ready-to-use study pack:

Transcribe — Valsea STT with accent-aware models (supports 70+ languages including Singlish, Vietnamese, Thai, Filipino)
Clarify — Valsea cleans noisy/colloquial speech into grammatically correct text
Summarize — Valsea formats the transcript into key quotes, overview, and takeaways
Generate Quiz — AWS Bedrock (Claude) creates 10 multiple-choice questions (MCQs) from the content
Generate Flashcards — Leveled cards (easy/medium/hard) for spaced repetition
Translate — Everything output in both English and the student's chosen language
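Since the quiz step is expected to return exactly 10 multiple-choice items, model output can be checked before it reaches the student. The following is a minimal sketch of such a check, not the project's actual validator. The field names (`question`, `options`, `answer_index`) are a hypothetical schema chosen for illustration.

```python
import json

# Hypothetical schema: these key names are assumptions, not the project's real format.
REQUIRED_KEYS = {"question", "options", "answer_index"}


def validate_quiz(raw: str, expected_count: int = 10) -> list:
    """Parse model output and check it against the assumed quiz schema.

    Raises ValueError on malformed JSON (json.JSONDecodeError is a subclass
    of ValueError) or on any schema violation.
    """
    items = json.loads(raw)
    if not isinstance(items, list) or len(items) != expected_count:
        raise ValueError(f"expected a list of {expected_count} quiz items")
    for i, item in enumerate(items):
        if not isinstance(item, dict) or not REQUIRED_KEYS <= item.keys():
            raise ValueError(f"item {i} is missing required keys")
        options = item["options"]
        if not isinstance(options, list) or len(options) < 2:
            raise ValueError(f"item {i} needs at least two options")
        idx = item["answer_index"]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(idx, bool) or not isinstance(idx, int):
            raise ValueError(f"item {i} has a non-integer answer_index")
        if not 0 <= idx < len(options):
            raise ValueError(f"item {i} has an out-of-range answer_index")
    return items
```

A caller could treat a `ValueError` here as a signal to re-prompt the model rather than show a broken quiz.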

How We Built It

Backend: Python FastAPI orchestrating the Valsea API pipeline (transcribe → clarify → format → translate) and AWS Bedrock for quiz/flashcard generation. Steps that do not depend on each other run concurrently via asyncio.gather.
Frontend: React + Tailwind CSS with real-time progress via Server-Sent Events (SSE). Includes an interactive quiz room and a flip-card spaced-repetition deck.
Smart audio handling: Files over 8 MB are automatically split into ~4.5-min chunks via ffmpeg, transcribed in parallel, then recombined — no manual preprocessing needed.
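The chunk-and-recombine flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code. `plan_chunks`, `transcribe_all`, and `fake_transcribe` are hypothetical names, "8 MB" is assumed to mean 8 × 1024 × 1024 bytes, and the actual ffmpeg cut and the Valsea request are replaced by a stub. The limit of 3 concurrent requests matches the default mentioned under Challenges.

```python
import asyncio
import math

CHUNK_SECONDS = 270.0                     # ~4.5 minutes, per the write-up
SPLIT_THRESHOLD_BYTES = 8 * 1024 * 1024   # assumed meaning of "8 MB"


def plan_chunks(duration_s: float, size_bytes: int) -> list:
    """Return (start, end) offsets in seconds; a single span for small files."""
    if size_bytes <= SPLIT_THRESHOLD_BYTES:
        return [(0.0, duration_s)]
    count = math.ceil(duration_s / CHUNK_SECONDS)
    return [
        (i * CHUNK_SECONDS, min((i + 1) * CHUNK_SECONDS, duration_s))
        for i in range(count)
    ]


async def transcribe_all(chunks, transcribe, max_concurrent: int = 3) -> str:
    """Transcribe chunks with bounded concurrency and join them in input order."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(chunk):
        async with semaphore:  # at most max_concurrent requests in flight
            return await transcribe(chunk)

    # asyncio.gather returns results in the order the awaitables were passed,
    # so the transcript stays in lecture order even if chunks finish out of order.
    parts = await asyncio.gather(*(worker(c) for c in chunks))
    return " ".join(parts)


async def fake_transcribe(chunk) -> str:
    """Stand-in for the real speech-to-text request (hypothetical)."""
    start, end = chunk
    await asyncio.sleep(0.01)
    return f"[{start:.0f}-{end:.0f}s]"


if __name__ == "__main__":
    spans = plan_chunks(duration_s=600.0, size_bytes=9 * 1024 * 1024)
    print(asyncio.run(transcribe_all(spans, fake_transcribe)))
    # prints: [0-270s] [270-540s] [540-600s]
```

Using a semaphore inside `gather` keeps the code simple while still capping how many uploads run at once.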

Challenges We Faced

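Large file uploads — Valsea has a 10 MB limit per request. We solved this by building an auto-splitter that chunks oversized audio (anything over 8 MB, which leaves headroom under the limit), transcribes the chunks in parallel ($n = 3$ concurrent requests by default), then recombines the transcripts in their original order.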
Network reliability — Uploading large audio over unstable connections (VPN, campus Wi-Fi) caused ReadError mid-upload. We implemented retries with exponential backoff and better error messaging.
Bedrock throttling — AWS rate-limits Claude API calls. We added configurable retry logic with BEDROCK_QUIZ_MAX_RETRIES and context truncation to stay under token limits: $$\text{context\_chars} \in [4000, 120000], \quad \text{default} = 24000$$
Quiz quality — Getting Claude to produce exactly 10 well-formed MCQ items with consistent JSON schema required careful prompt engineering and validation.
Adaptive learning loop — Making quiz misses automatically become flashcards required a client-side state machine tracking localStorage sessions across quiz and flashcard pages.
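The retry and truncation ideas above can be illustrated with a short sketch. It is an assumption-laden example rather than the project's implementation. Reading BEDROCK_QUIZ_MAX_RETRIES from the environment, the fallback of 3 retries, the 0.5-second base delay, the jitter, and the helper names (`clamp_context_chars`, `truncate_context`, `with_retries`) are all hypothetical. Only the bounds 4000, 120000, and 24000 come from the text.

```python
import asyncio
import os
import random
from typing import Awaitable, Callable, Optional

CONTEXT_MIN, CONTEXT_MAX, CONTEXT_DEFAULT = 4_000, 120_000, 24_000


def clamp_context_chars(requested: Optional[int] = None) -> int:
    """Clamp a requested context size into [4000, 120000]; default is 24000."""
    if requested is None:
        return CONTEXT_DEFAULT
    return max(CONTEXT_MIN, min(CONTEXT_MAX, requested))


def truncate_context(text: str, requested: Optional[int] = None) -> str:
    """Cut the transcript to the clamped character budget."""
    return text[: clamp_context_chars(requested)]


async def with_retries(
    call: Callable[[], Awaitable],
    max_retries: Optional[int] = None,
    base_delay: float = 0.5,
    sleep: Callable[[float], Awaitable] = asyncio.sleep,
):
    """Await call(), retrying failures with exponential backoff plus jitter."""
    if max_retries is None:
        # Assumption: the setting is an environment variable with a fallback of 3.
        max_retries = int(os.environ.get("BEDROCK_QUIZ_MAX_RETRIES", "3"))
    max_retries = max(0, max_retries)
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except Exception:
            # A real client would catch only throttling or network errors.
            if attempt == max_retries:
                raise
            delay = base_delay * (2 ** attempt)  # 0.5s, 1s, 2s, ...
            await sleep(delay + random.uniform(0, delay / 2))
```

The same wrapper could cover both the upload errors and the throttled model calls, since it only needs an awaitable to retry.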

What We Learned

Valsea's clarify endpoint is a game-changer — it turns messy spoken language into clean text that LLMs can actually reason about
Running transcription, formatting, and generation in parallel cuts total latency by ~60% compared with running them sequentially
SSE provides a much better UX than polling for long-running pipelines
Building for SEA languages requires purpose-built tools — generic models consistently fail on accents and code-switching
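To make the SSE point concrete, here is a small sketch of how progress updates might be framed on the wire. The event names (`progress`, `complete`), the payload fields, and the helper names are assumptions for illustration. Only the frame layout (`event:` and `data:` lines followed by a blank line) comes from the Server-Sent Events format itself.

```python
import json
from typing import Iterable, Iterator


def sse_event(event: str, data: dict) -> str:
    """Serialize one Server-Sent Events frame.

    json.dumps without indent emits a single line, so the payload fits in
    one data: field; the trailing blank line terminates the frame.
    """
    payload = json.dumps(data, ensure_ascii=False)
    return f"event: {event}\ndata: {payload}\n\n"


def progress_events(stages: Iterable[str]) -> Iterator[str]:
    """Yield one progress frame per finished stage, then a completion frame."""
    stage_list = list(stages)
    total = len(stage_list)
    for done, stage in enumerate(stage_list, start=1):
        yield sse_event("progress", {"stage": stage, "done": done, "total": total})
    yield sse_event("complete", {"total": total})


if __name__ == "__main__":
    for frame in progress_events(["transcribe", "clarify", "format", "translate"]):
        print(frame, end="")
```

In a FastAPI backend, a generator like this could be returned through a `StreamingResponse` with media type `text/event-stream`, so the browser receives each update as soon as a stage finishes instead of polling for status.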

What's Next

Real-time live transcription via Valsea WebSocket (valsea-rtt)
Persistent lecture library with analytics (weak topics, study streaks)
Export to Anki deck format
Mobile PWA for on-the-go review

Built With

aws-bedrock
docker
fastapi
python
react-js
tailwind
valsea

Updates

Doanh Lê started this project — May 04, 2026 11:34 PM EDT
