Inspiration

One of our teammates has ADHD, and lectures have always been a challenge to sit through, especially the two-hour COMP 1870 lectures: you lose track for 10 minutes, and when you tune back in you're completely lost. The other motivation was shy, introverted students who either think their question is too stupid or are too afraid to ask it. We wanted a solution that could help these people, and many more, stay engaged during lectures, essentially making sure everyone has that one really smart friend who knows the material inside and out and whom you can ask questions without feeling bad about it.

What We Learned

  • RAG quality depends heavily on retrieval and prompt structure, not just the model.
  • Real‑time audio is a different problem from batch transcription: it needs streaming APIs and careful audio formatting.
  • Productizing AI means building UX guardrails (confidence labels, citations, and fallbacks).
  • Convex + Cloud Run is a fast stack, but you have to handle auth, env configuration, and deployment details with care.
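
To make the guardrail idea concrete, here is a minimal sketch of how a confidence label, citations, and a fallback message could be attached to an answer. It is not our production code: the `RetrievedChunk` type, the function names, and the 0.75 / 0.5 thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    source: str   # hypothetical id, e.g. a slide deck or transcript segment
    text: str
    score: float  # similarity score, assumed to lie in [0, 1]


def confidence_label(chunks, high=0.75, low=0.5):
    """Map the best retrieval score to a coarse label for the UI.

    The thresholds are illustrative assumptions, not tuned values.
    """
    if not chunks:
        return "none"
    best = max(c.score for c in chunks)
    if best >= high:
        return "high"
    if best >= low:
        return "medium"
    return "low"


def guarded_answer(model_answer, chunks):
    """Wrap a model answer with a confidence label and citations.

    When nothing was retrieved, return a fallback message instead of
    passing along an answer that has no supporting material.
    """
    label = confidence_label(chunks)
    if label == "none":
        return {
            "answer": "I could not find this in the lecture material. "
                      "Try rephrasing, or ask the lecturer.",
            "confidence": label,
            "citations": [],
        }
    citations = sorted({c.source for c in chunks})
    return {"answer": model_answer, "confidence": label, "citations": citations}
```

The point of the wrapper is that the student always sees where an answer came from and how much to trust it, even when retrieval comes back empty.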

How We Built It

  • Frontend: Next.js app with two tailored dashboards (student + lecturer).
  • Backend: Python FastAPI service on Cloud Run handling RAG, ingestion, and Live transcription.
  • RAG: Text chunks embedded and stored in Pinecone. Gemini answers with citations and confidence.
  • Live Transcript: Lecturer streams mic audio through Gemini Live; transcript segments are stored and used as additional context for Q&A.
  • State + Auth: Convex for server actions and data, Clerk for auth.
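
The RAG flow above (chunk, embed, retrieve, answer with citations) can be sketched in plain Python. This is a toy under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for Pinecone, and every function name and parameter here is hypothetical. The resulting prompt would then be sent to the model.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=50, overlap=10):
    """Split text into overlapping word windows."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, top_k=3):
    """Return the top_k (score, chunk_index, chunk_text) tuples."""
    q = embed(question)
    scored = [(cosine(q, embed(c)), i, c) for i, c in enumerate(chunks)]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


def build_prompt(question, hits):
    """Number each retrieved chunk so the model can cite it as [n]."""
    context = "\n".join(f"[{i}] {text}" for _, i, text in hits)
    return (
        "Answer using only the context below and cite chunks as [n].\n"
        "If the context is not enough, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Numbering the chunks in the prompt is what makes citations possible: the model can only point at sources it was explicitly given an id for.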

Challenges We Faced

  • Model access and API behavior: Different Gemini models support different endpoints. We had to align the Live API model with bidiGenerateContent support.
  • Audio pipeline issues: Decoding MediaRecorder chunks was unreliable, so we moved to raw PCM streaming for stability.
  • Deployment gaps: Vercel builds required Convex codegen and correct deployment keys. Cloud Run needed updated envs and redeploys.
  • RAG quality: When retrieval failed, answers were vague. We added fallbacks, summary mode, and transcript‑based context to keep answers useful.
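
As an illustration of the raw PCM approach, the sketch below converts floating-point samples to 16-bit little-endian PCM and splits the result into fixed-size frames for streaming. It uses only the standard library and is not our exact pipeline. The 16 kHz mono sample rate, the 100 ms frame size, and the function names are assumptions; check the streaming API's documented input format before relying on them.

```python
import struct


def float_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to 16-bit little-endian PCM bytes."""
    # Clip first so out-of-range samples cannot overflow the int16 range.
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def pcm_frames(pcm, sample_rate=16000, frame_ms=100):
    """Yield fixed-duration frames from mono 16-bit PCM (2 bytes per sample)."""
    bytes_per_frame = sample_rate * frame_ms // 1000 * 2
    for start in range(0, len(pcm), bytes_per_frame):
        yield pcm[start:start + bytes_per_frame]
```

Raw PCM has no container or codec state, so each frame can be sent on its own. That is the property that made it more stable for us than decoding MediaRecorder chunks.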

What we learned

As boring as it might be, get data structures and database schemas agreed upon and dealt with at the start so you don't have to spend 2 hours trying to merge everyone's code at 2 in the morning. It seems painfully obvious now; at the time, less so.

For almost any solution you want to use, especially dev-related ones, most SaaS products have a generous free tier that you can play around with to get experience with real-world tools, whether it be Pinecone, Vercel, or the whopping $300 from Google Cloud.

What's next for EngageOS

This is a project we would love to see the university take on in some capacity, either partnering with us or on their own. We built this as a proof of concept to show how a solution like this can be useful for students and lecturers alike.
