Inspiration
This project started with something very familiar to all of us: the feeling of being overwhelmed while studying. We realized how often students jump between PDFs, YouTube videos, lecture recordings, handwritten notes, and dozens of different apps. Learning wasn’t actually the problem—managing the learning process was.
At the same time, new multimodal AI models were becoming powerful enough to understand text, images, audio, and structure. That sparked the idea: What if we could combine all these capabilities to create one platform where everything a student needs is in one place?
EduVerse grew directly out of this question.
How We Built It
EduVerse brings two major systems together under one platform:
AI Tutor System
This system turns static materials into something interactive. We used Gemini Vision to read PDFs, understand their structure, and extract explanations, diagrams, and core ideas. From there, the AI tutor can generate summaries, quizzes, flashcards, and even walk through concepts step-by-step. One of our goals was to make explanations feel grounded, so the tutor highlights the exact line it’s referring to, helping the student connect the explanation to the material.
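A minimal sketch of how an explanation might be tied back to the exact line it refers to, assuming the model is asked to return the passage it is explaining alongside the explanation. The names `GroundedExplanation`, `Highlight`, and `resolveHighlight` are hypothetical and are not taken from the EduVerse codebase.

```typescript
// Hypothetical shape: one explanation plus the passage the model says it explains.
interface GroundedExplanation {
  explanation: string;
  quote: string;
}

// Hypothetical shape: what the UI would need in order to highlight a line.
interface Highlight {
  lineIndex: number; // 0-based index into the extracted lines
  explanation: string;
}

// Collapse whitespace and case so small extraction differences do not break matching.
function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// Returns the first line containing the quoted passage, or null when the quote
// cannot be found, in which case nothing should be highlighted.
function resolveHighlight(lines: string[], item: GroundedExplanation): Highlight | null {
  const needle = normalize(item.quote);
  if (needle.length === 0) return null;
  for (let i = 0; i < lines.length; i++) {
    if (normalize(lines[i]).indexOf(needle) !== -1) {
      return { lineIndex: i, explanation: item.explanation };
    }
  }
  return null;
}
```

Returning `null` when the quote is not found keeps the highlight honest: an explanation that cannot be located in the source is shown without a highlight rather than attached to the wrong line.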
Classroom Whisperer
Lectures are often where students lose the most information, so we built a tool to capture them as they happen. Using real-time transcription from Gemini’s audio models, it listens to a lecture, transcribes it, and then organizes the transcript into summaries, topic breakdowns, and searchable notes. The idea was to make sure no important moment from class disappears the moment it’s spoken.
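To illustrate the "searchable notes" idea, here is a small sketch that assumes the transcript is stored as timestamped segments. The `TranscriptSegment` shape and both function names are assumptions made for illustration, not the project's actual data model.

```typescript
// Hypothetical shape: one chunk of transcribed speech with its start time.
interface TranscriptSegment {
  startSeconds: number;
  text: string;
}

// Case-insensitive search: a segment matches only if it contains every query term.
function searchTranscript(segments: TranscriptSegment[], query: string): TranscriptSegment[] {
  const terms = query.toLowerCase().split(/\s+/).filter((term) => term.length > 0);
  if (terms.length === 0) return [];
  return segments.filter((segment) => {
    const haystack = segment.text.toLowerCase();
    return terms.every((term) => haystack.indexOf(term) !== -1);
  });
}

// Formats a start time as m:ss so a search result can link back to the moment in class.
function formatTimestamp(totalSeconds: number): string {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = Math.floor(totalSeconds % 60);
  return minutes + ":" + (seconds < 10 ? "0" : "") + seconds;
}
```

Keeping the start time on every segment is what lets a text match jump back to the point in the lecture where it was said.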
Technology
We built the frontend using React 19, TypeScript, and Tailwind, and used Supabase for authentication, storage, and our relational database. All AI processing—whether audio, vision, or text—runs through the Gemini API. Zustand manages state across the application, and Vite helped us keep development fast.
What We Learned
Throughout the build, we learned a great deal about combining different AI modalities into one consistent experience. We explored how vision models interpret dense academic PDFs, how audio transcription behaves in real time, and how to chain model outputs so they feel natural to the user.
We also learned how important structured data is—especially when working with generated content. Designing a database that could handle projects, files, lectures, flashcards, summaries, quizzes, and voice conversations taught us a lot about scalable architecture. Finally, making the experience smooth for the user pushed us to think more deeply about UI design and state management.
Challenges We Faced
Real-Time Processing
Achieving smooth, real-time lecture transcription while coordinating the UI and backend was more difficult than expected. Latency and consistency were constant challenges.
PDF Understanding
Academic PDFs are rarely clean. Some have formulas, others have scanned pages, unusual formatting, or missing structure. Getting vision models to consistently extract useful information required many iterations.
Bringing Vision, Audio, and Text Together
Each modality has a different rhythm. Vision models produce slow but detailed output, audio requires fast streaming, and text generation fills in the reasoning. Making them work together without feeling disjointed took several redesigns.
Maintaining Consistency Across Generated Materials
Flashcards, summaries, quizzes, and slides all rely on slightly different prompts and data formats. Keeping them consistent—and reusable—was harder than it seems.
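One common way to keep generated materials consistent is to give every type a shared envelope and validate model output before it is stored. The sketch below assumes the model is prompted to return JSON with a `kind` field. The `GeneratedMaterial` union and `parseMaterial` are hypothetical names, and slides are left out for brevity.

```typescript
// Hypothetical shared envelope: every generated item declares its kind.
type GeneratedMaterial =
  | { kind: "flashcard"; front: string; back: string }
  | { kind: "summary"; text: string }
  | { kind: "quiz"; question: string; options: string[]; answerIndex: number };

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Parses raw model output and returns a typed item, or null if it is malformed.
function parseMaterial(raw: string): GeneratedMaterial | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (_err) {
    return null; // not valid JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const { kind, front, back, text, question, options, answerIndex } =
    data as Record<string, unknown>;

  if (kind === "flashcard") {
    return isNonEmptyString(front) && isNonEmptyString(back)
      ? { kind: "flashcard", front, back }
      : null;
  }
  if (kind === "summary") {
    return isNonEmptyString(text) ? { kind: "summary", text } : null;
  }
  if (kind === "quiz") {
    if (
      isNonEmptyString(question) &&
      Array.isArray(options) &&
      options.length > 0 &&
      options.every(isNonEmptyString) &&
      typeof answerIndex === "number" &&
      Math.floor(answerIndex) === answerIndex &&
      answerIndex >= 0 &&
      answerIndex < options.length
    ) {
      return { kind: "quiz", question, options, answerIndex };
    }
    return null;
  }
  return null; // unknown kind
}
```

With a single validated union, the code that stores or renders materials can switch on `kind` instead of handling each prompt's output format separately.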
Data Modeling
Connecting everything in the platform—projects, lectures, learning materials, assessments, analytics—meant designing a relational structure that would remain clean as features expanded.
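As a rough illustration of such a relational structure, the sketch below models a few tables as TypeScript row types linked by foreign keys. All table names, columns, and the `buildOverview` helper are assumptions made for this example and do not describe the actual EduVerse schema.

```typescript
// Hypothetical row types: each child row points at its parent through an id column.
interface Project {
  id: string;
  ownerId: string;
  title: string;
}

interface Lecture {
  id: string;
  projectId: string; // foreign key to Project.id
  title: string;
}

interface LearningMaterial {
  id: string;
  projectId: string; // foreign key to Project.id
  lectureId: string | null; // optional link to the lecture it was generated from
  kind: "flashcard" | "summary" | "quiz";
}

interface ProjectOverview {
  project: Project;
  lectures: Lecture[];
  materialCounts: Record<string, number>;
}

// Joins the rows in memory: collects a project's lectures and counts its materials by kind.
function buildOverview(
  project: Project,
  lectures: Lecture[],
  materials: LearningMaterial[]
): ProjectOverview {
  const materialCounts: Record<string, number> = {};
  for (const material of materials) {
    if (material.projectId !== project.id) continue;
    materialCounts[material.kind] = (materialCounts[material.kind] || 0) + 1;
  }
  return {
    project,
    lectures: lectures.filter((lecture) => lecture.projectId === project.id),
    materialCounts,
  };
}
```

Hanging every row off a `projectId` means a new feature can add its own table with the same key without changing the existing ones.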
Final Thoughts
EduVerse came from a simple frustration: learning shouldn’t feel so fragmented. Building this platform taught us how powerful multimodal AI can be when it’s used to bring structure and clarity to something as personal and complex as learning.
Our goal wasn’t just to build another study tool—it was to create an environment where students can focus on understanding, not organizing. In that sense, EduVerse represents what we believe the future of learning can look like: unified, adaptive, and genuinely helpful.