Inspiration
Education is broken for millions of students who can't afford private tutors. We wanted to build something that feels like having a world-class tutor available 24/7 — one that listens, speaks, explains with visuals, and actually responds to you like a human would.
What it does
Eduverse is a real-time AI tutor with a lip-synced 3D avatar powered by Gemini Live API. Students can:
- Talk naturally using their microphone, with no typing required
- Interrupt the tutor mid-sentence, just like a real conversation
- See structured content appear in real time: code blocks, math formulas, diagrams, and notes rendered beautifully on screen
- Watch the avatar's lips sync to every word spoken

The tutor explains any subject (math, coding, science, history), adapting to the student's pace.
How we built it
- Backend: Python with the google-genai SDK, connected to the Gemini Live API via bidirectional streaming. A WebSocket server bridges the browser and Gemini in real time.
- Frontend: Vanilla JavaScript using the Web Audio API, with a custom AudioWorklet that captures the mic and resamples it to 16 kHz PCM16. Audio playback uses scheduled AudioBufferSourceNodes for gapless streaming.
- Avatar: TalkingHead.js with a Ready Player Me 3D model rendered in Three.js. Lip sync is driven by Oculus viseme blend shapes timed to the audio stream.
- Tool calling: Gemini's function calling triggers show_content, which pushes formatted markdown, code, and Mermaid diagrams to the student's screen panel in real time.
- Interruption: Full bidirectional interruption. When the student speaks mid-response, the backend detects Gemini's interrupted signal and drains the audio queue, and the frontend immediately cancels all scheduled audio nodes.
- Infrastructure: Configured for Google Cloud Run deployment, with Secret Manager for API key management.
Challenges we ran into
- Choosing the model best suited to our use case, exploring its capabilities, and getting the most out of it.
- Connecting the back end and the front end, which I was working with for the first time while also learning the Gemini API.
- Keeping the avatar's lip sync in real time with the agent's response.
Accomplishments that we're proud of
Having a working demo of a finished product with many features, which supports my vision for the project and shows how it can help students.
What we learned
- The strengths of the Gemini Live API.
- How to make a model call functions to show content that the model does not output directly.
- The specific configuration needed to deploy a WebSocket server to the cloud.
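To illustrate the function-calling pattern, here is a hedged sketch of how a show_content tool might be declared and handled. The declaration follows the general JSON-schema style used for function declarations. The parameter names (`content_type`, `body`), the allowed values, and the message sent to the browser are assumptions for this example, not the project's actual schema.

```python
# Hypothetical declaration for the show_content tool. The parameter names
# and enum values are illustrative assumptions.
SHOW_CONTENT_DECLARATION = {
    "name": "show_content",
    "description": "Display formatted content on the student's screen panel.",
    "parameters": {
        "type": "object",
        "properties": {
            "content_type": {
                "type": "string",
                "enum": ["markdown", "code", "mermaid"],
            },
            "body": {"type": "string"},
        },
        "required": ["content_type", "body"],
    },
}


def handle_tool_call(name: str, args: dict) -> dict:
    """Turn a tool call from the model into a message for the browser.

    Returns an error message instead of raising, so one bad call does
    not break the live session.
    """
    if name != SHOW_CONTENT_DECLARATION["name"]:
        return {"type": "error", "message": f"unknown tool: {name}"}

    properties = SHOW_CONTENT_DECLARATION["parameters"]["properties"]
    allowed = properties["content_type"]["enum"]
    content_type = args.get("content_type")
    if content_type not in allowed:
        return {
            "type": "error",
            "message": f"unsupported content_type: {content_type}",
        }

    # Assumed payload shape that the frontend panel would render.
    return {
        "type": "show_content",
        "content_type": content_type,
        "body": args.get("body", ""),
    }
```

The model never writes to the screen itself. It only emits a structured call, and the backend decides what reaches the browser, which keeps rendering under the application's control.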
What's next for Eduverse
- Adding screen sharing and image uploads.
- Storing chats.
- Generating media to help illustrate explanations.
- Using the camera with physical objects, for example to troubleshoot or repair devices.
Built With
- css3
- gemini
- html5
- javascript
- katex
- mermaid.js
- python
- readyplayerme
- sdk
- talkinghead.js
- three.js
- webaudio
- websockets