Inspiration
Learning today is still built around screens — long videos, dense textbooks, and endless scrolling.
But most real learning happens between moments: while walking, commuting, cooking, or resting.
We noticed something interesting: people willingly listen to podcasts for hours, but struggle to stay engaged with traditional learning platforms.
That observation raised the question behind Vibe Learn:
What if learning felt as effortless as pressing play on Spotify?
Instead of forcing learners to sit down and focus, we wanted to build a system that lets learning happen naturally — through conversation, audio, and curiosity.
What it does
Vibe Learn is an AI-powered, audio-first learning platform that turns any topic into podcast-style conversations.
Users can:
- Generate AI podcast episodes on any topic instantly
- Listen through radio-like learning stations
- Follow structured learning paths with modules and progress tracking
- Talk to real-time AI tutors through text or voice
- Use live camera-based vision AI to analyze and learn from what they see
- Convert conversations directly into playable learning episodes
In short:
Learn anything by pressing play.
How we built it
We designed Vibe Learn as a full-stack, real-time AI system.
Core technologies:
- Next.js + TypeScript for the frontend experience
- FastAPI (Python) for backend orchestration
- Google Gemini 2.5 Flash for structured content generation
- Gemini Multi-Speaker TTS (Director Mode) for natural podcast-style audio
- LiveKit for real-time chat and voice interactions
- OpenAI GPT-4o for conversational tutoring agents
- Overshoot Vision SDK for live video understanding
How generation works:
- User selects a topic, station, or learning path
- Backend generates a structured dialogue using Gemini
- A Director-style prompt controls tone, pacing, and emotion
- Multi-speaker audio is generated in real time
- Episodes are streamed directly to the player
We carefully engineered prompt templates, modular pipelines, and real-time agents so the system could support casual listening and structured education at the same time.
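The steps above can be sketched as a small pipeline. Everything here is illustrative: `call_gemini` is a stub standing in for the real Gemini 2.5 Flash call, and the dialogue schema and audio references are assumed, not the project's actual formats.

```python
import json

DIRECTOR_PROMPT = (
    "You are a podcast director. Return a JSON object with a 'title' "
    "and a 'turns' list; each turn has 'speaker', 'text', and 'tone'."
)

def call_gemini(prompt: str, topic: str) -> str:
    # Stub: a real implementation would call the Gemini API with the
    # Director-style prompt controlling tone, pacing, and emotion.
    return json.dumps({
        "title": f"Intro to {topic}",
        "turns": [
            {"speaker": "Host", "text": f"Today we explore {topic}.", "tone": "curious"},
            {"speaker": "Expert", "text": "Let's start with the basics.", "tone": "warm"},
        ],
    })

def generate_episode(topic: str) -> dict:
    """Topic -> structured dialogue -> per-turn audio jobs."""
    dialogue = json.loads(call_gemini(DIRECTOR_PROMPT, topic))
    # Each turn would be handed to the multi-speaker TTS engine and
    # streamed to the player; here we attach placeholder references.
    for i, turn in enumerate(dialogue["turns"]):
        turn["audio"] = f"segment_{i}.mp3"
    return dialogue

episode = generate_episode("photosynthesis")
print(episode["title"])  # Intro to photosynthesis
```

Swapping the stub for a real API client leaves the rest of the flow unchanged, which is what makes the pipeline easy to test during a hackathon.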
Challenges we ran into
- Designing prompts that consistently return clean, structured JSON
- Coordinating multiple AI systems (LLM, TTS, agents, vision) in real time
- Handling audio latency while keeping the experience smooth
- Managing state without a database during rapid MVP development
- Ensuring conversations naturally convert into meaningful learning content
Balancing technical depth with user simplicity was one of the hardest — and most rewarding — parts of the build.
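One common way to tackle the "clean, structured JSON" problem is to validate every LLM response against a Pydantic schema (Pydantic is in the stack below) and retry on failure. The field names here are illustrative, not the project's actual models.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError

class Turn(BaseModel):
    speaker: str
    text: str

class Episode(BaseModel):
    title: str
    turns: list[Turn]

def parse_episode(raw: str) -> Optional[Episode]:
    """Return a validated Episode, or None so the caller can retry."""
    try:
        return Episode.model_validate_json(raw)
    except ValidationError:
        return None

good = '{"title": "Gravity 101", "turns": [{"speaker": "Host", "text": "Hi"}]}'
bad = '{"title": "Gravity 101"}'  # missing "turns" -> rejected
assert parse_episode(good) is not None
assert parse_episode(bad) is None
```

Treating a failed parse as a retryable event keeps malformed model output from ever reaching the TTS stage.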
Accomplishments that we're proud of
- Built a fully working audio-first learning experience
- Implemented multi-speaker AI podcast generation
- Integrated real-time AI tutors (text, voice, and vision)
- Created both casual discovery (stations) and structured learning (paths)
- Delivered a true end-to-end system within hackathon time constraints
Most importantly, we built something that feels genuinely different from traditional education tools.
What we learned
- Learning is more engaging when it feels conversational, not instructional
- Audio dramatically lowers the barrier to education
- Real-time AI interaction increases curiosity and retention
- Clear system design matters just as much as model quality
- The best products combine experience + engineering, not just AI demos
This project taught us how to think beyond “using AI” and instead design new learning interfaces around it.
What's next for VibeLearn.ai
Next, we plan to:
- Add user accounts and cross-device sync
- Move audio storage to cloud infrastructure
- Introduce transcripts and highlights
- Improve personalization and recommendation systems
- Launch mobile apps for true on-the-go learning
- Explore partnerships with educators and institutions
Our long-term vision is simple:
Make learning as natural and accessible as listening to music.
Built With
- css
- fastapi
- next.js
- pydantic
- python
- react
- tailwind
- typescript
- uvicorn