Inspiration
As a Computer Science student at the University of Karachi, I realized that modern textbooks are often dense, static, and overwhelming. Students need more than just a list of answers; they need an interactive companion that adapts to their learning pace. I was inspired to build StudyBuddy AI to bridge the gap between complex text and visual understanding, creating a "Smart Tutor" that can explain anything from primary school math to PhD-level research papers.
What it does
StudyBuddy AI is a privacy-first, client-side educational toolkit powered by the Gemini 1.5 Flash model.
- PDF Contextual Learning: Users upload textbooks, and the AI answers strictly from the document’s content.
- Visual Concept Mapping: It automatically converts complex logic into Mermaid.js diagrams such as flowcharts and mind maps.
- Real-time Voice Assistant: A contextual voice tutor explains specific steps of a solution or parts of a diagram out loud.
- Mathematical Precision: Formulas are rendered from LaTeX so that notation stays clear and unambiguous.
- BYOK (Bring Your Own Key): Users provide their own API key, so requests go straight from the browser to the Gemini API and the app does no server-side logging of its own.
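To illustrate what answering "strictly from the document" can look like in practice, here is a minimal sketch of a prompt builder that wraps extracted PDF text in explicit instructions. The names `PageText`, `buildGroundedPrompt`, and the `maxChars` budget are hypothetical and are not taken from the StudyBuddy AI codebase; the actual prompts may differ.

```typescript
// Hypothetical helper: none of these names come from the StudyBuddy AI codebase.
interface PageText {
  page: number; // 1-based page number in the uploaded PDF
  text: string; // plain text extracted from that page
}

// Builds a prompt that asks the model to answer only from the supplied pages.
// maxChars is an assumed budget that keeps the prompt to a manageable size.
function buildGroundedPrompt(
  question: string,
  pages: PageText[],
  maxChars = 12000,
): string {
  const parts: string[] = [];
  let used = 0;
  for (const p of pages) {
    const block = `[Page ${p.page}]\n${p.text.trim()}`;
    if (used + block.length > maxChars) break; // stop once the budget is spent
    parts.push(block);
    used += block.length;
  }
  return [
    "Answer using ONLY the document excerpts below.",
    "If the answer is not in the excerpts, say that the document does not cover it.",
    "",
    "--- DOCUMENT ---",
    parts.join("\n\n"),
    "--- END DOCUMENT ---",
    "",
    `Question: ${question.trim()}`,
  ].join("\n");
}
```

Tagging each excerpt with its page number also gives the model something concrete to cite when it explains where an answer came from.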
How we built it
We used a modern, high-performance tech stack to ensure a seamless experience:
- Frontend: Built with React 18 and Vite for optimized speed and performance.
- UI/UX: Designed a glassmorphic dashboard using Tailwind CSS and shadcn/ui.
- AI Engine: Integrated Google Gemini 1.5 Flash directly via the client-side SDK for low-latency responses.
- Visuals & Math: Implemented Mermaid.js for dynamic diagrams and KaTeX for high-quality LaTeX math rendering.
- Persistence: Used Supabase to manage document libraries and chat history.
Challenges we ran into
One of our biggest hurdles was a stream of persistent 404 "Model Not Found" errors caused by API version mismatches. We solved this by implementing auto-model selection with fallback logic that automatically shifts to the fastest available Gemini model. Keeping the voice assistant contextually aware was also difficult: knowing exactly which line of a formula the user was asking about required careful prompt engineering and state management.
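The fallback idea can be sketched roughly as follows. This is an illustrative outline, not the project's actual code: the model IDs in `PREFERRED_MODELS`, the `Generate` callback type, and the assumption that a missing model surfaces as an error with a 404 status or a "404" in its message are all hypothetical. The SDK call itself is injected as a function so the sketch does not depend on any particular client API.

```typescript
// Hypothetical sketch; model IDs and the error shape are assumptions.
type Generate = (model: string, prompt: string) => Promise<string>;

// Assumed preference order, fastest first.
const PREFERRED_MODELS = ["gemini-1.5-flash", "gemini-1.5-flash-8b", "gemini-1.5-pro"];

// Keeps only the preferred models that are reported as available,
// in preference order. Accepts names with or without a "models/" prefix.
function rankModels(
  available: string[],
  preferred: string[] = PREFERRED_MODELS,
): string[] {
  const known = new Set(available.map((m) => m.replace(/^models\//, "")));
  return preferred.filter((m) => known.has(m));
}

// Treats an error as "model not found" if it has status 404 or mentions 404.
function isModelNotFound(err: unknown): boolean {
  const status = (err as { status?: number } | null)?.status;
  return status === 404 || /\b404\b/.test(String(err));
}

// Tries each candidate in order, moving on only when the model is missing.
async function generateWithFallback(
  generate: Generate,
  candidates: string[],
  prompt: string,
): Promise<{ model: string; text: string }> {
  let lastError: unknown = new Error("No candidate models to try");
  for (const model of candidates) {
    try {
      return { model, text: await generate(model, prompt) };
    } catch (err) {
      if (!isModelNotFound(err)) throw err; // quota, auth, etc. should surface
      lastError = err;
    }
  }
  throw lastError;
}
```

Falling through only on 404 is a deliberate choice in this sketch: retrying a different model will not fix an invalid key or an exhausted quota, so those errors are rethrown immediately.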
Accomplishments that we're proud of
We are incredibly proud of achieving a Pure Client-Side architecture. By allowing users to bring their own API keys, we’ve created an app that respects user privacy and functions independently of a central backend. Successfully integrating real-time voice, LaTeX rendering, and complex visual diagrams into a single, cohesive interface is a milestone we are very happy with.
What we learned
During this hackathon, we learned the true power of multimodal AI. We mastered how to prompt LLMs to output structured data (like Mermaid code) for immediate visual rendering. We also deepened our understanding of managing complex application states without relying on a traditional server-side proxy.
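As a rough illustration of turning a model reply into something renderable, the sketch below pulls a Mermaid block out of free-form text and does a light sanity check before it is handed to a renderer. It assumes the model was prompted to wrap its diagram in a fenced block labelled `mermaid`; the helper name `extractMermaid` and the list of accepted diagram keywords are hypothetical.

```typescript
// Hypothetical helper; the fence-based reply format is an assumption.
const FENCE = "`".repeat(3); // three backticks, built at runtime

// Diagram keywords accepted by this sketch (not an exhaustive Mermaid list).
const MERMAID_KINDS = [
  "flowchart",
  "graph",
  "mindmap",
  "sequenceDiagram",
  "classDiagram",
  "stateDiagram",
];

// Returns the first fenced Mermaid block in the reply, or null when there is
// no such block or it does not start with a recognised diagram keyword.
function extractMermaid(reply: string): string | null {
  const pattern = new RegExp(FENCE + "mermaid\\s*\\n([\\s\\S]*?)" + FENCE);
  const match = pattern.exec(reply);
  if (!match) return null;
  const code = (match[1] ?? "").trim();
  const firstWord = code.split(/\s+/)[0] ?? "";
  return MERMAID_KINDS.some((kind) => firstWord.startsWith(kind)) ? code : null;
}
```

Returning `null` instead of throwing lets the UI fall back to a plain-text answer when the model skips the diagram or produces something unrecognisable.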
What's next for StudyBuddy AI
The next step for StudyBuddy AI is to introduce specialized "Subject Modes" and more interactive 3D visualizations using libraries like Three.js. We also plan to explore decentralized resource sharing via protocols like AetherNode, bringing educational tools into the Web3 ecosystem while maintaining our commitment to user privacy.
Built With
- browserlocalstorage
- framermotion
- googleantigravity
- googlegemini1.5flash
- googlegenerativeaisdk
- katex
- mermaid.js
- netlify
- pdfjs-dist
- react18
- tailwindcss
- typescript
- vite5
- webspeechapi