Inspiration

As a Cameroonian CS student juggling lectures at HITBAMAS and endless PDFs, I often felt like I was drinking from a firehose. One evening I thought: why not build something that summarizes my notes, quizzes me on them, and reads them aloud, so I can get something out of a busy schedule, or even out of the moments when I am bored? When I learned about Google Gemini’s Vision API and ElevenLabs’ TTS API, and with the opportunity this hackathon gave me, I immediately went for SmartStudy: my very own AI study buddy that feels less like grinding and more like chatting with a friend who can read my notes to me, summarize them, and then quiz me.

What it does

SmartStudy takes your bulky reading material and turns it into an engaging study session:

  • Smart Document Processing: Simply drag‑and‑drop your PDFs, images, or text. In seconds, Google Gemini OCR cleans it up and splits it into neat chapters.
  • AI Summaries: Hit “Summarize” and get bullet‑point overviews—no more skimming for hours. Confidence scores tell you how solid your summary is.
  • Interactive Quizzes: Auto‑generated quizzes (multiple‑choice, fill‑in‑the‑blank, true/false) that adapt to how you perform. It’s like having a tutor in your pocket.
  • Natural Voice Narration: Let ElevenLabs read aloud with real‑time highlighting—perfect for when you’re on a moto‑taxi or making plantains.
  • Reading Analytics: Track your speed, progress, and comprehension. Finally see your gains in black and white.
  • Smart Recommendations: Based on what you’ve studied, SmartStudy suggests the next PDF or chapter to tackle—no more “What do I read now?” moments.

How I built it

  1. Stack & Style

    • Next.js 14 App Router for smooth transitions and fast loads
    • TailwindCSS + shadcn/ui in a minimalist dark theme with blue accents—easy on the eyes during late‑night cram sessions
    • Zustand to handle state (quiz scores, TTS position) without messing up React hierarchies
  2. Core Modules

    • Document Processor: PDF.js feeds raw text to Google Gemini Vision for OCR and chapter detection
    • Summaries & Quizzes: Serverless functions in /app/api call Gemini Text, with custom middleware for quiz logic
    • TTS Reader: The ElevenLabs SDK narrates documents aloud
  3. Data, Auth & Deploy

    • Neon Postgres stores user sessions, stats, and embeddings
    • BetterAuth manages secure sign‑in with email or social logins
    • Vercel hosts the app and functions, with Lighthouse CI catching performance drops on each PR

Challenges I ran into

  • Messy OCR: Gemini misread headers, footers, and page numbers. I built a preprocessor to strip margins and clean up the text first.
  • Highlight Drift: TTS playback and scroll offset refused to line up. The fix? Throttle scroll events and anchor highlights by character index.
  • API Rate Limits: Hitting Gemini and ElevenLabs quotas meant users could run out of credits. I added local caching for summaries and audio clips so you don’t burn through API calls.
  • Responsive Headaches: Getting the reader UI to feel right on a 5″ phone and a 27″ monitor took dozens of layout tweaks.
  • Custom Domain: I couldn’t claim my custom domain name on IONOS because of an issue with my country of residence in the list of countries at the step where my information was collected.
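To make the "Messy OCR" fix more concrete, here is a minimal sketch of what a header/footer preprocessor could look like. It is not the actual SmartStudy code: the function names, the `minShare` threshold, and the `edgeDepth` of two lines are all assumptions for illustration. The idea is that a line which shows up at the top or bottom of most pages (ignoring digits, so "Page 3" matches "Page 12") is probably page furniture and can be dropped before the text goes to the model.

```typescript
// Hypothetical helper: names and thresholds are illustrative assumptions.

/** Normalize a line so that "Page 3" and "Page 12" compare as equal. */
function normalizeLine(line: string): string {
  return line.trim().replace(/\d+/g, "#").toLowerCase();
}

/** True when line `i` of `count` lines sits in the top or bottom `depth` lines. */
function isEdgeLine(i: number, count: number, depth: number): boolean {
  return i < depth || i >= count - depth;
}

/**
 * Remove lines that look like running headers, footers, or page numbers.
 * `pages` holds the raw text of each page, one string per page.
 */
function stripPageFurniture(pages: string[], minShare = 0.6, edgeDepth = 2): string[] {
  const splitPages = pages.map((page) =>
    page.split("\n").filter((line) => line.trim() !== "")
  );

  // Pass 1: count on how many pages each normalized edge line appears.
  const pageCounts: Record<string, number> = Object.create(null);
  for (const lines of splitPages) {
    const seenOnThisPage: Record<string, boolean> = Object.create(null);
    lines.forEach((line, i) => {
      const key = normalizeLine(line);
      if (isEdgeLine(i, lines.length, edgeDepth) && !seenOnThisPage[key]) {
        seenOnThisPage[key] = true;
        pageCounts[key] = (pageCounts[key] || 0) + 1;
      }
    });
  }

  // A line must recur on at least two pages to count as a header or footer.
  const threshold = Math.max(2, Math.ceil(pages.length * minShare));

  // Pass 2: drop edge lines that recur across pages or are bare page numbers.
  return splitPages.map((lines) =>
    lines
      .filter((line, i) => {
        if (!isEdgeLine(i, lines.length, edgeDepth)) return true;
        const isBareNumber = /^\d+$/.test(line.trim());
        const recurs = (pageCounts[normalizeLine(line)] || 0) >= threshold;
        return !isBareNumber && !recurs;
      })
      .join("\n")
  );
}
```

Only the first and last few lines of each page are inspected, so a sentence in the body that happens to repeat across pages is left alone. The trade-off is that a real bare number sitting at the very top or bottom of a page would also be removed.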

Accomplishments that I am proud of

  • Zero‑Downtime Deploys: Neon Postgres and Vercel pipelines keep everything live, even when I push hotfixes at midnight.
  • Happy Testers: Friends who tested it early say they cut their study time in half and remember 85% more with SmartStudy.
  • TanStack Query Caching: Caching with TanStack Query improved the UI and UX more than I expected.

What I learned

  • Next.js Patterns: Splitting server and client components is the secret sauce for speed.
  • Zustand Simplicity: Managing global state without prop‑drilling feels like a breath of fresh air.
  • Cache Wisely: With AI APIs, plan for rate limits from day one—caching saved me from surprise bills.
  • ElevenLabs TTS: I learned to use the ElevenLabs TTS API to generate audio.
  • Google Gemini Image Understanding: I learned how to configure Google Gemini to extract text from documents, analyze it, create summaries, and generate real quizzes.
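The "Cache Wisely" lesson can be sketched as a small cache keyed by the operation and the document content, so the same summary or audio clip is never requested twice while it is still fresh. This is an illustrative sketch under my own assumptions, not SmartStudy's real cache: `TtlCache`, `cacheKey`, and `hashText` are hypothetical names, and the djb2 hash is only a stand-in (a production app would more likely use SHA-256).

```typescript
// Illustrative sketch only: all names here are hypothetical.

type CacheEntry<T> = { value: T; expiresAt: number };

/** djb2 string hash, kept to 32 bits. Fine for a demo, too weak for untrusted input. */
function hashText(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(16);
}

/** Build a cache key from the operation ("summary", "tts", ...) and the content. */
function cacheKey(operation: string, text: string): string {
  return operation + ":" + text.length + ":" + hashText(text);
}

/** In-memory cache whose entries expire after `ttlMs` milliseconds. */
class TtlCache<T> {
  private entries: Record<string, CacheEntry<T>> = Object.create(null);
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without real waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries[key];
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      delete this.entries[key]; // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries[key] = { value, expiresAt: this.now() + this.ttlMs };
  }
}
```

A route handler could call `cache.get(cacheKey("summary", text))` before reaching for the AI API, and call `set` only after a successful response, so that failed calls are not cached.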

What's next for SmartStudy

  • Study Hubs: Virtual rooms where Cameroonian students can join live quizzes and share notes.
  • Gamification: Badges, streaks, and leaderboards—turn study sessions into friendly competitions.
  • Shared Annotations: Real‑time collaborative highlighting and note‑taking among classmates.
  • Offline Mobile App: Download chapters, listen and quiz offline, plus timely push reminders when exams approach.
  • Real‑Time Text Highlighting: Implement real‑time text highlighting in the reader.
  • Improved UI/UX: Improve the UI and UX on mobile and medium‑sized devices.
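For the real‑time highlighting item, one possible direction builds on the character‑index anchoring described under Challenges. The sketch below assumes the reader has an array with the start time of each character of the narrated text (for example from timing data, if the TTS response provides it, or from an estimate). That data shape and all function names are my assumptions, not existing SmartStudy code.

```typescript
// Hypothetical sketch: the timing data shape is an assumption.

/** Index of the last character whose start time is <= currentTime, or -1 if none. */
function charIndexAt(startTimes: number[], currentTime: number): number {
  let lo = 0;
  let hi = startTimes.length - 1;
  let found = -1;
  // Binary search: startTimes is assumed to be sorted in ascending order.
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (startTimes[mid] <= currentTime) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

/** Expand a character index to the [start, end) range of the word containing it. */
function wordRangeAt(text: string, index: number): [number, number] | null {
  if (index < 0 || index >= text.length || /\s/.test(text.charAt(index))) return null;
  let start = index;
  let end = index + 1;
  while (start > 0 && !/\s/.test(text.charAt(start - 1))) start--;
  while (end < text.length && !/\s/.test(text.charAt(end))) end++;
  return [start, end];
}

/** Run `fn` at most once per `waitMs`; calls inside the window are dropped. */
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
  now: () => number = Date.now
): (...args: A) => void {
  let last = -Infinity;
  return (...args: A): void => {
    const t = now();
    if (t - last >= waitMs) {
      last = t;
      fn(...args);
    }
  };
}
```

On each audio time update, the reader would look up `charIndexAt`, expand it with `wordRangeAt`, and highlight that range. Scroll handling would go through `throttle`, so the highlight position depends on the text itself and not on the scroll offset.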

SmartStudy is just the beginning. My goal? A vibrant, AI‑powered learning ecosystem that makes studying feel less like work and more like hanging out with your smartest friend. Stay tuned!

Built With

  • elevenlabs
  • google-gemini
  • nextjs
  • prisma
  • tanstack-query