Inspiration
During my final exams, I found myself constantly juggling PDF textbooks, handwritten notes, and random flashcard apps. I’d scroll through endless pages just to find that one definition, only to lose track of where I was. I wished there were a way to turn my study material into something smarter, something that didn’t just sit there like a static document but actually helped me learn.
That’s when the idea struck. What if AI could not only read my textbook but also break it down into bite-sized summaries, generate flashcards instantly, quiz me on the spot, and even let me semantically search across the entire content?
Gemini Scholar was born out of that frustration, a personal itch that turned into a broader vision. And when I came across Data Hackfest, with its focus on data-driven solutions and intelligent transformation, it felt like the perfect place to build and showcase this idea.
What it does
Upload PDFs (lecture notes, research papers, books)
Get detailed, structured summaries
Automatically generate flashcards from the content
Perform semantic question-answering (e.g., “What is machine learning?”)
How I built it
Frontend: Streamlit with custom HTML/CSS and animation enhancements for a sleek, layered UX.
Backend: Gemini Pro (via the Gemini API) for summarization, flashcard generation, and MCQ creation, with Groq running llama3-70b as a fallback for robustness and speed
PDF parsing: PyMuPDF (fitz) to extract and clean structured text
Hosting: Deployed on Streamlit Cloud
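The Gemini-first, Groq-second fallback can be sketched roughly as below. This is a minimal illustration, not the project's actual code: `generate_with_fallback`, `AllProvidersFailed`, `fake_gemini`, and `fake_groq` are hypothetical names, and the two fake functions stand in for the real client calls so the example runs without API keys.

```python
from __future__ import annotations

from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider errored or returned nothing."""


def generate_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, text) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            text = call(prompt)
        except Exception as exc:  # any provider error moves on to the next one
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty response")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for the real Gemini and Groq calls.
def fake_gemini(prompt: str) -> str:
    raise TimeoutError("simulated rate limit")


def fake_groq(prompt: str) -> str:
    return f"Summary of: {prompt}"


if __name__ == "__main__":
    providers = [("gemini", fake_gemini), ("groq", fake_groq)]
    print(generate_with_fallback("Chapter 1", providers))
```

Passing the providers in as plain callables keeps the ordering logic independent of either SDK, so swapping the primary model only means changing the list.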
Challenges I ran into
Handling PDFs with inconsistent formatting or scanned images
Keeping the Gemini output within context and avoiding hallucinations
Making sure all UI layers are preserved and don’t overlap destructively
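As an illustration of the PDF challenge, here is one way the extraction and cleanup step might look. It is a sketch under assumptions: the helper names (`normalize_pdf_text`, `looks_scanned`, `extract_pages`) and the 20-character threshold are made up for this example, and the hyphen rule will also merge genuine compounds such as "well-\nknown".

```python
from __future__ import annotations

import re


def normalize_pdf_text(raw: str) -> str:
    """Clean up text extracted from a single PDF page."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "learn-\ning" -> "learning".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Blank lines separate paragraphs; single newlines are only line wraps.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


def looks_scanned(page_text: str, min_chars: int = 20) -> bool:
    """Heuristic: a page with almost no extractable text is probably an image."""
    return len(page_text.strip()) < min_chars


def extract_pages(path: str) -> list[str]:
    """Return normalized text for every page of the PDF at `path`."""
    import fitz  # PyMuPDF; imported lazily so the helpers above work without it

    with fitz.open(path) as doc:
        return [normalize_pdf_text(page.get_text()) for page in doc]
```

Pages flagged by `looks_scanned` could be skipped or reported to the user, since plain text extraction returns little or nothing for them.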
Accomplishments that I'm proud of
Built a fully functional AI-powered study tool within the hackathon window
Designed a layered UI that feels modern and intuitive
Enabled interoperability between Gemini and Groq, ensuring fallback reliability
Converted boring PDFs into interactive learning modules
What I learned
How tricky PDF parsing can get, and how to normalize the extracted text
Prompt engineering for multi-format outputs like summaries, quizzes, and flashcards
How small UI/UX changes massively improve the learning experience
Using large models like llama3-70b, served through Groq, as a fallback for scalable AI solutions
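To make the prompt-engineering point concrete, here is a small sketch of asking for flashcards in a fixed JSON shape and parsing the reply defensively. The prompt wording, the `Flashcard` class, and both function names are assumptions for illustration; the actual prompts used in the project are not shown here.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass
class Flashcard:
    question: str
    answer: str


def build_flashcard_prompt(text: str, count: int = 5) -> str:
    """Ask for a strict JSON array so the reply is easy to parse."""
    return (
        f"Create {count} flashcards from the text below. "
        'Reply with only a JSON array of objects with "question" and "answer" keys. '
        "Use only facts stated in the text.\n\n"
        f"TEXT:\n{text}"
    )


def parse_flashcards(reply: str) -> list[Flashcard]:
    """Pull the JSON array out of a model reply, tolerating surrounding prose."""
    start, end = reply.find("["), reply.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(reply[start : end + 1])
    except json.JSONDecodeError:
        return []
    cards = []
    for item in items:
        # Skip entries that are malformed or missing either field.
        if isinstance(item, dict) and item.get("question") and item.get("answer"):
            cards.append(
                Flashcard(str(item["question"]).strip(), str(item["answer"]).strip())
            )
    return cards
```

Returning an empty list on malformed output gives the caller a simple signal to retry or switch to the fallback model.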
What's next for Gemini Scholar
Add interactive quizzes
Add support for scanned/image-based PDFs using OCR
Introduce audio summaries for on-the-go learning
Add collaborative study rooms where peers can share flashcards/quizzes
Integrate a spaced repetition algorithm to make flashcards more effective over time
Launch a Chrome extension to summarize any online document in one click
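For the spaced repetition idea, one simple option is a Leitner-style box system. The sketch below is only an assumption about how it could work: the interval schedule and the `review` function are hypothetical, and a different algorithm may end up being used.

```python
from __future__ import annotations

from datetime import date, timedelta

# Assumed schedule: days to wait before the next review, indexed by box.
INTERVALS_DAYS = (1, 2, 4, 8, 16)


def review(box: int, correct: bool, today: date) -> tuple[int, date]:
    """Return (new_box, next_due_date) after one review of a flashcard."""
    if correct:
        # Promote the card one box, capped at the last box.
        new_box = min(box + 1, len(INTERVALS_DAYS) - 1)
    else:
        # A miss sends the card back to the first box.
        new_box = 0
    return new_box, today + timedelta(days=INTERVALS_DAYS[new_box])
```

Cards answered correctly are shown less and less often, while missed cards return to daily review.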
