Inspiration

As students, we are often overwhelmed by the sheer volume of information—from handwritten lecture notes and 50-page PDFs to hour-long YouTube tutorials. While AI tools exist, they often "give the answer" rather than "teaching the concept."

Inspired by the need for a more interactive and Socratic approach to learning within the JKUAT community and beyond, I built EchoLearn. I wanted a tool that doesn't just summarize content but transforms it into a personalized, interactive classroom using the power of Gemini 1.5 Flash.

What it does

EchoLearn is an AI-powered study companion that transforms static learning materials into a dynamic, interactive "Study Suite."

Multimodal Uploads: Converts messy whiteboard photos, long PDFs, or lecture recordings into structured data using Gemini 1.5 Flash.

The Smart Digest: Generates concise summaries with technical formulas rendered in LaTeX (e.g., $\Delta H = \Delta U + P\Delta V$).

Interactive Flashcards: Automatically creates a personalized deck for active recall based specifically on the uploaded content.

Socratic AI Tutor: Features a built-in mentor that guides users through tough concepts with analogies and leading questions rather than just giving answers.

Persistent Library: Uses Firebase to save sessions, allowing students to track their progress and revisit their history across devices.

The Impact: It turns passive reading into active learning, making complex subjects accessible to every student with a single click.

How we built it

EchoLearn is built on a modern full-stack architecture designed for speed and scalability:

Frontend: Built with React and Tailwind CSS, focusing on a clean "Blue & White" professional aesthetic. I used Lucide-React for iconography and Framer Motion for smooth transitions between study tabs.

Backend: A Node.js/Express server acts as the secure gateway, handling file uploads via Multer and communicating with the Google Generative AI SDK.

AI Engine: We utilized Gemini 1.5 Flash for its massive context window and multimodal capabilities. It processes PDFs and images (like whiteboard photos) simultaneously.

Database & Auth: Firebase handles Google Authentication and stores processed study sessions in Firestore, allowing students to access their history anytime.
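
To make the upload-to-model handoff above more concrete, here is a minimal sketch of how uploaded files might be turned into a multimodal request. It is not the project's actual code: the `UploadedFile` type, the `SUPPORTED_TYPES` list, and the `buildParts` helper are hypothetical names, and the `inlineData` shape is modelled on the inline-data part format of the Google Generative AI SDK.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical shape of a file after the upload middleware has read it into memory.
interface UploadedFile {
  mimeType: string; // e.g. "application/pdf" or "image/png"
  bytes: Uint8Array; // raw file contents
}

// Part shapes modelled on the SDK's text and inline-data parts (assumption).
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

// Assumed allow-list; the real project may accept more types.
const SUPPORTED_TYPES = new Set(["application/pdf", "image/png", "image/jpeg"]);

// Combine one text prompt with any number of files into a single parts array,
// so PDFs and photos travel in the same request.
function buildParts(prompt: string, files: UploadedFile[]): Part[] {
  const parts: Part[] = [{ text: prompt }];
  for (const file of files) {
    if (!SUPPORTED_TYPES.has(file.mimeType)) {
      throw new Error(`Unsupported file type: ${file.mimeType}`);
    }
    parts.push({
      inlineData: {
        mimeType: file.mimeType,
        // Inline data is sent as a base64 string.
        data: Buffer.from(file.bytes).toString("base64"),
      },
    });
  }
  return parts;
}
```

Keeping this step as a pure function means the file validation can be tested without calling the model. In a server like the one described, the returned array would then be handed to the SDK's content-generation call.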

Challenges we ran into

Parsing Complex PDFs: Initially, converting multi-column academic papers into a coherent summary was difficult. I solved this by leveraging Gemini’s native PDF processing rather than trying to extract text manually.

Asynchronous Flow: Waiting for a 20-page PDF to be processed by an LLM can lead to timeouts. I implemented a real-time progress state in React to keep the user engaged.

State Persistence: Ensuring that a user’s "Study History" loaded instantly was a challenge. I optimized this by caching the Gemini JSON response in Firestore, so the AI doesn't have to process the same file twice.
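
The caching idea can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: keying the cache on a SHA-256 hash of the file contents is my assumption, and `SessionStore`, `InMemoryStore`, `contentKey`, and `getOrAnalyze` are hypothetical names. The in-memory store stands in for Firestore so the example runs on its own.

```typescript
import { createHash } from "node:crypto";

interface StudySession {
  summary: string;
  flashcards: { front: string; back: string }[];
}

// Minimal storage interface; a Firestore-backed class could implement the same methods.
interface SessionStore {
  get(key: string): Promise<StudySession | undefined>;
  set(key: string, value: StudySession): Promise<void>;
}

// Stand-in for the database so the sketch is runnable without any services.
class InMemoryStore implements SessionStore {
  private readonly data = new Map<string, StudySession>();
  async get(key: string): Promise<StudySession | undefined> {
    return this.data.get(key);
  }
  async set(key: string, value: StudySession): Promise<void> {
    this.data.set(key, value);
  }
}

// Identical bytes always produce the same key, regardless of file name.
function contentKey(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Return the cached session if this exact file was seen before;
// otherwise run the (slow) model call once and store its result.
async function getOrAnalyze(
  bytes: Uint8Array,
  store: SessionStore,
  analyze: (bytes: Uint8Array) => Promise<StudySession>,
): Promise<StudySession> {
  const key = contentKey(bytes);
  const cached = await store.get(key);
  if (cached !== undefined) {
    return cached;
  }
  const fresh = await analyze(bytes);
  await store.set(key, fresh);
  return fresh;
}
```

With this shape, re-uploading the same PDF skips the model call entirely, which also sidesteps the long wait described under Asynchronous Flow for repeat files.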

Accomplishments that we're proud of

Zero-Friction Multimodal Processing: We successfully implemented a pipeline where Gemini 1.5 Flash seamlessly handles everything from handwritten whiteboard photos to complex PDFs in a single interface.

Structured Intelligence: We are proud of our "JSON Bridge"—a robust Node.js middleware that forces the LLM into a strict schema. This ensures our React frontend always renders clean, interactive flashcards and summaries without formatting errors.

Mathematical Precision: By integrating LaTeX support, we ensured that technical students (Engineering, Math, Physics) get accurate, high-fidelity formulas rather than broken text strings.

Full-Stack Persistence: We didn't just build a demo; we built a platform. Integrating Firebase Auth and Firestore means a student’s hard work is saved, creating a persistent "Knowledge Library" that grows with them.

The "Tutor" Persona: We successfully iterated on our system instructions to move beyond a "chatbot" feel. EchoLearn acts as a Socratic mentor, focusing on how to think rather than just what the answer is.

Clean UX/UI: We take pride in the "Blue & White" aesthetic. Using Tailwind CSS, we created a professional, accessible interface that minimizes cognitive load and keeps the focus on learning.
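
As an illustration of the "JSON Bridge" idea mentioned under Structured Intelligence, the sketch below validates raw model output against a fixed shape before it reaches the UI. The schema here (a `summary` string plus a `flashcards` array of `front`/`back` pairs) and the names `StudySuite` and `parseStudySuite` are assumptions made for the example; the project's real schema is not shown in this writeup.

```typescript
interface Flashcard {
  front: string;
  back: string;
}

interface StudySuite {
  summary: string;
  flashcards: Flashcard[];
}

// True for plain objects, false for null, arrays, and primitives.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parse the model's text output and reject anything that does not match the
// expected shape, so the frontend never receives a half-formed payload.
function parseStudySuite(raw: string): StudySuite {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("Model output is not valid JSON");
  }
  if (!isRecord(parsed)) {
    throw new Error("Model output must be a JSON object");
  }
  const { summary, flashcards } = parsed;
  if (typeof summary !== "string" || summary.trim() === "") {
    throw new Error("summary must be a non-empty string");
  }
  if (!Array.isArray(flashcards)) {
    throw new Error("flashcards must be an array");
  }
  const cards = flashcards.map((card: unknown, index: number): Flashcard => {
    if (!isRecord(card)) {
      throw new Error(`flashcards[${index}] must be an object`);
    }
    const { front, back } = card;
    if (typeof front !== "string" || typeof back !== "string") {
      throw new Error(`flashcards[${index}] needs string front and back`);
    }
    // Copy only the known fields so extra keys from the model are dropped.
    return { front, back };
  });
  return { summary, flashcards: cards };
}
```

A middleware could call this on every model response and either retry or return an error when it throws, which keeps rendering logic in the frontend free of defensive checks.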

What we learned

Through this project, I deepened my understanding of Multimodal Prompting. I learned that Gemini performs significantly better when given structural constraints. By utilizing JSON Mode, I was able to bridge the gap between unstructured AI "chat" and a structured React UI.

I also gained hands-on experience in managing high-resolution file uploads between a Node.js server and Cloud Storage.

What's next for EchoLearn

In the future, I plan to add:

Collaborative Study Rooms: Using Firebase Realtime Database to let JKUAT students study the same material together.

Voice Interaction: Integrating Web Speech API so students can "talk" to their notes.

Local Language Support: Fine-tuning the tutor to explain complex concepts in Swahili to increase accessibility in the region.
