TextAID — Project Writeup
Inspiration
We've all been there — staring at a 50-page textbook chapter the night before an exam, reading the same paragraph three times and still not retaining a word. Traditional reading is passive, and passive learning doesn't stick. We were inspired by the science of active recall and spaced repetition, and wanted to build a tool that transforms any document into an engaging, interactive learning session. The idea was simple: what if your textbook could talk back?
What it does
TextAID turns any PDF — a textbook, research paper, or story — into a fully interactive study experience. Upload your document and TextAID will read it aloud using lifelike ElevenLabs voices, so you can follow along hands-free. As you listen, it reinforces retention through active recall quizzes generated in real time from the content. It also produces smart summaries so the key ideas never slip through the cracks. And whenever you're confused or want to go deeper, an interactive AI tutor is ready to answer questions — fully grounded in your specific document.
How we built it
The frontend is built with React, providing a clean and responsive interface for uploading documents, controlling playback, and interacting with the AI tutor. The backend runs on Node.js and Express, handling PDF parsing, API communication, and session logic. We integrated ElevenLabs' API to convert extracted text into natural, expressive speech. For the AI features (quizzes, summaries, and the conversational tutor), we used the Gemini API, with prompts engineered so that responses come back in a consistent, parseable format.
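To illustrate what prompting for well-formatted data can involve, here is a minimal sketch of a quiz prompt and a defensive parser for the model's reply. The `QuizQuestion` shape, the prompt wording, and both function names are assumptions made for illustration; the writeup does not specify the real schema, and no Gemini call is shown.

```typescript
// Hypothetical shape for one multiple-choice question (not taken from the project).
interface QuizQuestion {
  question: string;
  choices: string[];
  answerIndex: number;
}

// Builds a prompt that asks the model to reply with JSON only (assumed wording).
function buildQuizPrompt(passage: string, count: number): string {
  return [
    `Write ${count} multiple-choice questions about the passage below.`,
    "Reply with a JSON array only, with no commentary.",
    'Each item needs the keys "question" (string), "choices" (array of strings),',
    'and "answerIndex" (zero-based number).',
    "",
    "PASSAGE:",
    passage,
  ].join("\n");
}

// Type guard: true only when the item matches the QuizQuestion shape.
function isQuizQuestion(item: unknown): item is QuizQuestion {
  if (typeof item !== "object" || item === null) return false;
  const candidate = item as { question?: unknown; choices?: unknown; answerIndex?: unknown };
  const { question, choices, answerIndex } = candidate;
  return (
    typeof question === "string" &&
    Array.isArray(choices) &&
    choices.every((c) => typeof c === "string") &&
    typeof answerIndex === "number" &&
    Math.floor(answerIndex) === answerIndex &&
    answerIndex >= 0 &&
    answerIndex < choices.length
  );
}

// Parses a model reply, tolerating a Markdown code fence around the JSON,
// and keeps only the items that pass validation.
function parseQuizReply(reply: string): QuizQuestion[] {
  const fence = ["`", "`", "`"].join("");
  let text = reply.trim();
  if (text.indexOf(fence) === 0) {
    // Drop the opening fence line and, if present, the closing fence.
    const firstNewline = text.indexOf("\n");
    const lastFence = text.lastIndexOf(fence);
    const end = lastFence > firstNewline ? lastFence : text.length;
    text = text.slice(firstNewline + 1, end).trim();
  }
  let data: unknown;
  try {
    data = JSON.parse(text);
  } catch {
    return []; // Unparseable reply: the caller can retry the request.
  }
  return Array.isArray(data) ? data.filter(isQuizQuestion) : [];
}
```

Validating every item before it reaches the UI means a malformed reply degrades into "no questions yet" rather than a crash in the middle of playback.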
Challenges we ran into
Our biggest challenge was the UI. We started with ambitious UX plans, but they outran the time we had, so we cut our losses and shipped a simpler interface. Chunking PDF content intelligently also took some brainpower. PDFs come in all shapes and sizes, some with clean text and others with inconsistent formatting, so we had to build preprocessing logic to extract and clean the text reliably before passing it to the TTS or AI pipeline. Keeping the AI tutor grounded strictly in the document's content (rather than hallucinating outside knowledge) also required careful prompt engineering. Synchronizing quiz generation with the right moments during audio playback was another tricky piece to get right.
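The cleaning and chunking step described above can be sketched roughly as follows. This is an illustrative approximation rather than the project's actual preprocessing: the heuristics, the function names `cleanExtractedText` and `chunkBySentence`, and the character budget are all assumptions.

```typescript
// Cleans raw text extracted from a PDF using a few assumed heuristics.
function cleanExtractedText(raw: string): string {
  const normalized = raw
    .replace(/\r\n?/g, "\n") // normalize line endings
    .replace(/([A-Za-z])-\n([A-Za-z])/g, "$1$2") // rejoin words hyphenated across a line break
    .replace(/[ \t]*\n[ \t]*/g, "\n") // strip spaces around line breaks
    .replace(/[ \t]+/g, " "); // collapse runs of spaces and tabs
  // Blank lines separate paragraphs; single line breaks are treated as hard wraps.
  return normalized
    .split(/\n{2,}/)
    .map((paragraph) => paragraph.replace(/\n/g, " ").trim())
    .filter((paragraph) => paragraph.length > 0)
    .join("\n\n");
}

// Groups whole sentences into chunks of at most maxChars characters.
// A single sentence longer than maxChars becomes its own oversized chunk.
// The sentence split is naive: it also breaks on decimals and abbreviations.
function chunkBySentence(text: string, maxChars: number): string[] {
  const sentences = text.match(/[^.!?]+(?:[.!?]+|$)/g) || [];
  const chunks: string[] = [];
  let current = "";
  for (const part of sentences) {
    const sentence = part.trim();
    if (sentence.length === 0) continue;
    if (current.length > 0 && current.length + 1 + sentence.length > maxChars) {
      chunks.push(current);
      current = sentence;
    } else {
      current = current.length > 0 ? current + " " + sentence : sentence;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Splitting on sentence boundaries rather than at a fixed character offset keeps each chunk readable when spoken aloud and gives the quiz generator complete thoughts to work with.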
Accomplishments that we're proud of
We're proud of building a genuinely end-to-end learning tool in a short amount of time. The combination of TTS, active recall quizzes, and a document-aware AI tutor working together in one seamless interface felt like a real breakthrough moment. Hearing a dense research paper read back naturally and then immediately being quizzed on it — and actually retaining the content — validated the whole concept for us.
What we learned
We learned a lot about the challenges of working with unstructured document data and how much preprocessing matters before any AI or TTS model touches it. We also deepened our understanding of prompt engineering — specifically how to constrain an LLM to a specific knowledge domain to make it a trustworthy tutor rather than a confident guesser. On the product side, we learned that the best learning tools don't just deliver content — they create friction in the right places to force engagement.
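One common way to constrain a tutor to a document is to place only relevant excerpts in the prompt and instruct the model to refuse when they do not contain the answer. The sketch below assumes that approach; the word-overlap ranking is a deliberately crude stand-in for real retrieval, and the function names and prompt wording are hypothetical rather than taken from the project.

```typescript
// Counts how many distinct question words (4+ letters) appear in the excerpt.
function overlapScore(question: string, excerpt: string): number {
  const questionWords = question.toLowerCase().match(/[a-z0-9]+/g) || [];
  const excerptWords = excerpt.toLowerCase().match(/[a-z0-9]+/g) || [];
  const seen: string[] = [];
  let score = 0;
  for (const word of questionWords) {
    if (word.length < 4 || seen.indexOf(word) !== -1) continue; // skip filler and repeats
    seen.push(word);
    if (excerptWords.indexOf(word) !== -1) score += 1;
  }
  return score;
}

// Builds a prompt containing only the best-matching excerpts plus a refusal rule.
function buildGroundedPrompt(question: string, excerpts: string[], maxExcerpts: number): string {
  const ranked = excerpts
    .map((text, index) => ({ text, index, score: overlapScore(question, text) }))
    .sort((a, b) => b.score - a.score || a.index - b.index) // ties keep document order
    .slice(0, maxExcerpts);
  const context = ranked.map((entry, i) => `[${i + 1}] ${entry.text}`).join("\n");
  return [
    "You are a tutor. Answer using ONLY the numbered excerpts below.",
    'If the excerpts do not contain the answer, reply exactly: "I can\'t find that in this document."',
    "",
    "EXCERPTS:",
    context,
    "",
    `QUESTION: ${question}`,
  ].join("\n");
}
```

Giving the model an explicit fallback sentence makes ungrounded answers easier to detect, because the application can check for that exact string.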
What's next for TextAID
Our main priority right now is to host the project. We also want to improve the platform's UX and add features such as support for scanned documents and website links, a large selection of readings, content analysis, and maybe animation. On the learning side, we're planning to add flashcards and spaced repetition so that TextAID can resurface quiz questions at the optimal time for long-term retention. We'd also love to add better progress tracking across documents, and the ability to highlight and annotate sections during playback. Ultimately, we envision TextAID becoming a universal learning layer that sits on top of any content you consume.
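As a rough idea of how the planned spaced repetition could work, the sketch below uses a simple Leitner-style scheme in which each correct answer moves a question to a box with a longer review interval and a miss sends it back to the first box. The interval values, the `ReviewCard` shape, and the function names are assumptions; the feature is not built yet and may use a different algorithm.

```typescript
// Hypothetical record for one quiz question under review.
interface ReviewCard {
  questionId: string;
  box: number; // index into INTERVAL_DAYS
  dueAt: number; // timestamp in milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;
const INTERVAL_DAYS = [1, 2, 4, 8, 16]; // assumed doubling intervals

// Returns the updated card after a review at time `now`.
function review(card: ReviewCard, correct: boolean, now: number): ReviewCard {
  const box = correct ? Math.min(card.box + 1, INTERVAL_DAYS.length - 1) : 0;
  return { questionId: card.questionId, box, dueAt: now + INTERVAL_DAYS[box] * DAY_MS };
}

// Lists the cards that are due at time `now`, most overdue first.
function dueCards(cards: ReviewCard[], now: number): ReviewCard[] {
  return cards.filter((card) => card.dueAt <= now).sort((a, b) => a.dueAt - b.dueAt);
}
```

Passing `now` in as a parameter instead of calling `Date.now()` inside the functions keeps the scheduling logic deterministic and easy to test.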