The Inspiration: Mitigating Cognitive Overload
Reading shouldn’t be an act of endurance. As students who use the UF Disability Resource Center, we know firsthand the barriers that inaccessible text creates in academic and medical settings. High-quality accessibility features are often locked behind paywalls or buried in complex software. We built ReadAble to shift the burden of adaptation from the user to the interface, ensuring that everyone, regardless of diagnosis or financial situation, can understand and engage with information.
The Solution: A Multi-Sensory Co-Pilot
ReadAble converts complex documents and PDFs into dyslexia-friendly, plain-language summaries through three core pillars:
Bimodal Learning: Our system implements synchronized audio-visual reinforcement. By integrating the ElevenLabs API with a custom timing algorithm, the app highlights words in yellow exactly as they are spoken, bridging the gap between seeing a word and processing its meaning.
Adaptive Complexity: Using Google Gemini 1.5 Flash, we perform text simplification. Users can select reading levels from Grade 5 to Adult, instantly re-rendering text to match their specific cognitive needs.
Universal Design: We’ve ensured that language barriers and document formats never stand in the way of knowledge by offering Global Translation and a robust PDF Upload feature.
The Blueprint: System Architecture
We built a full-stack application designed to handle generative media and persistent user data efficiently.
Framework and Frontend
The core application is built with Next.js, utilizing the App Router for a unified frontend and backend experience. The UI was styled with Tailwind CSS, featuring high-contrast modes and dynamic font scaling to ensure accessibility from the ground up.
AI and Data Storage
We integrated Gemini 1.5 Flash for generative text simplification, using JSON Mode to ensure structured data output for study quizzes and medical glossaries. For data persistence, we used MongoDB Atlas with Mongoose to allow users to save summaries to a personalized library.
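As a rough illustration of the JSON Mode setup described above, a request body for Gemini's `generateContent` endpoint can set `responseMimeType` so the model must emit valid JSON. The `buildSimplifyRequest` helper, the reading-level map, and the requested key names are our own illustrative assumptions, not the project's exact code.

```typescript
// Illustrative sketch: build a Gemini generateContent request body that asks
// for structured JSON output. Helper name, level map, and schema keys are
// hypothetical; only generationConfig.responseMimeType is the real API knob.
type ReadingLevel = "grade5" | "grade8" | "adult";

const LEVEL_PROMPTS: Record<ReadingLevel, string> = {
  grade5: "Rewrite for a 5th-grade reading level.",
  grade8: "Rewrite for an 8th-grade reading level.",
  adult: "Rewrite in plain language for adult readers.",
};

function buildSimplifyRequest(text: string, level: ReadingLevel) {
  return {
    contents: [
      {
        role: "user",
        parts: [
          {
            text:
              `${LEVEL_PROMPTS[level]} Return JSON with keys ` +
              `"summary" (string), "glossary" (array of {term, definition}), ` +
              `and "quiz" (array of {question, choices, answer}).\n\n${text}`,
          },
        ],
      },
    ],
    generationConfig: {
      // JSON Mode: constrains the model to emit syntactically valid JSON.
      responseMimeType: "application/json",
    },
  };
}
```

The same structured response can then feed both the quiz UI and the glossary without extra parsing passes.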
Orchestration
Custom React hooks manage the asynchronous state between our translation APIs, the speech engine, and the visual UI. Cookie-based sessions let registered users access private data, while guests can use the core features immediately.
The Hurdles: Syncing and Precision
Engineering Bimodal Learning: Achieving high-precision synchronization between generative audio and the UI was a major hurdle. Since we use external APIs for both text and speech, we developed a custom timing algorithm that maps audio duration to word count to ensure the "follow-along" highlighting remains accurate.
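The duration-to-word mapping described above can be sketched as follows. This is a simplified reconstruction, assuming each word's share of the clip is weighted by its character count; the function name is ours and the real algorithm may differ.

```typescript
interface WordTiming {
  word: string;
  start: number; // seconds into the audio clip
  end: number;
}

// Distribute a clip's total duration across its words, weighting each word
// by character count so longer words hold the highlight proportionally longer.
// A simplified reconstruction of the mapping described in the text.
function computeWordTimings(words: string[], totalDuration: number): WordTiming[] {
  const totalChars = words.reduce((sum, w) => sum + w.length, 0) || 1;
  const timings: WordTiming[] = [];
  let cursor = 0;
  for (const word of words) {
    const slice = (word.length / totalChars) * totalDuration;
    timings.push({ word, start: cursor, end: cursor + slice });
    cursor += slice;
  }
  return timings;
}
```

The UI can then poll the audio element's `currentTime` and highlight whichever word's `[start, end)` window contains it.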
Structured Data Integrity: Ensuring that Gemini consistently returned valid JSON for our complex schemas (including quizzes and glossaries) required significant prompt iteration and robust error-handling logic.
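The error-handling logic mentioned above can be sketched as a type guard that rejects malformed model output before it reaches the UI, letting the caller re-prompt on failure. The quiz shape and function name are illustrative assumptions.

```typescript
interface QuizQuestion {
  question: string;
  choices: string[];
  answer: string;
}

// Defensive parse of model output: JSON.parse can throw, and even valid JSON
// may not match the expected schema, so both failures return null and the
// caller can retry the request. Shape and names are illustrative.
function parseQuiz(raw: string): QuizQuestion[] | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (!Array.isArray(data)) return null;
  const valid = data.every((q): q is QuizQuestion => {
    if (typeof q !== "object" || q === null) return false;
    const item = q as QuizQuestion;
    return (
      typeof item.question === "string" &&
      Array.isArray(item.choices) &&
      item.choices.every((c) => typeof c === "string") &&
      typeof item.answer === "string" &&
      item.choices.includes(item.answer) // answer must be one of the choices
    );
  });
  return valid ? (data as QuizQuestion[]) : null;
}
```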
Zero to Atlas: Learning and deploying MongoDB Atlas under a 36-hour deadline was a steep climb. Transitioning to a NoSQL mindset was a complete shift in how we thought about data, but it ultimately became the backbone of our library persistence.
Accomplishments We’re Proud Of
Context-Aware Intelligence: Successfully implementing Dynamic Prompt Engineering that shifts the AI’s tone and output based on whether the user is a student or a medical patient.
Full-Stack Inclusion: Building a working pipeline that transforms raw PDFs into multi-sensory, multilingual, and multi-level experiences in seconds.
Accessible Multi-Sensory Design: Delivering a polished, high-contrast interface that combines visual, auditory, and interactive elements within a high-pressure hackathon timeframe.
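The Dynamic Prompt Engineering described above might look something like the following sketch, where a persona switch changes the tone and vocabulary guidance sent to the model. The persona names and instruction wording are illustrative assumptions, not our production prompts.

```typescript
type Persona = "student" | "patient";

// Hypothetical persona table: shifts the AI's tone depending on whether the
// reader is studying coursework or decoding medical paperwork.
const PERSONA_INSTRUCTIONS: Record<Persona, string> = {
  student:
    "You are a supportive tutor. Keep key terminology, define each term on " +
    "first use, and end with two short study questions.",
  patient:
    "You are a calm medical explainer. Replace jargon with everyday words " +
    "and flag anything the reader should ask their doctor about.",
};

function buildPrompt(persona: Persona, text: string): string {
  return `${PERSONA_INSTRUCTIONS[persona]}\n\nSimplify the following:\n${text}`;
}
```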
The Takeaway
We learned that accessibility is not just a "feature"; it requires a fundamental shift in how you structure data and user flows. Using MongoDB taught us the power of schema flexibility, allowing us to evolve user profiles in real time. We also realized that age and context affect everything, from phonological awareness to professional vocabulary; our app is designed to grow alongside the user.
The Horizon for ReadAble
Image-to-Text (OCR): Integrating OCR capabilities so users can snap a photo of a physical textbook or brochure and simplify it instantly.
Reading Analytics: Developing a dashboard to help users and educators track comprehension growth and identify specific areas where a reader might need extra support.
Shared Learning: Creating a feature where students at the DRC can share simplified study guides, fostering a collaborative support network.
Built With
- css
- elevenlabs
- gemini
- html
- mongodb
- next.js
- pdf2json
- typescript