Inspiration

The seed for Read2Me came from a simple observation: the world is overwhelmingly designed for people who can comfortably read long blocks of text. Whether it’s university course packs, dense research reports, government forms, or policy PDFs, much of society’s most important information is locked away in text-first formats. Yet a huge number of people are excluded by this: individuals with low vision, dyslexia, ADHD, traumatic brain injury, or even something as common as eye strain after long hours of screen use. Beyond disability, there are situational barriers too—commuters who cannot look at a screen, language learners who absorb information more easily through listening, or multitaskers who need text read aloud while they do other tasks.

We also noticed that existing text-to-speech solutions often fail on three counts: (1) Accessibility barriers: they are locked behind paywalls, require log-ins, or involve clunky setups. (2) Privacy concerns: most tools force users to upload documents to third-party servers, raising serious issues when the documents contain sensitive material like medical records, legal contracts, or student essays. (3) Poor PDF handling: many readers work fine with raw text but break down when faced with PDFs, which are one of the most common document types in education, government, and business.

Our inspiration was to build something that removes all three barriers: a zero-setup, entirely client-side, privacy-preserving tool that works not only with copy-pasted text but also directly with PDFs. We wanted it to be fast enough for a hackathon demo yet robust enough to be useful for everyday life. The design process was anchored in inclusive design and the WCAG 2.2 accessibility guidelines, with a commitment to inclusive defaults: keyboard controls, clear feedback, sentence highlighting, and full customizability of speed, pitch, and voice.

What it does

Read2Me is an accessibility-first web application designed to make reading more inclusive. Its key features include:

- Text and PDF reading: Users can paste text directly into the app or upload a PDF. The app extracts text client-side and reads it aloud in real time.
- Customizable narration: Playback can be tailored through adjustments to voice, speed, and pitch, supporting a range of preferences and needs.
- Sentence-level highlighting: As text is read, the current sentence is visually highlighted, helping users follow along, reinforcing comprehension, and improving focus for those with attention-related challenges.
- Accessible controls: Play, Pause, Resume, and Stop functions are mapped to intuitive keyboard shortcuts (Space/S/R), with on-screen buttons designed for high contrast and clarity.
- Screen reader integration: An aria-live region provides real-time announcements (“Paused”, “Reading sentence 3 of 12”), ensuring that users relying on assistive technology get clear, structured feedback.
- Privacy-preserving architecture: All processing happens entirely in the browser—no uploads, no servers, no accounts—so user data never leaves the device.
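As a rough sketch of how the progress announcements might be produced (the helper name here is illustrative, not the app's actual code):

```typescript
// Hypothetical helper: format the progress message announced to screen readers.
// sentenceIndex is zero-based; the announcement is one-based for humans.
function formatProgress(sentenceIndex: number, total: number): string {
  return `Reading sentence ${sentenceIndex + 1} of ${total}`;
}

// In the browser, the message would be written into a polite live region, e.g.:
//   <div aria-live="polite" id="announcer"></div>
//   document.getElementById("announcer")!.textContent = formatProgress(i, n);
```

Writing the string into an `aria-live="polite"` region lets assistive technology announce progress without interrupting the user mid-task.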

Who benefits: This tool is designed for a wide audience. It directly supports people with visual impairments, dyslexia, or ADHD. It also benefits language learners who want to hear pronunciation, professionals who prefer to listen to documents while commuting, and students who want to absorb material through multiple modalities. By making digital text multimodal, Read2Me reduces barriers and opens access to information in a way that benefits both disabled and non-disabled users.

How we built it

Stack & Architecture

- React + Vite + TypeScript provided a modern, modular framework that allowed us to rapidly prototype while keeping the codebase clean and maintainable.
- Web Speech API powered the text-to-speech functionality, enabling us to offer adjustable rate, pitch, and voice options, leveraging whatever voices are available on the user’s system.
- pdf.js (pdfjs-dist) handled client-side PDF text extraction, ensuring privacy and reliability without requiring server-side processing.
- Tailwind CSS gave us fast, accessible styling with built-in high-contrast design patterns, ensuring visual clarity.
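To illustrate how narration settings map onto the Web Speech API (the clamping helper is our own sketch, not the project's exact code): the API accepts a rate in [0.1, 10] and a pitch in [0, 2], so user input is clamped before being applied to an utterance.

```typescript
interface NarrationOptions {
  rate: number;  // 1 = normal speed
  pitch: number; // 1 = default pitch
}

// Keep user-chosen values inside the ranges the Web Speech API defines.
function clampNarration(opts: NarrationOptions): NarrationOptions {
  return {
    rate: Math.min(10, Math.max(0.1, opts.rate)),
    pitch: Math.min(2, Math.max(0, opts.pitch)),
  };
}

// In the browser, the clamped options are applied to an utterance:
//   const u = new SpeechSynthesisUtterance(text);
//   const { rate, pitch } = clampNarration(userOptions);
//   u.rate = rate;
//   u.pitch = pitch;
//   u.voice = speechSynthesis.getVoices().find(v => v.name === chosenName) ?? null;
//   speechSynthesis.speak(u);
```

Note that some engines quietly ignore out-of-range or even in-range values, which is part of the cross-browser inconsistency discussed under Challenges.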

Key Implementation Choices

- Sentence segmentation: Instead of reading paragraphs in one chunk, we split text into individual sentences using a regex-based approach. This created a more natural rhythm, made pausing/resuming more intuitive, and allowed us to highlight sentences in sync with narration.
- Playback queue: Each sentence was queued into the speech synthesizer sequentially. We used onend callbacks to trigger the next utterance, ensuring smooth transitions and reliable state updates.
- Accessibility considerations: We implemented keyboard shortcuts for all major functions, added screen reader announcements via aria-live, and visually emphasized the currently spoken sentence. These micro-level details ensured the app was inclusive, not just functional.
- Performance & privacy: Everything runs locally. We avoided backend infrastructure altogether, making the tool lightweight, faster to deploy, and inherently private. Dynamic imports minimized initial load time and ensured pdf.js workers loaded correctly across environments.
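The segmentation and queueing described above can be sketched roughly as follows (a simplified version; the project's actual regex and state handling may differ):

```typescript
// Split text into sentences: break after ., !, or ? followed by whitespace.
// Known limitation: abbreviations such as "e.g." also match this pattern.
function splitSentences(text: string): string[] {
  return text
    .split(/(?<=[.!?])\s+/)
    .map(s => s.trim())
    .filter(s => s.length > 0);
}

// Playback queue: speak sentences one at a time, advancing when `onend` fires.
// `speakOne` stands in for creating a SpeechSynthesisUtterance and calling
// speechSynthesis.speak(u) with u.onend wired to the callback.
function playQueue(
  sentences: string[],
  speakOne: (sentence: string, onend: () => void) => void,
  onHighlight: (index: number) => void,
): void {
  let i = 0;
  const next = () => {
    if (i >= sentences.length) return;
    onHighlight(i); // keep the highlighted sentence in sync with narration
    speakOne(sentences[i], () => {
      i += 1;
      next();
    });
  };
  next();
}
```

Driving the queue from `onend` rather than a timer is what keeps highlighting aligned with narration even when voices speak at different real-world speeds.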

Challenges we ran into

- Browser inconsistencies in voices: The Web Speech API is implemented differently across browsers and operating systems. Some voices ignore pitch or speed settings, while others require explicit user gestures to start audio.
- Autoplay policies: Modern browsers block speech without interaction to prevent spam. We had to design around this by guiding users to click or press Space before playback begins.
- pdf.js worker configuration: Getting pdf.js to work seamlessly with Vite required careful configuration, including fallback to a CDN-hosted worker to avoid bundler issues.
- Sentence splitting edge cases: Abbreviations (“e.g.”), lists, and non-Latin scripts sometimes broke the regex-based sentence segmentation. We tuned the logic to be forgiving, though it remains an area for refinement.
- Scanned PDFs: PDFs without a text layer (i.e., image-only scans) returned little or no text. Without OCR, this limits usability, but we scoped OCR integration as a next-phase feature.
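For reference, a configuration sketch of the pdf.js worker setup and client-side extraction (the worker path is illustrative and should be pinned to the installed pdfjs-dist version; a version-matched CDN copy is the fallback when the bundler mishandles the worker file):

```typescript
import * as pdfjsLib from "pdfjs-dist";

// Tell pdf.js where its worker lives. With Vite, resolving the packaged
// worker via import.meta.url usually works; a CDN-hosted worker matching
// the installed pdfjs-dist version is a fallback for bundler issues.
pdfjsLib.GlobalWorkerOptions.workerSrc = new URL(
  "pdfjs-dist/build/pdf.worker.min.mjs",
  import.meta.url,
).toString();

// Extract the text layer of every page, entirely client-side.
async function extractPdfText(data: ArrayBuffer): Promise<string> {
  const pdf = await pdfjsLib.getDocument({ data }).promise;
  const pages: string[] = [];
  for (let p = 1; p <= pdf.numPages; p++) {
    const page = await pdf.getPage(p);
    const content = await page.getTextContent();
    // Image-only scans yield few or no items here (the OCR gap noted above).
    pages.push(
      content.items.map(item => ("str" in item ? item.str : "")).join(" "),
    );
  }
  return pages.join("\n\n");
}
```

This is a configuration sketch rather than the project's verbatim code, but `GlobalWorkerOptions.workerSrc`, `getDocument`, and `getTextContent` are the standard pdfjs-dist entry points for this workflow.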

Accomplishments that we’re proud of

- Building a fully client-side reader that respects privacy and requires no accounts or internet connection beyond the initial load.
- Implementing robust pause/resume with synchronized sentence-level highlighting, giving users a clear sense of progress and control.
- Designing with accessible defaults, such as high-contrast UI, keyboard shortcuts, screen reader announcements, and clear instructional copy.
- Packaging the entire project for one-command deployment on GitHub Codespaces or local environments, lowering the barrier for developers and testers.
- Creating something that is practical and usable beyond the hackathon—not just a prototype, but a tool that people can actually rely on.

What we learned

- Accessibility is holistic: It’s not just about adding features like text-to-speech; it’s about designing micro-interactions—focus states, predictable shortcuts, feedback loops—that make the experience reliable and comfortable for all users.
- PDFs are messy: PDF as a format varies widely, and text extraction can be inconsistent. OCR integration is not optional if we want a universally useful tool.
- Inclusive design helps everyone: Adjustable speed and pitch support disabled users, but also benefit multitaskers, second-language learners, and anyone who finds reading tiring.
- Privacy can be a feature: Keeping everything on-device reassures users and makes the product more trustworthy. This aligns with the growing global emphasis on data protection and user autonomy.
- Hackathon pace requires scope discipline: We learned to focus on one core use case—text and PDF reading—and build it really well, rather than spreading ourselves thin across too many features.

What’s next for Read2Me

- OCR integration: Add support for scanned PDFs using tesseract.js or offline machine learning models, so image-only documents can also be read aloud.
- Cloud TTS options: Provide opt-in integration with services like Azure, Polly, or Google Cloud TTS for higher-quality, more natural-sounding voices, and add audio export (MP3).
- Multilingual support: Implement language auto-detection, refine segmentation for CJK and right-to-left scripts, and improve handling of mixed-language documents.
- Advanced highlighting: Move from sentence-level to word-level highlighting, and generate captions (SRT/VTT) for study, transcription, or video use.
- Dyslexia-friendly features: Offer toggles for OpenDyslexic fonts, increased line spacing, and customizable color schemes that meet contrast guidance.
- PWA and browser extension: Enable offline use and add one-click “Read this page/PDF” integration directly in the browser.
- Educational and enterprise integration: Add features like progress tracking, note-taking, and LMS integration, while ensuring compliance with institutional privacy policies.
- Accessibility audits and user testing: Conduct systematic WCAG reviews and involve diverse users in iterative testing to ensure the product truly serves its audience.

Built With

React, Vite, TypeScript, Tailwind CSS, pdf.js (pdfjs-dist), Web Speech API
