Problem Statement:
We chose to tackle theme 2 problem 3.
Solution Overview:
Many schools lack the resources to adapt their materials to the diverse needs of students with disabilities. Hence, we developed Accessible EdTech, an all-in-one app that makes course materials accessible, with features like sign language interpretation, text-to-speech conversion, automatic closed captioning, and dyslexia-friendly PDF formatting.
Students can navigate the user-friendly interface to find the service that they need. They then upload the file that they want to convert, and the app returns the converted output.
Text-to-Speech
The Text-to-Speech (TTS) feature enables users to convert textual content from uploaded documents into spoken audio. This enhances accessibility for users with visual impairments, reading difficulties, or those who simply prefer auditory learning. It currently supports ‘.txt’, ‘.pdf’, and ‘.docx’ files; future work includes supporting other text formats, especially content hosted on existing Learning Management Systems (LMS).
How It Works
- File Upload: Users upload a document via a form on the TTS page.
- Text Extraction: The backend extracts readable text using format-specific parsers:
  - PyPDF2 for PDFs
  - python-docx for Word documents
  - UTF-8 decoding for plain text files
- Speech Generation: The extracted text is converted to speech using the gTTS (Google Text-to-Speech) library.
- Audio Playback: The generated .mp3 file is saved and rendered on the frontend with an HTML5 audio player.
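The steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual view code: the function names `extract_text` and `synthesize_speech` are hypothetical, and the third-party imports are deferred so that each parser is only required for its own file type.

```python
import io
from pathlib import Path


def extract_text(filename: str, data: bytes) -> str:
    """Return the readable text of an uploaded file, chosen by extension."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".txt":
        return data.decode("utf-8")
    if suffix == ".pdf":
        from PyPDF2 import PdfReader  # third-party: PyPDF2

        reader = PdfReader(io.BytesIO(data))
        # extract_text() may return None or "" for image-only pages
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix == ".docx":
        from docx import Document  # third-party: python-docx

        document = Document(io.BytesIO(data))
        return "\n".join(p.text for p in document.paragraphs)
    raise ValueError(f"Unsupported file type: {suffix or filename}")


def synthesize_speech(text: str, out_path: str, lang: str = "en") -> str:
    """Write `text` as spoken audio to an .mp3 file and return its path."""
    if not text.strip():
        raise ValueError("No readable text found in the document")
    from gtts import gTTS  # third-party: gTTS (requires network access)

    gTTS(text=text, lang=lang).save(out_path)
    return out_path
```

In a Django view, the bytes would typically come from the uploaded file object, and `out_path` would point somewhere under the media directory so the audio player can load the result.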
Real-time Transcription
The Real-Time Transcription (RTT) feature provides users with live conversion of spoken audio into written text. This functionality supports accessibility for individuals who are deaf or hard-of-hearing, and enhances usability in educational, professional, and collaborative environments.
Key Capabilities
- Live Speech-to-Text: Captures spoken input and transcribes it instantly.
- User-Friendly Interface: Simple and intuitive layout for starting and viewing transcriptions.
- Language Support: Compatible with multiple languages depending on the speech recognition backend.
Dyslexia-friendly PDF Formatting
The user can upload a PDF file to the converter, which returns a PDF with a simpler font (Helvetica), a larger font size (14), wider word and line spacing, and a beige background. This feature aims to make the content more readable for people with dyslexia and people who struggle with visual processing.
Text to Automated Sign Language
The user enters text into the text field, which the software first translates into gloss. This process removes filler words such as “the” and “is”, condensing the text into a compact form. Each gloss word is then mapped to an MP4 clip of the corresponding sign, and the clips are concatenated and returned to the user as a single video.
Please note that this feature is not included in our video due to connection issues. However, the relevant code is still available in the GitHub repository.
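The text-to-gloss-to-video pipeline could look something like the sketch below. The filler word list, the `<WORD>.mp4` naming scheme, and all function names are assumptions made for illustration; the project may instead draw on a stop-word list from nltk and a different clip layout. The moviepy import is deferred and tries both the 2.x and 1.x import paths.

```python
import re
from pathlib import Path

# Assumed filler list for illustration only; the real list may differ.
FILLER_WORDS = {"a", "an", "the", "is", "are", "am", "was", "were", "be", "to", "of"}


def to_gloss(text: str) -> list[str]:
    """Drop filler words and return the remaining words in upper case."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w.upper() for w in words if w not in FILLER_WORDS]


def clip_paths(gloss: list[str], clip_dir: str) -> list[Path]:
    """Map each gloss word to a per-word clip such as <clip_dir>/CAT.mp4."""
    return [Path(clip_dir) / f"{word}.mp4" for word in gloss]


def build_sign_video(text: str, clip_dir: str, out_path: str) -> str:
    """Concatenate the per-word clips into one video file.

    Assumes a clip exists for every gloss word; a missing clip or an
    empty gloss will raise an error in this sketch.
    """
    try:
        from moviepy import VideoFileClip, concatenate_videoclips  # moviepy 2.x
    except ImportError:
        from moviepy.editor import VideoFileClip, concatenate_videoclips  # 1.x

    clips = [VideoFileClip(str(p)) for p in clip_paths(to_gloss(text), clip_dir)]
    concatenate_videoclips(clips).write_videofile(out_path)
    return out_path
```

A production version would need a fallback for words without a clip, for example fingerspelling the word letter by letter.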
Technical Implementation: Tools, frameworks, libraries, APIs, and hardware used.
Architecture
The application is built using the Django 5.2.3 framework, structured around modular views and templates to support accessibility-focused educational tools. It includes features such as Real-Time Transcription, Text-to-Speech Conversion, and Dyslexia-Friendly Formatting. The app is designed to be scalable, asynchronous-ready, and user-friendly.
Core Technologies at a glance
- Backend Framework: Django
- Frontend: HTML5, Bootstrap, JavaScript
- Asynchronous Support: Django Channels + Daphne
- Speech & Text Processing: gTTS, PyPDF2, python-docx, faster-whisper
- Document Handling: PDFMiner, pdfplumber, python-docx
- Media Processing: moviepy, soundfile
- Deployment Ready: ASGI-compatible with Daphne
Feature-Specific Implementation
Text-to-Speech (TTS)
- Libraries: gTTS, PyPDF2, python-docx
- Workflow:
  - User uploads a document.
  - Text is extracted based on file type.
  - gTTS converts text to .mp3.
  - Audio is served via Django’s media system.
Real-Time Transcription (RTT)
- Frontend: Web Speech API (JavaScript)
- Backend: Django Channels (for future server-side transcription)
- Optional Server-Side Model: faster-whisper with ctranslate2 for Whisper-based transcription
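For the optional server-side path, a faster-whisper call might look like the sketch below. It transcribes a complete audio file rather than a live stream, so it only illustrates the model call. The function names, the `base` model size, and the CPU/int8 settings are assumptions, not the project's configuration.

```python
def format_segments(segments) -> str:
    """Render segments (objects with start, end, text) as transcript lines."""
    lines = []
    for seg in segments:
        lines.append(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text.strip()}")
    return "\n".join(lines)


def transcribe_file(audio_path: str, model_size: str = "base") -> str:
    """Transcribe one audio file with a Whisper model via faster-whisper."""
    from faster_whisper import WhisperModel  # third-party: faster-whisper

    # Assumed settings: CPU inference with int8 quantization.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path)
    return format_segments(segments)
```

A live version would receive short audio chunks over a WebSocket (for example through a Django Channels consumer) and send each partial transcript back to the browser.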
Dyslexia Formatting
- Libraries: pdfplumber, PyPDF2, docx
- Workflow:
  - User uploads a document.
  - Text is extracted and reformatted with dyslexia-friendly styles.
  - Output is downloadable.
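The reformatting step can be pictured as applying a fixed style to the extracted text before it is drawn onto a new page. The sketch below uses only the standard library and stops short of writing a PDF. The font and size come from the feature description, while the spacing values, the exact beige shade, the wrap width, and the names `DyslexiaStyle` and `layout_lines` are assumptions.

```python
import textwrap
from dataclasses import dataclass


@dataclass(frozen=True)
class DyslexiaStyle:
    font_name: str = "Helvetica"             # from the feature description
    font_size: int = 14                      # points, from the feature description
    line_spacing: float = 1.5                # assumed multiple of the font size
    word_spacing: float = 3.0                # assumed extra points between words
    background_rgb: tuple = (245, 245, 220)  # assumed shade of beige
    max_chars_per_line: int = 60             # assumed wrap width


def layout_lines(text: str, style: DyslexiaStyle, top_y: float = 800.0):
    """Wrap text and return (y_position, line) pairs from top to bottom.

    Page breaks are not handled; a real converter would start a new page
    once y drops below the bottom margin.
    """
    leading = style.font_size * style.line_spacing
    placed = []
    y = top_y
    for paragraph in text.splitlines():
        # Keep blank lines so paragraph gaps survive the reflow.
        wrapped = textwrap.wrap(paragraph, width=style.max_chars_per_line) or [""]
        for line in wrapped:
            placed.append((y, line))
            y -= leading
    return placed
```

A PDF library would then paint the background colour and draw each line at its computed position using the chosen font and word spacing.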
Asynchronous & Real-Time Capabilities
- Django Channels: Enables WebSocket support and asynchronous views.
- Daphne: ASGI server for handling async requests.
- Twisted + Autobahn: Support for real-time networking and WebSocket communication.
Development Process:
We brainstormed ideas together on day 1 to settle on the concrete features that we wanted to include in our solution. Afterwards, we split the backend development amongst our team members. Once that was finished, we regrouped to integrate the features into one cohesive front-end dashboard.
Challenges & Learning Points:
We are quite new to app development, so this hackathon was a challenge: we had to figure out how to put our existing coding knowledge to practical use. Because we built the features separately before merging them into one integrated dashboard, resolving merge conflicts took up a lot of time. A key takeaway is to agree on languages and frameworks up front, as we ran into issues integrating the separately developed back-end features into a single front-end dashboard.
Future Improvements:
In the future, we plan on expanding the accessibility functions to cater towards a wider range of disabilities. Some features include keyboard navigation and eye tracking support for those with motor impairments, and
To make the experience of using the app more seamless, we also plan to integrate it into online learning platforms as an extension that automatically converts all content on the platform into accessible versions. This streamlines the learning experience, as students no longer need to manually upload each material to the converter.
Built With
- attrs
- autobahn
- automat
- bootstrap
- cffi
- constantly
- cryptography
- ctranslate2
- daphne
- django
- docx
- faster-whisper
- flatbuffers
- gtts
- imageio
- imageio-ffmpeg
- joblib
- moviepy
- nltk
- numpy
- packaging
- pdfplumber
- protobuf
- pyasn1
- pyasn1-modules
- pygame
- pyopenssl
- pypdf2
- pypdfium2
- python-docx
- python-dotenv
- regex
- requests
- setuptools
- soundfile
- twisted
- txaio
- webspeechapi