Words to Waves
Words to Waves is a simple tool that converts text-based documents (like PDFs or images) into audiobooks. Whether you're visually impaired or just prefer listening over reading, this app makes it easy to turn any document into audio.
Inspiration
We wanted to create an accessible way for people to enjoy books, articles, and other text-based content by transforming them into audio files. The idea came from the growing need for tools that cater to different learning styles and accessibility needs.
What it does
You upload a document (PDF, image, etc.), and the app extracts the text and converts it into an audio file. You can then play the audio directly in your browser.
How we built it
Frontend: We used HTML and TailwindCSS for a clean, simple UI. It allows users to upload their files, click a button to start the conversion, and listen to the output.
Backend: The backend is built with Flask. It handles file uploads, renders PDF pages to images with pdf2image, runs OCR on those pages and on uploaded images with PyTesseract to extract the text, and converts the text to speech using gTTS (Google Text-to-Speech).
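The backend flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`detect_kind`, `extract_text`, `text_to_audio`) and the list of image extensions are assumptions. The third-party imports are done inside the functions so the module still loads on a machine without the OCR stack installed.

```python
from pathlib import Path

PDF_EXTENSIONS = {".pdf"}
# Assumed set of image formats; the project does not list the ones it supports.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".bmp"}


def detect_kind(filename: str) -> str:
    """Classify an upload as 'pdf' or 'image' based on its extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in PDF_EXTENSIONS:
        return "pdf"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    raise ValueError(f"Unsupported file type: {suffix or filename}")


def extract_text(path: str) -> str:
    """Run OCR over a PDF (page by page) or a single image."""
    import pytesseract
    from PIL import Image

    if detect_kind(path) == "pdf":
        from pdf2image import convert_from_path

        # pdf2image renders each PDF page to a PIL image.
        pages = convert_from_path(path)
    else:
        pages = [Image.open(path)]
    return "\n".join(pytesseract.image_to_string(page) for page in pages)


def text_to_audio(text: str, out_path: str) -> str:
    """Convert text to an MP3 file with gTTS and return the output path."""
    from gtts import gTTS

    gTTS(text=text, lang="en").save(out_path)
    return out_path
```

With the dependencies installed (plus the Tesseract and Poppler binaries that PyTesseract and pdf2image rely on), a call such as `text_to_audio(extract_text("book.pdf"), "book.mp3")` would run the whole conversion.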
Challenges we ran into
Handling Different File Types: Processing various document formats (like PDFs and images) was tricky, especially with OCR (Optical Character Recognition) for images.
Accuracy of Text Extraction: Getting the text right from images was challenging, especially if the quality of the image wasn’t great.
Ensuring Smooth User Experience: We wanted to make sure the app remained fast and responsive, even with large files.
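One possible way to keep large documents responsive is to split the extracted text into sentence-bounded chunks and convert them to audio one at a time, so the first chunk can be played while the rest is still processing. The project does not say how it handles large files, so the sketch below is only an assumption; `chunk_text` and the 1000-character default are hypothetical.

```python
import re


def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than max_chars is hard-split by length.
        while len(sentence) > max_chars:
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to the text-to-speech step separately, at the cost of having to join or sequence the resulting audio files.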
Accomplishments that we're proud of
Fully Functional Conversion: Users can upload any supported file, get it converted to audio, and play it immediately.
Great UI: The app is simple, modern, and works well on mobile and desktop thanks to TailwindCSS.
What we learned
Text-to-Speech and OCR: We gained valuable experience working with gTTS and PyTesseract for text extraction and conversion to speech.
File Handling in Flask: We learned how to efficiently manage file uploads and processing in Flask.
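As an illustration of the kind of upload handling this refers to, here is a minimal Flask sketch. It is not taken from the project: the `/upload` route, the `file` form field, the allowed extensions, and the 16 MB limit are all assumptions. Flask is imported inside the factory so the helper can be used on its own.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {"pdf", "png", "jpg", "jpeg"}  # assumed list
MAX_UPLOAD_BYTES = 16 * 1024 * 1024  # assumed 16 MB cap


def allowed_file(filename: str) -> bool:
    """Return True if the filename has an extension we accept."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


def create_app(upload_dir: str = "uploads"):
    """Build a Flask app with a single upload endpoint."""
    from flask import Flask, jsonify, request
    from werkzeug.utils import secure_filename

    app = Flask(__name__)
    # Flask rejects request bodies larger than this with a 413 response.
    app.config["MAX_CONTENT_LENGTH"] = MAX_UPLOAD_BYTES
    Path(upload_dir).mkdir(parents=True, exist_ok=True)

    @app.post("/upload")
    def upload():
        file = request.files.get("file")
        if file is None or file.filename == "":
            return jsonify(error="No file provided"), 400
        if not allowed_file(file.filename):
            return jsonify(error="Unsupported file type"), 400
        # secure_filename strips path separators and other unsafe characters.
        target = Path(upload_dir) / secure_filename(file.filename)
        file.save(target)
        return jsonify(saved=target.name), 201

    return app
```

Validating the extension and sanitizing the filename before saving keeps unexpected files and path tricks out of the upload directory.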
What's next for Words to Waves
We plan to improve text extraction accuracy, add support for more file types, and refine the user experience further.
Built With
- flask
- gtts
- html5
- javascript
- ocr
- pdf2image
- pytesseract
- python
- tailwindcss
- text-to-speech