Welcome to Voice control based PDFViewer
Voice control based PDFViewer is a speech-controlled GUI written in Python 3. It uses several modules but relies mainly on tkinter, which lets you view PDF and image files, and on the SpeechRecognition library for all speech-based commands.
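To illustrate how recognized speech can drive a viewer, here is a minimal sketch rather than the project's actual code. It assumes the spoken phrase has already been turned into text (in a real application this could come from the SpeechRecognition library, for example `Recognizer.listen` followed by `Recognizer.recognize_google`). The phrases, the action names, and the `COMMANDS`, `parse_command` and `ViewerState` names are all hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from spoken phrases to viewer actions.
# None of these phrases or action names come from the project itself.
COMMANDS = {
    "next page": "next_page",
    "previous page": "previous_page",
    "zoom in": "zoom_in",
    "zoom out": "zoom_out",
}


def parse_command(transcript):
    """Return the action name for a recognized phrase, or None if nothing matches."""
    text = transcript.lower().strip()
    for phrase, action in COMMANDS.items():
        if phrase in text:
            return action
    return None


class ViewerState:
    """Tracks the current page and zoom level; a GUI would redraw from this state."""

    def __init__(self, page_count):
        self.page_count = page_count
        self.page = 0      # zero-based index of the page being shown
        self.zoom = 1.0    # 1.0 means the page is shown at its original size

    def apply(self, action):
        """Update the state for one action; unknown actions are ignored."""
        if action == "next_page":
            self.page = min(self.page + 1, self.page_count - 1)
        elif action == "previous_page":
            self.page = max(self.page - 1, 0)
        elif action == "zoom_in":
            self.zoom *= 1.25
        elif action == "zoom_out":
            self.zoom /= 1.25
```

Keeping the phrase matching and the page state separate from the GUI means the command logic can be tried out without a microphone or a window. A tkinter front end would then only need to call `apply` and redraw the current page.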
Inspiration to build this project
The main motivation for this project was to provide a good reading experience for people with amputations, who can control the application with just their voice. This is the targeted use case for our project. The software is also aimed at the general tech-savvy demographic, or simply at people too lazy to use the mouse to control a PDF viewer :D
What we learnt
This was a wonderful experience for all of us. We learnt a ton of new things, not only technical ones but also the social aspects of software development, including efficient team co-ordination, communication and, most importantly, how to overcome hurdles. On the technical side, we learnt software development principles by actually applying software engineering practices such as pair programming and an agile workflow. We also learnt about speech recognition and about different modules in Python. Overall, this was a challenging, fun and nurturing phase.
Roles and Responsibilities
Throughout this project, each member played a key role, from ideation to design to development. We used pair programming, where each pair worked on the part they had undertaken. Various aspects of the project were carried out by all members, working as one under the guidance of the team leader. The roles played by each member are listed as follows:
- Yashdeep - Team lead, motivator, main developer, testing
- Shantanu - Co-developer, debugger, testing
- Sreehari - UI design, code cleaning, testing
- Vedant - Design, testing, documentation
Challenges faced
From ideation through development and deployment, we faced a lot of problems. They can be categorized as:
Technical: In the development and testing phase, we encountered lots of bugs before the software was polished and deployable. The biggest problem came up while integrating the speech recognition module with the UI. Another technical challenge was getting the speech recognition itself to work reliably. Many other small bugs came up during the UI design, but they were quick to fix.
Abstract: The biggest abstract challenge was selecting an idea. From a swarm of ideas, we had to choose one that was unique, complex and useful for the betterment of society. The next abstract challenge was the design of the project, as the UI had to be aesthetically pleasing and yet simple.
These were some of the major challenges faced by our team.
Built With
- matplotlib
- pdfplumber
- pillow
- pypdf2
- python
- pyttsx3
- speechrecognition
- tkinter
