Picture of Specto glasses

Specto

About

Technology today supports everyone: it runs our traffic signals, connects us to the internet, and even powers the toaster at home. That supporting role in society was our inspiration. With this project, we plan to extend that support to a new group of people.

The project has two protocols. The blind protocol helps a visually impaired user understand the world around them through their other senses. At any time, the user can ask Specto to take a picture. Specto then tells the user which objects are in the area and, if there is text in front of the user, such as a book, reads it aloud. Specto can also alert the user when a wall or object is in front of them, helping guide them through the world. The deaf protocol is a web app that picks up audio from the real world and turns it into text for the user to read. The user can type a reply, which is turned into audio for the other person to hear. Together, the two protocols support both blind and deaf people in their everyday lives.

This project has multiple parts, which we split among the members of our team. We built the blind protocol using infrared sensors that tell the user when there is an object in front of them. The computing for the blind protocol happens on the Raspberry Pi. When the user asks Specto to take a picture, the Pi Camera captures it. The image-to-text algorithm and the multiple-object detection program then run to tell the user what is in front of them. For multiple-object detection we used the Google Vision API, and for image-to-text we used a Python library called pytesseract. To take a picture, the user says "Specto take a picture", which is detected using the speech recognition library.

For the deaf protocol, we used the speech recognition library again to help deaf people communicate with someone else. The speech recognition picks up the other person's voice and turns it into text for the deaf user to read. The deaf user can reply by typing in the web app, and we use gTTS to turn that text into speech.
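The blind protocol flow described above (hear the command, take a picture, describe it) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `is_capture_command`, `describe_scene`, and `handle_transcript` are hypothetical, and the camera, object detection, and image-to-text steps are passed in as plain callables so the sketch runs without a Raspberry Pi. In a real build, those callables would presumably wrap the Pi Camera, the Google Vision API, and pytesseract.

```python
import re
from typing import Any, Callable, Iterable, Optional

# Wake phrase quoted in the write-up, stored lowercase for matching.
WAKE_PHRASE = "specto take a picture"


def is_capture_command(transcript: str) -> bool:
    """Return True if the recognized speech contains the wake phrase."""
    # Lowercase and drop punctuation so "Specto, take a picture!" still matches.
    normalized = re.sub(r"[^a-z ]+", " ", transcript.lower())
    normalized = " ".join(normalized.split())
    return WAKE_PHRASE in normalized


def describe_scene(objects: Iterable[str], text: str) -> str:
    """Build the sentence that would be handed to text-to-speech."""
    names = [name for name in objects if name]
    parts = []
    if names:
        parts.append("I can see: " + ", ".join(names) + ".")
    # Collapse the line breaks and extra spaces that OCR output often contains.
    cleaned = " ".join(text.split())
    if cleaned:
        parts.append("The text reads: " + cleaned)
    return " ".join(parts) if parts else "I could not recognize anything."


def handle_transcript(
    transcript: str,
    capture: Callable[[], Any],
    detect_objects: Callable[[Any], Iterable[str]],
    read_text: Callable[[Any], str],
) -> Optional[str]:
    """Run the capture-and-describe flow only when the wake phrase is heard."""
    if not is_capture_command(transcript):
        return None
    image = capture()
    return describe_scene(detect_objects(image), read_text(image))
```

Keeping the hardware and cloud calls behind callables means the command matching and sentence building can be tested on a laptop with stand-in functions before anything is deployed to the Pi.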

Challenges

One challenge was getting the service key JSON file for the Google Vision API to work with our Python code. After that, object detection only worked on the demo file, but a few minor changes made it work with any image file. For image-to-speech in the blind protocol, we had trouble playing an mp3 file with playsound because the Pi was not able to find the mp3 files.
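One possible cause of a "file not found" problem like the playsound one (an assumption, not something confirmed for this project) is a relative path being resolved against the directory the script was launched from rather than the directory the audio file lives in. The sketch below shows one defensive approach: build an absolute path from a known base directory and fail with a clear message before handing it to the audio player. The helper name `resolve_audio_path` is hypothetical.

```python
from pathlib import Path


def resolve_audio_path(filename: str, base_dir: Path) -> Path:
    """Return an absolute path to an audio file, or raise if it is missing."""
    # Anchor the filename to an explicit directory instead of relying on
    # whatever the current working directory happens to be.
    path = (base_dir / filename).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")
    return path
```

In a script, `base_dir` would typically be `Path(__file__).resolve().parent`, and the returned path would be converted with `str(...)` before being passed to the player.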

What we learned

We learned how to organize a project with two protocols that work together to solve a single problem.

What's Next?

We are in the process of adding a feature for users with hearing disabilities who want to use our platform. The ability to type a reply and have it played back as audio is coming soon.

Because of social distancing, we had to work on this problem separately, so the finished project is still in separate pieces. After the pandemic, we plan to meet and connect the parts to make the project more unified.

Additionally, we would like to improve our use of the Vision API and make the voice-to-text more reliable.

Installation and other info

Additional information can be found here: GitHub ReadMe

Built With

css
google-cloud
google-vision
gtts
html
infared-sensors
javascript
pi-camera
pygame-mixer
pytesseract
raspberry-pi
rpi.gpio
speech-recognition
tensorflow

Submitted to

LA Hacks 2020
- Winner LA Hacks Honorable Mention

Created by

I researched the data and statistics needed for the project. I also contributed to the website and made many edits to the write-up.

Shanay Champaneri
I worked on building the Specto glasses, which are part of the blind protocol. I worked on both the hardware and the software, such as the image-to-speech algorithm, and created an alert that triggers when you come close to a wall, using infrared sensors...

Sahil Tallam
I worked on the implementation of the G-Cloud Vision API for the blind protocol component of this project. Utilizing the power of AI and text to speech, I was able to turn a picture into audio.

Samarth Shah
I did the research for this project to identify the problem and how we could make things better. I also created the website where the project is displayed.

Agrim Dhingra
Finance @ UT Austin | Passion for Startups
I worked on building the product that deaf people can utilize to interact with others in society. It is amazing how accurate voice recognition has become!

Ishaan Bansal

Updates

Ishaan Bansal posted an update — Apr 05, 2020 09:57 PM EDT

An updated version of our website has been made. Additionally, new explanation videos have been shot to reflect the entire project. All the updated information and videos can be found here: https://specto-lahacks.netlify.com/


Sahil Tallam started this project — Mar 28, 2020 09:59 PM EDT

Leave feedback in the comments!
