People with special needs often struggle to pronounce words properly and to string words together into sentences. We were inspired by the experience of one of our team members, who watched her younger brother, who has autism, learn how to read. Her mother would guide him through the words in a sentence, making sure he said each one properly. Often, he would read the first word in the sentence correctly, but when he tried to read the second word, he would immediately forget how to read the first, which made reading entire sentences extremely difficult for him.

What it does

The program guides special-needs individuals through reading a simple sentence. It spells out each word in the sentence, has the user repeat the word, and lets them move on to the next word only once they have said the current one correctly. The program also combines two or three words into a short phrase and has the user work through it, so that they learn to put words together instead of merely parroting back individual words based on how they’re spelled. Finally, the program prompts the user to say the entire sentence, and concludes when the user says it correctly.
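The progression described above (single words, then short phrases, then the full sentence) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `build_steps`, the `phrase_len` parameter, and the choice of non-overlapping phrases are hypothetical and not taken from the project's code.

```python
def build_steps(sentence, phrase_len=2):
    """Return the prompts for one sentence, in practice order.

    Assumed order: every word on its own, then non-overlapping
    phrases of `phrase_len` words, then the whole sentence.
    """
    words = sentence.split()
    steps = list(words)  # step 1: each word individually

    # Step 2: short phrases, only when they differ from the full sentence.
    if len(words) > phrase_len:
        for start in range(0, len(words), phrase_len):
            chunk = words[start:start + phrase_len]
            if len(chunk) > 1:  # a leftover single word was already practiced
                steps.append(" ".join(chunk))

    # Step 3: the entire sentence (skipped for one-word input).
    if len(words) > 1:
        steps.append(" ".join(words))
    return steps
```

Each returned prompt would be spoken to the user and repeated until it is said correctly before the next one is attempted.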

How we built it

Our program is built in Python. We use the Google Cloud Text-to-Speech API to spell out the words and phrases, and the Google Cloud Speech-to-Text API to transcribe what the user said and determine whether they said the right word or phrase. We also use PyAudio to record the user’s responses and Pygame to play the computer’s responses while the program prints the text in the terminal.
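Deciding whether the user said the right word or phrase comes down to comparing the expected text with the transcript. The sketch below assumes the transcript has already been returned by the Speech-to-Text API as a plain string, and it assumes a forgiving comparison that ignores case, punctuation, and extra spaces. The helper names `normalize` and `said_correctly` are hypothetical, and the comparison the project actually uses may differ.

```python
import string


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())


def said_correctly(expected, transcript):
    """Return True when the transcript matches the expected word or phrase."""
    return normalize(expected) == normalize(transcript)
```

For example, `said_correctly("The cat.", "the cat")` is true, while `said_correctly("the cat", "the hat")` is false.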

Challenges we ran into

The first challenge we ran into was trying to use a speech recognition library called PocketSphinx to decode the phonetic sounds in the words. We were planning to use it to correct the user’s pronunciation, but PocketSphinx was difficult to use and to integrate with the rest of our code, so we switched to the Google Cloud APIs and worked around the phonetic sounds problem. It was also difficult to integrate the Google Speech-to-Text and Text-to-Speech APIs into one program, because we had problems setting up credentials and figuring out how to make the calls to both APIs. Finally, we had to figure out how to give positive reinforcement and encouragement even when the user said the word or phrase incorrectly, and how to keep them from getting frustrated if they couldn’t say the word.
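One way to approach the encouragement problem is a retry loop that never reports failure harshly. The sketch below is an assumption-heavy illustration, not the project's code: `practice`, `ENCOURAGEMENTS`, and `max_attempts` are hypothetical names, the attempt limit is an added idea for limiting frustration, and `listen` and `speak` are stand-in callables for the recording, transcription, and playback steps.

```python
import itertools

# Hypothetical messages; the wording used in the project is not documented.
ENCOURAGEMENTS = [
    "Nice try! Let's listen one more time.",
    "You're getting closer! Let's try again.",
]


def practice(target, listen, speak, max_attempts=3):
    """Prompt for `target` until it is said correctly or attempts run out.

    `listen()` returns the transcript of what the user said (a string);
    `speak(text)` plays text to the user. Both are injected so the loop
    can run without a microphone or network access.
    """
    messages = itertools.cycle(ENCOURAGEMENTS)
    for _ in range(max_attempts):
        speak(target)  # model the word or phrase
        heard = listen()
        if heard.strip().lower() == target.lower():
            speak("Great job!")
            return True
        speak(next(messages))  # encourage instead of saying "wrong"
    speak("Good effort! Let's come back to this one later.")
    return False
```

Passing simple functions for `listen` and `speak` (for example, `input` and `print`) is enough to try the loop in a terminal.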

Accomplishments that we're proud of

We’re proud that we integrated APIs we had never worked with before into our program. We’re also proud that we learned how to use PyAudio and Pygame and that we implemented the features we had originally planned, despite the challenges of choosing and using APIs. Finally, we’re proud that we found an API that served our purposes after our first choice failed.

What we learned

We learned how to use Google’s Text-to-Speech and Speech-to-Text APIs, how to use PyAudio to record the user’s input, and how to use Pygame to play the computer’s audio. We also learned how to combine multiple APIs in a single program.

What's next for BitRead

Next, we’d like to create a graphical display that highlights the word(s) the user is supposed to say, so that the user can see the whole sentence and track their place in it. This would also include music and other encouraging visual and auditory aids. We’d also like to add a game portion to the app so that the user is encouraged to keep playing and practicing. In addition, we’d like to implement a pronunciation-correcting system in which the user starts by learning the syllables in each word, then the word, then a couple of words, then the whole sentence, allowing them to develop a connection between different combinations of letters and their respective sounds. Furthermore, we would like to include a space for users’ teachers and family members to input phrases they think would be most useful for the user to learn to read. Eventually, we would like to deploy this program as a web application.

Built With

  • google-cloud-speech
  • google-cloud-texttospeech
  • python