One of our teammates, Sunjana, has a brother with autism. As she watched him attend speech therapy classes, she noticed that he and kids like him had trouble holding simple conversations, mainly because they didn't know how to respond to seemingly simple questions like "What's your name?" and "What did you do last weekend?". As a result, they lost valuable opportunities to make friends and develop relationships with others. Sunjana's brother was fortunate enough to attend classes that helped him overcome these struggles. However, not all kids with autism can afford to do so, which limits their chances of developing adequate social skills. We believe everyone deserves to be able to connect with others, even if they need to be taught how.
What it does
Our project, ConvoCoach, is targeted at users in the middle- to lower-functioning range of the autism spectrum who don't know how to respond to basic casual questions. It helps them by guiding the user through a simple conversation. There are two "characters", "girl" and "coach", both voiced by the computer (using female and male voices, respectively). The girl asks a question like "What's your name?" or "What did you do last weekend?", and the user answers aloud. If the answer is inappropriate (for example, the girl asks "What did you do last weekend?" and the user responds "pasta"), the coach interjects, explains what went wrong, and asks the question again. If the answer is appropriate but only one word, the coach prompts the user to give a more complete response, like "I went swimming" instead of just "swimming". When the user gives a complete, appropriate response, the coach praises them, and the girl either repeats the question (to ensure the user actually knows how to answer it and wasn't just parroting the coach) or moves along with the conversation.
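The coaching loop above can be sketched as a small decision function. This is a minimal illustration, not the project's actual code: the `is_on_topic` callback is a hypothetical stand-in for whatever topic check the Natural Language API would back.

```python
def evaluate_response(question_topic, response, is_on_topic):
    """Classify a user's answer into one of three coach actions.

    question_topic -- a label for what the girl asked about (e.g. "weekend")
    response       -- the user's transcribed answer
    is_on_topic    -- hypothetical callback: does the answer fit the question?
    """
    words = response.strip().split()
    if not words or not is_on_topic(question_topic, response):
        return "retry"   # coach explains the mistake and re-asks the question
    if len(words) == 1:
        return "expand"  # on topic, but coach prompts for a full sentence
    return "praise"      # coach praises; girl repeats or continues
```

For example, with a topic check that rejects "pasta" as an answer about last weekend, `evaluate_response("weekend", "pasta", check)` would return `"retry"`, while `"I went swimming"` would earn `"praise"`.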
How we built it
We used Google Cloud's Text-to-Speech, Speech-to-Text, and Natural Language APIs to build our product. We also used PyAudio and pygame to capture and play computer audio.
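A sketch of how the two character voices could be wired to Google Cloud Text-to-Speech and played back with pygame. This assumes configured Google Cloud credentials, and the specific voice names are illustrative, not the ones the project actually used.

```python
# Illustrative voice assignments; the real project's choices may differ.
CHARACTER_VOICES = {
    "girl": "en-US-Standard-F",   # female voice for the conversation partner
    "coach": "en-US-Standard-D",  # male voice for the coach
}

def voice_for(character):
    """Return the synthetic voice name assigned to a character."""
    return CHARACTER_VOICES[character]

def speak(character, text, out_path="line.mp3"):
    """Synthesize `text` in the character's voice and play it with pygame."""
    # Imported lazily: these need Google Cloud credentials / an audio device.
    from google.cloud import texttospeech
    import pygame

    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(
            language_code="en-US", name=voice_for(character)
        ),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )
    with open(out_path, "wb") as f:
        f.write(response.audio_content)

    pygame.mixer.init()
    pygame.mixer.music.load(out_path)
    pygame.mixer.music.play()
    while pygame.mixer.music.get_busy():
        pygame.time.wait(100)
```

Mapping each character to a fixed voice keeps the two speakers consistently distinguishable, which matters when the coach's interjections need to be clearly separate from the girl's questions.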
Challenges we ran into
Our unfamiliarity with APIs was the main hurdle, and we ran into a lot of technical issues over the course of the hackathon. Our project depended on several APIs from different sources, and while our modules worked independently, combining them produced unexpected conflicts and fatal errors. As a result, we weren't able to include every feature we had worked on in the final project, though we were successful in integrating most of them.
Accomplishments that we're proud of
We're proud that we were able to integrate Google Cloud Text-to-Speech, Speech-to-Text, and Natural Language into our project, especially since our computers were having issues with all three of those APIs. We're also proud that our project addresses a real issue in the autism community, specifically among those in the middle-to-lower-functioning range of the autism spectrum, that technology hasn't really addressed before.
What we learned
We learned a lot about working with APIs such as Azure and Google Cloud, particularly how to implement them and integrate them into a single program, since most of our project's features depended on understanding them.
What's next for ConvoCoach
We plan to integrate a visual element into ConvoCoach, where images of what the girl and the user are talking about appear as they speak. For example, if the girl is talking about how she went swimming, and the user's response needs to be action-oriented, an image of the action word in the girl's speech would appear (in this case, an image of swimming). No matter how the user responds, an image of their response would appear, so that they can visually see what they're saying.
Scotty Lab's Scott Prize