Children who know English tend to do better in school as they grow up, but in many households English is not spoken at all. We built an application that engages children in English while helping them learn about their surroundings and practice responding to questions in conversation.
What it does
Parentese is an app that uses image recognition to identify an object in a child's surroundings and tell the child what it is. It then asks the child a question about that object, waits for the child's answer, and responds appropriately to what the child said.
How we built it
We used the Google Cloud Vision API to analyze an image captured from a continuous video stream and return the most prominent object in the scene. We then used IBM Watson Text-to-Speech to tell the child what the object is and to ask an associated question about it. The application listens for the child's response and uses IBM Watson Speech-to-Text to transcribe what the child said. Finally, a trained IBM Watson Conversation service takes the transcribed speech and produces an appropriate reply to what the child said about the object.
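The four stages described above can be sketched as a single conversational turn. This is only an illustrative outline, not our actual code: the `Pipeline` class, its field names, and the wording of the question are all hypothetical, and each callable stands in for one of the cloud services (Vision, Text-to-Speech, Speech-to-Text, Conversation) without calling any real API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Turn:
    speaker: str  # "app" or "child"
    text: str


@dataclass
class Pipeline:
    # Each field is a hypothetical stand-in for one service call.
    detect_object: Callable[[bytes], Optional[str]]  # image recognition
    speak: Callable[[str], None]                     # text-to-speech
    listen: Callable[[], str]                        # microphone + speech-to-text
    reply_to: Callable[[str, str], str]              # conversation service

    def run_once(self, frame: bytes) -> List[Turn]:
        """Run one name-ask-listen-respond exchange for a single video frame."""
        label = self.detect_object(frame)
        if label is None:
            return []  # nothing recognizable in this frame, so stay silent

        transcript: List[Turn] = []

        # Name the object and ask a question (the question text is made up).
        prompt = f"This is a {label}. What color is the {label}?"
        self.speak(prompt)
        transcript.append(Turn("app", prompt))

        # Capture and transcribe the child's spoken answer.
        answer = self.listen()
        transcript.append(Turn("child", answer))

        # Ask the conversation service for a reply and speak it aloud.
        response = self.reply_to(label, answer)
        self.speak(response)
        transcript.append(Turn("app", response))
        return transcript
```

Passing the services in as plain callables keeps the turn logic testable with simple stubs, and would let any one service be swapped out without touching the rest of the flow.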
What's next for Parentese
We hope to expand Parentese to engage the child in deeper conversation, likely through further training of the IBM Watson Conversation service. Parentese could also become a mobile app or be embedded in a toy that the child carries around and talks with every day, making learning English more convenient and engaging.