Tell Me A Story

Inspiration

While listening to the keynotes, we were struck by the idea of using technology to bring back voices that aren’t being heard. We all had fond memories of bedtime stories, but we realized that some parents and relatives must be separated from their little ones for periods of time because of work travel, incarceration, or other situations. How could we let these kids hear familiar voices tell them a story again, with the same or enhanced experience of flipping through a book together? And thus, Tell Me A Story (punny title: Once Upon a Twilio) was born.

What it does

Tell Me A Story has two modes of operation: freeform and karaoke. In freeform, the storyteller calls the application’s phone number, is prompted to invent a title, and begins weaving the tale. They say “The End” to finish the story. The child, meanwhile, opens the webpage and follows along as the spoken words are written into an animated book (the next version will add pictures autogenerated from the content). This keeps kids engaged and helps them learn to read.
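As a rough illustration of the “The End” check (not the project’s actual code), the sketch below looks for the closing phrase at the end of a recognized transcript and strips it from the text shown in the book. The names `END_PHRASE` and `parseTranscript` are hypothetical, and the sketch assumes the speech recognizer returns plain text.

```javascript
// Hypothetical helper, assuming the recognizer hands back a plain string.
// Matches "the end" only at the very end of the transcript, ignoring case
// and trailing punctuation, so "the end of the road" does not stop the story.
const END_PHRASE = /\bthe end\b[\s.!?]*$/i;

function parseTranscript(transcript) {
  const trimmed = transcript.trim();
  const finished = END_PHRASE.test(trimmed);
  // Drop the closing phrase so it is not written into the book.
  const text = finished ? trimmed.replace(END_PHRASE, "").trim() : trimmed;
  return { text, finished };
}
```

Matching only at the end of the transcript is a design choice for this sketch; a storyteller who keeps talking after saying the phrase would need a looser rule.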

In karaoke mode, the pair selects a well-known story to read through, and the book highlights each phrase as the storyteller says it. If they have saved stories from freeform mode, they can choose one of those, too. We didn’t have enough time to build karaoke, but we are confident we understand what code needs to be added.
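Since karaoke mode was not built, the following is only one way the highlighting might work, under the assumption that recognized speech arrives as short text fragments. It walks a cursor through the known story and advances it whenever a spoken word matches the next expected word. The names `normalize` and `advanceHighlight` are hypothetical.

```javascript
// Hypothetical sketch of karaoke highlighting; not part of the project.
// Lowercase a word and strip punctuation so "Time," matches "time".
function normalize(word) {
  return word.toLowerCase().replace(/[^a-z0-9']/g, "");
}

// storyWords: the known story split into words.
// position: index of the next word waiting to be highlighted.
// spokenText: the latest fragment returned by speech recognition.
// Returns the new position; words before it can be highlighted.
function advanceHighlight(storyWords, position, spokenText) {
  let pos = position;
  const spokenWords = spokenText.split(/\s+/).map(normalize).filter(Boolean);
  for (const spoken of spokenWords) {
    // Unmatched words (fillers, misrecognitions) are simply skipped.
    if (pos < storyWords.length && normalize(storyWords[pos]) === spoken) {
      pos += 1;
    }
  }
  return pos;
}
```

This simple matcher stalls if the recognizer mishears the expected word, so a fuller version would likely need some tolerance for skipped or misrecognized words.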

Our intention is that the child also hears the storyteller through the phone at the same time they watch the story being written. This would require combining a Twilio conference call with the “gather” verb to record and analyze speech. We have gotten both features working separately; however, Twilio’s API does not yet support combining the two.

How we built it

Tell Me A Story has a backend server built with Node.js and Express, which routes requests to and from Twilio and provides the next bit of text to the book frontend. In Twilio’s API we use the gather verb extensively to 1) determine whether the caller is the storyteller (“speaker”) or listener and 2) record, process, and run speech recognition on the storyteller’s voice. Once the API supports gathering from conference call participants, we will also use the dial and conference actions.
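To make the gather step concrete, here is a minimal sketch of the kind of TwiML a server could return to ask Twilio to listen for speech. It is not the project’s code: the helper names `escapeXml` and `gatherTwiml` and the `/story` URL in the usage note are assumptions, and the XML is built by hand rather than with Twilio’s helper library so the example has no dependencies.

```javascript
// Escape characters that are not allowed raw inside XML text or attributes.
function escapeXml(value) {
  const entities = { "<": "&lt;", ">": "&gt;", "&": "&amp;", "'": "&apos;", '"': "&quot;" };
  return String(value).replace(/[<>&'"]/g, (ch) => entities[ch]);
}

// Build a TwiML document that speaks a prompt and gathers speech.
// Twilio sends the recognized text to actionUrl when the caller stops talking.
function gatherTwiml(prompt, actionUrl) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Gather input="speech" action="${escapeXml(actionUrl)}" method="POST">`,
    `    <Say>${escapeXml(prompt)}</Say>`,
    "  </Gather>",
    "</Response>",
  ].join("\n");
}
```

In an Express route this string could be returned with `res.type("text/xml").send(gatherTwiml("What is the title of your story?", "/story"))`, and the handler for the hypothetical `/story` route would then read the transcript from the `SpeechResult` field of Twilio’s request.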

The front end has a template written in Mustache.js which uses jQuery to send GET requests to our backend server for the next piece of text.
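One way the “next piece of text” exchange could work is sketched below, assuming the server keeps recognized speech in memory and the page remembers how much it has already shown. The `StoryBuffer` class and the `/next` route mentioned afterwards are hypothetical and not taken from the project.

```javascript
// Hypothetical in-memory store for the pieces of a story as they are recognized.
class StoryBuffer {
  constructor() {
    this.chunks = [];
  }

  // Called whenever speech recognition produces another piece of text.
  append(text) {
    if (text) this.chunks.push(text);
  }

  // Return every chunk the client has not seen yet, plus the cursor
  // the client should send with its next request.
  next(cursor = 0) {
    const requested = Number(cursor) || 0;
    const start = Math.max(0, Math.min(requested, this.chunks.length));
    return { chunks: this.chunks.slice(start), cursor: this.chunks.length };
  }
}
```

The page could then poll with something like `$.get("/next", { cursor: cursor }, callback)` on a timer, appending the returned chunks to the book and storing the new cursor. Tracking a cursor means a slow or missed poll never loses text; the next request simply returns a longer list.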

Challenges we ran into

Our main challenges were understanding what the Twilio VoiceRequest object was, figuring out our architecture and data pipeline, not being able to use Gather and Conference together, Twilio’s voice recognition having trouble with certain team members’ voices, and every change taking several steps to test.