We thought it would be a great skill for Alexa to read lines or speeches to someone preparing for a play or memorizing a speech. We quickly realized that the skill would be even more useful if Alexa could read any form of text.
What it does
Story Time lets users select photos on their iPhone, send them through Microsoft's Project Oxford for text extraction, and have Alexa read these "books" back to them. Alexa can now read bedtime stories to children, novel excerpts to adults, and any picture of text to any user. Or the children can listen to the novels and the adults to the bedtime stories. Whatever floats their boats.
How we built it
It starts with an iOS app extension that enables users to select any photos on their iPhone from the contextual photo menu. These photos are sent to Microsoft's Project Oxford API, which performs optical character recognition on the photos and returns the result. The photos themselves never even have to be hosted anywhere. The OCR results are parsed into sentences and pages of a "book" created from the selected photos. These pages are then uploaded to a Firebase blob for later retrieval by Alexa. When prompted, Alexa can read from any of the books the user has uploaded to their personal library.
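As a rough illustration of the parsing step, here is a minimal Python sketch (not our actual Swift code). It assumes the OCR result is JSON shaped as nested `regions`, `lines`, and `words`, each word carrying a `text` field, and it assumes one page per photo. The function names `ocr_to_text`, `split_sentences`, and `build_book` are made up for this example, and the sentence splitter is deliberately naive.

```python
import re


def ocr_to_text(ocr_result):
    """Flatten one OCR result into a single string, in reading order.

    Assumes the shape {"regions": [{"lines": [{"words": [{"text": ...}]}]}]}.
    """
    lines = []
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
    return " ".join(lines)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when whitespace follows."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def build_book(ocr_results, title):
    """Build a book with one page per photo (an assumption of this sketch).

    Each page is a list of sentences.
    """
    pages = [split_sentences(ocr_to_text(result)) for result in ocr_results]
    return {"title": title, "pages": pages}
```

A dictionary like the one returned by `build_book` is the kind of structure that could then be uploaded for Alexa to retrieve. A real splitter would also need to handle abbreviations and words hyphenated across line breaks.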
Challenges we ran into
Learning how to work within Alexa's framework was challenging because neither of us had worked with the Alexa Skills Kit (ASK) before or really knew the ins and outs of Alexa's capabilities.
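For anyone else new to ASK, much of the work comes down to returning a JSON response that tells Alexa what to say. The sketch below is illustrative Python rather than our skill's code. It assumes a book stored as a title plus a list of pages, each page being a list of sentences. `build_read_response` is a hypothetical helper, and keeping the page number in `sessionAttributes` is just one possible way to track the reader's place.

```python
def build_read_response(book, page_index):
    """Build an Alexa-style JSON response that reads one page aloud.

    `book` is assumed to look like {"title": str, "pages": [[sentence, ...], ...]}.
    """
    pages = book["pages"]
    if 0 <= page_index < len(pages):
        speech = " ".join(pages[page_index])
        end_session = False
    else:
        # Past the last page: say so and close the session.
        speech = "That is the end of " + book["title"] + "."
        end_session = True
    return {
        "version": "1.0",
        # Remember where we are so a follow-up request can continue.
        "sessionAttributes": {"title": book["title"], "page": page_index},
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": end_session,
        },
    }
```

Leaving `shouldEndSession` false while pages remain keeps the session open, so the user can ask for the next page without re-invoking the skill.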
Creating the iOS extension was the most challenging part by far, though, because neither of us had used Swift before. Making non-trivial networking requests to APIs was hard while we were still figuring out which thread the requests needed to run on and how to parse the results. We're not sure we really like Xcode.
Accomplishments that we're proud of
We overcame all of those challenges (mostly by repeatedly banging our heads on the desk) and made a usable end-to-end product! It has OCR! It has voice interaction! It can take pictures of several pages in a book and read them to you! This Alexa skill is seriously useful to me and to anyone else who has wanted to keep reading a book while in the shower. Or a hot tub. Or while eating buffalo wings.
What we learned
We learned how to write iOS extensions, program in Swift, use Project Oxford, and make a useful Alexa skill. Neither of us had done any of that before. I had learned how to use Firebase just last week, but besides that, the tools were new to us.
What's next for Story Time
It would be natural to extend Story Time by providing more ways to upload books to a library. PDF support would be a great addition. Alexa already supports reading from Kindle books and audiobooks. Story Time's extension of that to any text in any picture really rounds out her skills and utility in that department.