As a kid, I loved sharing bedtime stories with my family, getting creative and making up the wackiest tales we could. I wanted to take that fond memory to the next level by making these stories appear before your eyes.
What it does
StoryLook listens as the user tells a story and shows visuals for the "entities" it detects in the story, all in real time! We believe that StoryLook will help children improve their speaking and comprehension while also fostering their creativity!
How I built it
We relied heavily on Google Cloud APIs to power our application. We used Speech-to-Text to convert story audio into searchable text, Natural Language to detect and label the entities mentioned in the story, and even Translation to support multiple languages!
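To make the flow concrete, here is a minimal sketch of that pipeline. It is not our actual code: `story_visuals`, `Visual`, and the three callables (`transcribe`, `extract_entities`, `find_image`) are hypothetical names used for illustration, and each callable stands in for a wrapper around the corresponding cloud service. The translation step is left out to keep it short.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Visual:
    entity: str      # entity name detected in the story text
    image_url: str   # image chosen to illustrate that entity


def story_visuals(
    audio_chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],            # hypothetical speech-to-text wrapper
    extract_entities: Callable[[str], List[str]],  # hypothetical entity-detection wrapper
    find_image: Callable[[str], str],              # hypothetical image-search wrapper
) -> Iterator[Visual]:
    """Yield one Visual per newly mentioned entity, in the order spoken."""
    seen = set()
    for chunk in audio_chunks:
        text = transcribe(chunk)
        if not text:
            continue  # nothing recognized in this chunk
        for entity in extract_entities(text):
            key = entity.lower()
            if key in seen:
                continue  # this entity is already on screen
            seen.add(key)
            yield Visual(entity, find_image(entity))
```

Because the function is a generator, each visual is available as soon as its audio chunk has been processed, which is what lets the pictures keep up with the storyteller.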
Challenges I ran into
We had to make major tradeoffs between accuracy and response time. Converting speech to text more accurately took more time, but when we tried to respond as fast as possible, too much noise was picked up. We went through several iterations before finding a sweet spot between the two.
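One way to picture this tradeoff is a filter over streaming transcripts, sketched below. This is an illustration under assumptions, not a record of what we shipped: the `Transcript` class and `confident_transcripts` function are hypothetical, loosely modeled on streaming recognizers that mark each result as interim or final and attach a stability score to interim text. The `min_stability` threshold is the knob: lower values respond faster but let more noise through, higher values wait for steadier text.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Transcript:
    text: str
    is_final: bool    # the recognizer will not revise this text further
    stability: float  # 0.0 to 1.0 estimate that interim text will not change


def confident_transcripts(
    results: Iterable[Transcript],
    min_stability: float = 0.8,  # assumed default; tune for speed vs. accuracy
) -> Iterator[str]:
    """Pass through final results and sufficiently stable interim ones."""
    last = None
    for result in results:
        if not (result.is_final or result.stability >= min_stability):
            continue  # too likely to change; wait for a better guess
        if result.text != last:
            last = result.text  # skip repeats of text already emitted
            yield result.text
```

With a high threshold, a shaky early guess such as "the dra" is dropped and only "the dragon" reaches the entity detector, at the cost of a slightly later picture.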
Accomplishments that I'm proud of
We're proud of our support for multiple languages! We also think our final product's UI looks pretty neat.
What I learned
What's next for StoryLook
Beyond just tweaking the accuracy of image searches, we want to generate shareable content so that stories can be saved and shared to be cherished forever!