Inspiration

We wanted to create an app that is both fun and educational (mainly for young learners and non-native speakers) and that can be used in daily life, while applying skills we recently learned in classes and research.

What it does

After the user takes several photos, the photos are uploaded to our Google Cloud server, where a deep learning model automatically generates a story based on the contents of the pictures. Keywords are also extracted from the story, and their definitions are provided by the Oxford Dictionary API.
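
The write-up does not say how keywords are chosen, so the following is only a minimal sketch of one possible approach: pick the most frequent non-stop-words and pair each with a definition. The names `extract_keywords`, `build_vocabulary`, `STOP_WORDS`, and `SAMPLE_DEFINITIONS` are hypothetical, and the local lookup table is a stand-in for a real call to the Oxford Dictionary API.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real app would likely use a fuller one.
STOP_WORDS = {"a", "an", "the", "and", "is", "in", "on", "of", "to", "with", "at", "it"}

# Stand-in for the Oxford Dictionary API (assumption): a small local table.
SAMPLE_DEFINITIONS = {
    "dog": "a domesticated animal often kept as a pet",
    "ball": "a round object used in games",
}


def extract_keywords(story, max_keywords=5):
    """Return the most frequent non-stop-words in the story, lowercased."""
    words = re.findall(r"[a-z']+", story.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(max_keywords)]


def build_vocabulary(story, lookup=SAMPLE_DEFINITIONS.get):
    """Pair each keyword with a definition; skip words that have none."""
    vocab = {}
    for word in extract_keywords(story):
        definition = lookup(word)
        if definition is not None:
            vocab[word] = definition
    return vocab


if __name__ == "__main__":
    story = "A dog is playing with a ball. The dog runs on the grass."
    print(build_vocabulary(story))
```

Passing the lookup in as a parameter keeps the keyword logic testable without network access; in the app, that parameter would wrap the dictionary API request instead.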

How we built it

We used React Native for the front end and Python for the back end: Flask for the server on Google Cloud Platform (GCP), PyTorch for the model, and the Oxford Dictionary API for the vocabulary list. We also used the NVIDIA CUDA package for GPU-accelerated computing on GCP.
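
As a rough illustration of the server-side flow, the sketch below shows the kind of logic a Flask route could delegate to after receiving the uploaded photos. It is not the project's actual code: Flask and PyTorch are deliberately left out so the example runs on its own, `caption_model` is a hypothetical callable standing in for the PyTorch model, and the sketch assumes the model captions each photo independently.

```python
import json


def generate_story(photos, caption_model):
    """Caption each photo in order and join the captions into one story."""
    sentences = []
    for photo in photos:
        caption = caption_model(photo).strip()
        if not caption:
            continue  # skip photos the model could not describe
        # Normalize each caption into a sentence.
        caption = caption[0].upper() + caption[1:]
        if not caption.endswith((".", "!", "?")):
            caption += "."
        sentences.append(caption)
    return " ".join(sentences)


def handle_upload(photos, caption_model):
    """Return a (JSON body, HTTP status) pair for one upload request."""
    if not photos:
        return json.dumps({"error": "no photos uploaded"}), 400
    story = generate_story(photos, caption_model)
    return json.dumps({"story": story, "photo_count": len(photos)}), 200


if __name__ == "__main__":
    # Fake model for demonstration: maps raw bytes to a canned caption.
    fake_captions = {b"img1": "a dog runs on the grass", b"img2": "a ball rolls away."}
    body, status = handle_upload([b"img1", b"img2"], fake_captions.get)
    print(status, body)
```

Keeping the model behind a plain callable means the route logic can be exercised with a fake model, which is useful when the real model only runs on a GPU machine.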

Challenges we ran into

We ran into many issues getting the deep learning model to work in the GCP environment due to package incompatibilities. We also dealt with the model producing extremely inaccurate stories, partly because of the small data set and the short time available for training.

Accomplishments that we're proud of

We succeeded in building a functioning app, with the features we planned on the first night, within the span of the hackathon. We’re excited to have a prototype of a product that we can actually demo end to end.

What we learned

We learned how to brainstorm and implement features, while efficiently splitting work among teammates based on their skills and specialities to make a complete product in a very short period of time.

What's next for StoryTime

The pictures in a sequence will actually be connected to one another to form a complete story. (The training data set consisted of separate images, each with its own caption.) On this note, we are also interested in incorporating the Watson Visual Recognition service for greater accuracy in the content of the stories.

We will add audio that reads the words aloud and shows how they are pronounced, which is also helpful for users with some disabilities.

We will also add the ability to highlight objects in the photo when words are selected from the vocabulary list, to clarify what the objects are.
