For as long as humans have existed, new ideas and creativity have been driving forces like no others. In today's society, there are so many people and so many great ideas that it can be difficult to push your thinking outside of the norm and actually think of something original. Whether it's practical, disruptive or incredibly weird, a new idea can be incredibly useful, rewarding and, most of all, fun. PictoStory generates idea seeds from a photo. These small snippets act as a unique source of inspiration. PictoStory is good at thinking outside the box, because it isn't bounded by very much!

What it does

PictoStory lets the user upload any photo to the site for analysis. Using Microsoft's Computer Vision API and Emotion API, PictoStory molds otherwise random phrases so that they relate to attributes of the photo. In its default mode, PictoStory randomly generates the structure of a sentence using a basic context-free grammar (CFG) for the English language, then fills the open slots with nouns, verbs, adjectives, etc. chosen for their relation to the attributes of the photo. For example, based on the emotion readings of an image, more positive or more negative adjectives can be selected to complete a sentence.

PictoStory also features two custom options. First, there is the "gangster" option (click the square CD button in the top right to toggle it on and off). This option is essentially the default mode with an opening greeting and a closing line in less formal English.

Second, there is the "template" option (click the rail track button in the top right to toggle it on and off). This option doesn't generate sentences randomly by expanding the branches of a CFG, but instead models them on a group of stored sentence formats. This mode often yields more dramatic and sensible statements, so be sure to check it out. Some of the template phrases are original, while others are based on famous lines from Shakespeare's plays.
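To make the default mode more concrete, here is a minimal sketch of the idea in JavaScript: expand a tiny context-free grammar at random, then fill the slots with words that depend on the photo. It is not PictoStory's actual code. The grammar, the word lists, and the simplified `photo` object (a list of `tags` plus a single `happiness` score between 0 and 1) are all assumptions made for illustration, and they do not mirror the real response shapes of the Computer Vision or Emotion APIs.

```javascript
// Hypothetical mini-grammar: each nonterminal maps to its possible expansions.
const GRAMMAR = {
  S: [["NP", "VP"]],
  NP: [["Det", "N"], ["Det", "Adj", "N"]],
  VP: [["V", "NP"], ["V"]],
};

// Hypothetical fallback word lists for the terminal slots.
const WORDS = {
  Det: ["the", "one"],
  V: ["chases", "paints", "remembers"],
  N: ["robot", "garden"],
};

// Hypothetical adjective pools, keyed by the mood of the photo.
const ADJECTIVES = {
  positive: ["joyful", "radiant"],
  negative: ["gloomy", "bitter"],
};

// Pick one item using rng, a function returning a number in [0, 1).
function pick(items, rng) {
  return items[Math.floor(rng() * items.length)];
}

// Fill a terminal slot, preferring words related to the photo.
function chooseWord(symbol, photo, rng) {
  if (symbol === "N" && photo.tags.length > 0) {
    return pick(photo.tags, rng); // nouns come from the photo's tags
  }
  if (symbol === "Adj") {
    // Assumed threshold: a happiness of 0.5 or more counts as a positive photo.
    const mood = photo.happiness >= 0.5 ? "positive" : "negative";
    return pick(ADJECTIVES[mood], rng);
  }
  return pick(WORDS[symbol], rng);
}

// Recursively expand a symbol into a flat list of words.
function expand(symbol, photo, rng) {
  const rules = GRAMMAR[symbol];
  if (!rules) {
    return [chooseWord(symbol, photo, rng)]; // terminal slot
  }
  const rule = pick(rules, rng); // choose one branch of the grammar
  return rule.flatMap((child) => expand(child, photo, rng));
}

// Build one sentence; rng is injectable so results can be reproduced.
function generateSentence(photo, rng = Math.random) {
  const text = expand("S", photo, rng).join(" ");
  return text.charAt(0).toUpperCase() + text.slice(1) + ".";
}

// Example usage with a made-up analysis result.
console.log(generateSentence({ tags: ["dog", "beach"], happiness: 0.9 }));
```

Passing the random source in as a parameter keeps the sketch easy to test: with a fixed `rng`, the same photo always yields the same sentence, while the default `Math.random` gives a different idea seed on each call.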

How I built it

I used ReactJS to build the front end of PictoStory. Client-side JavaScript provides all of PictoStory's functionality, including the interactions with Microsoft's Computer Vision and Emotion APIs. I used my GitHub account to host the site.

Challenges I ran into

I ran into some pretty annoying bugs while making PictoStory, especially when modifying and storing images. Some of them seemed impossible at first, but with enough digging I luckily found solutions to every major problem I encountered, and I know I'm better off because of it! Overcoming these bugs was very rewarding.

Accomplishments that I'm proud of

I'm very proud of building an initial working program, and then iterating over and over again while tweaking existing code and implementing new functionality. At the beginning I didn't really have a solid idea of what I was going to create. I just worked through one idea at a time. There was never a shortage of ideas or improvements, only a shortage of time. This is the first time I've ever stayed up all night without sleeping at all!

What I learned

I learned that programming for long periods of time can be incredibly rewarding if you are passionate about what you're doing. I look forward to more experiences like this and hope to find a career one day where I feel the same way.

What's next for PictoStory

Next, I think it would be awesome to look into algorithms and techniques for generating better AI speech. This could make PictoStory's output even more readable, although its quirkiness is very enjoyable at times. Lastly, I'd like to try it out for 5 minutes, and record a good idea that it gives me. Having tested this out many times, I know PictoStory can be incredibly wacky (especially in standard mode) and will never fail to surprise.

Built With

  • microsoft-computer-vision-api
  • microsoft-emotion-api
  • react