Inspiration

The saying goes, “A picture is worth a thousand words.” AI can generate text, and it can also generate images. So how about AI generating visual stories?

What it does

It takes a short piece of text as input and generates an image as output.

How we built it

We used 3 models:

  1. GPT-2: To generate a story from short phrases.
  2. BERT summarization model: To summarize the generated paragraph and extract its keywords.
  3. BigGANxCLIP model: To generate an image from the keywords.
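The three stages above can be sketched as a simple chain. This is a minimal sketch, not our actual notebook code: `run_pipeline`, `gpt2_story_generator`, and the comma-joined keyword prompt are hypothetical names and choices made for illustration. Only the GPT-2 stage is shown against a real API (the Hugging Face `transformers` text-generation pipeline); the summarization and image stages are passed in as plain callables because their exact setup is not described here.

```python
from typing import Callable, List


def run_pipeline(
    phrase: str,
    generate_story: Callable[[str], str],
    extract_keywords: Callable[[str], List[str]],
    render_image: Callable[[str], object],
) -> object:
    """Chain the three stages: phrase -> story -> keywords -> image."""
    story = generate_story(phrase)
    keywords = extract_keywords(story)
    # Assumption: the keywords are joined into one prompt for the image model.
    prompt = ", ".join(keywords)
    return render_image(prompt)


def gpt2_story_generator(max_new_tokens: int = 60) -> Callable[[str], str]:
    """Build stage 1 on the Hugging Face text-generation pipeline with GPT-2."""
    # Imported lazily so the rest of the sketch works without transformers;
    # the model weights are downloaded on first use.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    def generate(phrase: str) -> str:
        outputs = generator(
            phrase, max_new_tokens=max_new_tokens, num_return_sequences=1
        )
        return outputs[0]["generated_text"]

    return generate
```

Keeping each stage as a callable makes it easy to swap in a different summarizer or image generator without touching the rest of the chain.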

Challenges we ran into

  1. Selecting an appropriate model for generating images across various genres.
  2. The image quality was not as good as we expected.
  3. We needed a GPU with CUDA, which made the project difficult to run on a local machine, so we used Google Colab.
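The GPU requirement can be checked up front. The following is a minimal sketch, assuming PyTorch is the framework in use (it is listed under Built With); `pick_device` is a hypothetical helper name, and falling back to the CPU is an illustrative choice that would make image generation much slower.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch installed we simply report the CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

On Google Colab, this reports `cuda` only after a GPU runtime has been selected in the notebook settings.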

Accomplishments that we're proud of

To the best of our knowledge, this is the first attempt at building a visual AI story generator.

What we learned

We learned about GANs, neural networks, and the BERT NLP model.

What's next for AI stories generator

  1. To decrease the number of iterations needed to generate an image.
  2. To try other text-to-image generators, such as DALL-E.

Built With

  • bert
  • python
  • spacy
  • torch
  • torchvision
  • transformers