Title: Fantasy Visualizer

Who: Yuanbo Li (yli581), Minghao Liang (mliang21), Zhen Ren (Zren18)

Introduction: We are working on an image generation task, which is new to us. The inspiration comes from reading novels: as we read, we try to visualize the scenes described in the work. What does the Middle-earth described by J.R.R. Tolkien look like? How would the White Witch from The Chronicles of Narnia cast her magic? We want the computer to generate these images for us.

Related Work: 1) Seq2Seq architecture for text summarization 2) Diffusion model for visualization 3) Style transfer (just two convolutional models...)

Data: The biggest challenge we faced was finding a dataset to train our text summarization model. We wanted a dataset of paragraphs paired with their summaries. Yet after many hours of searching, we could not find a suitable dataset for fantasy novels.

So we tried two strategies: 1) We manually prepared a small dataset by reading and summarizing text ourselves. 2) We use a dataset of CNN news stories.

Methodology: Text summarization (Seq2Seq) + image visualization (OpenAI API)
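The two stages can be chained as in the minimal sketch below. It only illustrates the data flow (passage, then summary, then image). The names visualize_passage, first_sentence and fake_image are hypothetical stand-ins, not our trained Seq2Seq model or the actual image API call.

```python
from typing import Callable


def visualize_passage(
    passage: str,
    summarize: Callable[[str], str],
    generate_image: Callable[[str], bytes],
) -> bytes:
    """Summarize a passage, then use the summary as the image prompt."""
    prompt = summarize(passage)
    return generate_image(prompt)


# Hypothetical stand-ins so the sketch runs without a model or network access.
def first_sentence(text: str) -> str:
    # Placeholder for the summarization model: keep only the first sentence.
    return text.split(".")[0].strip() + "."


def fake_image(prompt: str) -> bytes:
    # Placeholder for the image generation service: echo the prompt as bytes.
    return prompt.encode("utf-8")


result = visualize_passage(
    "The witch raised her wand. Snow fell over the forest.",
    first_sentence,
    fake_image,
)
```

Passing the two stages in as functions keeps the pipeline independent of which summarizer or image service is plugged in.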

Metrics: 1) Text Summarization: We use ROUGE-N to measure the overlap of n-grams (word sequences of length n) between the model-generated text and the reference summary.
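As a rough illustration of how ROUGE-N counts matches, here is a minimal sketch. It assumes simple lowercase whitespace tokenization, and the function names ngrams and rouge_n are our own for this example. Published ROUGE implementations may tokenize and stem differently, so scores need not agree exactly.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall and F1 for two strings."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_n("the witch casts a spell", "the white witch casts a spell", n=2)
```

In this example the generated text shares 3 of the reference's 5 bigrams, so recall is 0.6. Since the generated text has 4 bigrams, precision is 0.75.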

2) Image Generation: Since we treat the diffusion model as a black box, we do not measure a loss for image generation.

3) Style Transfer: We use two convolutional models to calculate the loss of the generated work w.r.t. both the input image and the style image, then add the two losses together.
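The summed loss can be sketched as follows. This is a sketch under assumptions, not our exact implementation: it assumes feature maps of shape (channels, height, width) have already been extracted by a convolutional network, and it uses the common Gram-matrix formulation for the style term. The weights content_weight and style_weight are hypothetical parameters.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of a (channels, height, width) feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)


def content_loss(generated, content):
    """Mean squared error between generated and input-image features."""
    return float(np.mean((generated - content) ** 2))


def style_loss(generated, style):
    """Mean squared error between the Gram matrices of generated and style features."""
    return float(np.mean((gram_matrix(generated) - gram_matrix(style)) ** 2))


def total_loss(generated, content, style, content_weight=1.0, style_weight=1.0):
    """Weighted sum of the content and style terms."""
    return (content_weight * content_loss(generated, content)
            + style_weight * style_loss(generated, style))


# Random arrays stand in for real convolutional feature maps.
rng = np.random.default_rng(0)
content_feats = rng.normal(size=(4, 8, 8))
style_feats = rng.normal(size=(4, 8, 8))
loss = total_loss(content_feats.copy(), content_feats, style_feats)
```

Here the generated features start as a copy of the input features, so the content term is zero and the total equals the style term alone.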

Ethics: Why deep learning is good: Deep learning is a great way to boost our imagination. It can visualize images that humans cannot.

Who are the major stakeholders: If our model works well, it can help future filmmakers by giving them inspiration. It would not be a big problem even if the output is not very accurate, because filmmakers will have read the novels closely themselves and can judge whether our app is giving them a good idea or not.

Built With

  • deep