StoryScape

StoryScape is a GenAI-powered interactive storytelling platform that converts boring textual content or taboo topics into visually appealing comics/manga.

Demonstration of the Project

  • Click the image below to play the video

IMAGE_ALT

Problem Statement

  • Students today struggle with short attention spans, which are reportedly shorter than a goldfish's.

    • Goldfish attention span: 9 sec
    • Human attention span: 8 sec
  • It is also very hard to spread awareness about topics that are considered “taboo” in our society, such as periods, superstitions, and sex education.

  • According to studies conducted in the US by NCBI, approximately 65% of the population are visual learners, so learning from textual content leads to:

    • Difficulty in Conceptualization
    • Reduced Retention
    • Limited Engagement
    • Difficulty in Problem-Solving
    • Limited Creativity and Expression
    • Increased Cognitive Load

Our Solution

  • Our idea is to build a GenAI-powered platform that converts boring textual content or taboo topics into visually appealing comics/manga.

  • Users can specify the plot and characters of the storyline, or just enter a topic, and the platform generates a comic book in their chosen comic style, e.g. Marvel, DC, Disney Princess, or Anime.

  • Our platform will utilize text-to-image transformations using Stable Diffusion.

  • We will optimize the pipeline to generate a comic within 30-50 seconds using multithreading and a caching database (Redis).

Features Offered

  • [X] Generate in your favourite comic style, e.g. Marvel, DC, etc.
  • [X] Ability to set custom characters and story plot
  • [X] Generate a shareable comic link or share the comic as a PDF
  • [X] Realistic page-turning and book opening/closing animations that create an immersive digital reading environment
  • [X] Ability to create vernacular (Hindi/English/Tamil/etc.) comics

Sample Comics Generated By Our Platform

  1. Dragon Tale
  2. Vampire Story
  3. Naruto preparing for an exam

StoryScape: Two Models

  1. Comics-Dialogue-Generator
  2. Comics-Scenes-Generator

Comics-Dialogue-Generator

  • This code snippet demonstrates the use of Intel's Neural-Chat text generation model, loaded as a pretrained model from Hugging Face, to generate comic dialogues from textual prompts.
  • To create high-quality comic scene images, we generate dynamic image prompts that specify fine details of each comic scene. These prompts are created with NeuralChat by supplying the comic scene dialogue.
  • The script loads the model onto the available device and, together with our custom post-processing code, processes the input prompt and produces comic dialogues in JSON format.
  • Running this code in Google Colab takes a long time, but using Intel's CPU or XPU reduces the generation time to a few seconds.
  • We used NeuralChat (an Intel-optimised Mistral 7B model) for its fast inference and high accuracy.
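The post-processing step that turns raw model text into JSON could look roughly like the sketch below. The input format (one `Character: dialogue` pair per line) and the output keys `characters` and `scenes` are assumptions made for illustration; the project's actual model output and JSON schema may differ.

```python
import json
import re

# Assumed line format: "<Character>: <dialogue>"; real model output may differ.
LINE_PATTERN = re.compile(r"^\s*([^:\n]{1,40}?)\s*:\s*(.+?)\s*$")


def parse_dialogues(raw_text: str) -> str:
    """Convert raw dialogue text into a JSON string of characters and scenes."""
    scenes = []
    for line in raw_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip blank lines and narration without a speaker
        character, dialogue = match.groups()
        scenes.append({"character": character, "dialogue": dialogue.strip('"')})
    # Collect the distinct speakers so they can be reused in image prompts.
    characters = sorted({scene["character"] for scene in scenes})
    return json.dumps({"characters": characters, "scenes": scenes}, indent=2)
```

Lines without a speaker, such as blank lines or a closing "The End", are skipped rather than raising an error, since generated text is rarely perfectly regular.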

User input example | Dialogue and character extraction for the comic

Prompt: Funny Cinderella story in Disney Princess style

Notebook Link : Click Here

Comics-Scenes-Generator

  • This code implements an image generation model using Stable Diffusion, optimised with IPEX and Intel oneAPI and run on Intel Developer Cloud (IDC).
  • The model is designed to generate visually appealing comic scenes.
  • The Intel Developer Cloud XPUs reduced inference time, and PyTorch optimised for Intel hardware reduced the overall time for comic scene generation.

IPEX Optimised Stable Diffusion | Normal Stable Diffusion

Usage of oneAPI and Intel Developer Cloud

Utilizing the resources provided by Intel Developer Cloud significantly expedited our AI model development and deployment processes. Specifically, we harnessed the power of Intel's CPU and XPU to accelerate two critical components of our project: Comics Dialogues Generation and Comic Scenes Generation.

  1. Neural Chat: a 7B-parameter LLM fine-tuned by Intel on the Intel Gaudi 2 processor from mistralai/Mistral-7B-v0.1. Run with intel_extension_for_transformers, it performed exceptionally well compared with the other models we tested from the same Mistral 7B family.

Comparison Graph

Intel Optimised Neural Chat vs Normal Mistral Comparison

  2. Text-to-Image Generation: Stable Diffusion using IPEX on Intel Developer Cloud compared with normal Stable Diffusion run on Kaggle

Comparison Graph

Comparison of the time taken on Intel Developer Cloud using IPEX and on Kaggle

In summary, Intel Developer Cloud's CPU and XPU hardware, together with Intel Extension for PyTorch (IPEX) and Intel Extension for Transformers, significantly reduced our model inference time and sped up the project's development.

Flow Diagram

  1. The user logs in with Google auth and is redirected to the main dashboard.
  2. The user enters a Topic (required), Comic style (optional), Story plot (optional), and Characters (optional).
  3. After the user hits Enter, the web application starts a Celery worker task to generate the comic.
  4. The user is redirected to a waiting page that shows the generation progress.
  5. Once the comic is generated, the user is redirected to the comic viewer.
  6. The comic viewer offers options to download the comic in PDF format or share the web comic viewer link.

Architecture Diagram

Architecture Diagram

Technologies Used

  1. Backend - Flask: Our application's backend was constructed using Flask, a versatile Python web framework. Flask facilitated the development of RESTful APIs, user authentication, data processing, and integration with machine learning models efficiently and swiftly.

  2. Machine Learning Models: Our app utilizes advanced machine learning models developed with TensorFlow, PyTorch, and Hugging Face Transformers for intelligent features like comics dialogue and scene generation with custom characters.

  3. Other Technologies: In addition to React, Flask, and machine learning models, our application utilizes a range of other technologies to enhance performance, security, and user experience. These include:

-   **Celery:** Comic generation usually takes more than 30 seconds, which can lead to a 502 Bad Gateway error, so we run the comic generation pipeline in a Celery worker on the server instead of inside the web request.
-   **Redis:** Used as the Celery broker and caching database to boost performance, and also used by the Flask API that shows comic progress on the frontend (loading page).
-   **Intel Developer Cloud:** Leveraging Intel's high-performance CPU and XPU capabilities, we accelerated model training and inference processes, reducing processing time and improving overall performance.

How We Built It

  • The user inputs a topic with a story plot
  • The input text is passed to Neural Chat (Intel's optimised Mistral model)
  • The raw comic dialogue text is parsed by post-processing functions, which return the result in JSON format
  • To generate the comic poster, Neural Chat produces a dynamic image generation prompt from the comic topic
  • The dynamic image prompt is passed to the image generation model (Intel-optimised Stable Diffusion)
  • Similarly, to generate comic scenes, Neural Chat produces a dynamic image prompt from each comic scene's dialogue
  • That dynamically generated prompt is then used to generate the comic scenes with Stable Diffusion
  • Multithreading is used to generate images and text in parallel
  • Once the comic images are generated, we write the dialogue text on top of each image in a comic font using OpenCV
  • Finally, we merge the images into a PDF using our custom image-list-to-PDF generator code
  • A Celery worker runs the above task and saves its progress in the Redis database
  • We created a Flask-RESTful API, connected to Redis, for fetching the progress
  • The loading page calls this API every 2 seconds and shows the progress on the page
  • Once the API reports a completed status, the page automatically loads the comic viewer
  • The comic viewer lets users read the comic on the website itself, with a realistic page-turn animation for an immersive reading experience
  • At the end of the comic, there is also an option to download it in PDF format
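The progress-reporting loop described above (a worker writes progress, a page polls until the status is completed) can be sketched with the standard library alone. This is only an illustration under stated assumptions: a lock-protected dict stands in for Redis, a plain thread stands in for the Celery worker, and the names `set_progress`, `get_progress`, `generate_comic`, and `wait_until_completed` are hypothetical, not taken from the project's code.

```python
import threading
import time

# Hypothetical stand-in for Redis: task_id -> progress record.
_progress: dict[str, dict] = {}
_lock = threading.Lock()


def set_progress(task_id: str, percent: int, status: str) -> None:
    """Store the latest progress for a task (the worker side)."""
    with _lock:
        _progress[task_id] = {"percent": percent, "status": status}


def get_progress(task_id: str) -> dict:
    """Return what a progress API would report for a task."""
    with _lock:
        return dict(_progress.get(task_id, {"percent": 0, "status": "pending"}))


def generate_comic(task_id: str, scene_count: int) -> None:
    """Stand-in for the worker task: reports progress after each scene."""
    for done in range(1, scene_count + 1):
        time.sleep(0.01)  # placeholder for dialogue and image generation
        set_progress(task_id, done * 100 // scene_count, "running")
    set_progress(task_id, 100, "completed")


def wait_until_completed(task_id: str, interval: float = 2.0,
                         timeout: float = 60.0) -> dict:
    """Poll like the loading page does, until completion or timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_progress(task_id)
        if state["status"] == "completed":
            return state
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not complete in {timeout} seconds")
```

Keeping the progress record outside the worker is what lets a separate web process answer polling requests while the long-running generation continues elsewhere.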

Use Case of Intel® Developer Cloud (IDC)

  • The platform utilizes Intel's Image Generation API hosted on IDC (Intel Optimised Stable Diffusion) to transform the textual content into visually captivating comic panels.
  • The web application triggers the Intel Text Generation API hosted on IDC (Intel's Neural-Chat) to generate a story script based on the user's inputs.

Hackathon PPT

Installation

# Install Redis, Nginx, and pip, then start Redis and enable it at boot
sudo apt install redis-server nginx python3-pip -y
sudo systemctl start redis-server
sudo systemctl enable redis-server
sudo service redis-server status

# Install VirtualEnv
pip3 install virtualenv

# Clone Project
git clone https://github.com/PushpenderIndia/StoryScape.git

# Navigate to folder
cd StoryScape

# Create Virtual Environment
virtualenv venv

# Activate Virtual Env.
source venv/bin/activate

# Install Requirements
pip install -r requirements.txt

Run in Terminal - 1

python app.py

Run in Terminal - 2

celery -A app.celery worker --loglevel=info
