What inspired us

Reading is a community issue directly linked to education and the equitable opportunities that come from being better informed. In today's media landscape, it can be hard for many people to pick up a book instead of a new show on Netflix. This is especially true for people with dyslexia, young children, and second-language learners, for whom reading carries pain points that make it difficult to concentrate and engage with the material.

What it does

Our project makes reading more enjoyable and engaging with customizable AI-generated illustrations. Users pick a book excerpt for our application to fetch and choose an illustration style. illustrAItor then creates captivating illustrations to complement the passage being read.

How we built it

For this project, we used Python with Tkinter for our GUI. We used Dream Studio's Stable Diffusion image-generation API to generate our illustrations. We also used Figma to mock up where we hope to take this project in the future.
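To make the flow concrete, here is a minimal sketch of how a Python app like ours could assemble a request to Dream Studio's (Stability AI's) text-to-image REST endpoint. The engine ID, image dimensions, and the style-suffix prompt convention are assumptions for illustration, not our exact values:

```python
API_HOST = "https://api.stability.ai"  # Dream Studio / Stability AI REST host
ENGINE_ID = "stable-diffusion-v1-6"    # engine name is an assumption; check your account

def build_generation_request(prompt: str, style: str, api_key: str) -> dict:
    """Assemble the URL, headers, and JSON payload for a text-to-image call.

    The returned dict could be fed to requests.post(r["url"],
    headers=r["headers"], json=r["payload"]) to get PNG bytes back.
    """
    # Fold the user's chosen illustration style into the prompt text
    styled_prompt = f"{prompt}, in the style of {style}"
    return {
        "url": f"{API_HOST}/v1/generation/{ENGINE_ID}/text-to-image",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "image/png",  # ask for raw image bytes in the response
        },
        "payload": {
            "text_prompts": [{"text": styled_prompt}],
            "cfg_scale": 7,   # how strictly generation follows the prompt
            "width": 512,
            "height": 512,
            "samples": 1,     # one illustration per passage
        },
    }

request = build_generation_request(
    "a lighthouse on a stormy sea", "watercolor", api_key="sk-...")
print(request["url"])
```

The returned PNG bytes can then be loaded into a Tkinter `PhotoImage` (via Pillow) and displayed alongside the text.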

Challenges we ran into

We started this project very ambitiously, first attempting to connect a React.js front end to a Python application through a backend framework (Flask), which none of us was knowledgeable about. This didn't work out, and we spent hours walking in circles.

There were many other features we had hoped to implement at the beginning of the day that were simply not realistic. A large part of our original plan was to use an API to fetch text and other meaningful data from books and classify that text. We also planned to use IBM Watson's natural language processing to "digest" large chunks of text into short summaries that were more usable by the image generator. Given the time constraints, we instead chose to hard-code the text classification so the application would receive reliable prompts.
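The hard-coded fallback can be as simple as a lookup table. The excerpt IDs and prompt strings below are hypothetical stand-ins for whatever curated prompts the app ships with:

```python
# Hand-written "classification": each bundled excerpt maps to a curated
# prompt we already know the image generator handles well.
EXCERPT_PROMPTS = {
    "moby_dick_ch1": "a lone sailor gazing at a grey whaling harbor, 19th century",
    "secret_garden_ch8": "a hidden walled garden bursting with spring roses",
    "treasure_island_ch13": "a tropical island shoreline with an anchored schooner",
}

def prompt_for(excerpt_id: str, style: str) -> str:
    """Look up the curated prompt and append the user's chosen style."""
    base = EXCERPT_PROMPTS.get(excerpt_id)
    if base is None:
        raise KeyError(f"no curated prompt for excerpt {excerpt_id!r}")
    return f"{base}, {style} illustration"

print(prompt_for("secret_garden_ch8", "watercolor"))
```

Trading flexibility for reliability this way guarantees every demo run produces a sensible illustration, which matters under hackathon time pressure.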

Accomplishments that we're happy about

We want everyone to feel the joy of getting lost in a good book. The application is certainly entertaining for us, and with more iteration, we could design something useful for the general public.

We are all first-year (first-quarter) transfer students at the University of Washington, and none of us had ever attended a hackathon. It has been insanely fun to work together, and we can't wait to see what projects we can create in the future. Despite our collective lack of experience, we challenged ourselves to complete this application and pushed each other until the end to reach our goal.

What we learned

Anything is possible with enough free energy drinks and a time crunch.

We also learned how to delegate tasks and the importance of planning what we want to build before building it. At the beginning of this hackathon, we were thinking too many steps ahead (like trying to deploy a GitHub Pages site in a language we ended up not even using) before considering the best way to move forward. Toward the end, we learned how important it is to ideate and make sure everyone is on the same page before pressing on with the project.

What's next for illustrAItor

The first thing we can do in the coming months is clean up our UI and make the project run more efficiently. Not all of us were familiar with Python from the start, and there was a learning curve to implementing the project this way. In the long term, this app might work best as a web app or a mobile app, and it might be better to port the project to a more front-end-friendly language entirely.

One feature we wanted to include was the ability to extract a large amount of text and summarize it into prompts. Those summaries could then be processed by IBM Watson's NLP text classification API to find short strings describing characters, landscapes, actions, and other visually interesting details that could help the image-generation API create more consistent, compelling illustrations. This would make it possible to autonomously create illustrations for any book or passage of text. We could pull books from a database or the Google Books API, or even let users input their own text and generate pictures from it, making the creative potential for users virtually limitless.

This project could also go beyond the scope of what we originally hoped to accomplish. With support for languages other than English, the application could offer additional accessibility to non-English speakers and provide a novel way for English speakers to study another language.
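The planned pipeline (split a passage, summarize each chunk, turn summaries into prompts) can be sketched in plain Python. The summarizer below is a crude first-sentence stub standing in for the Watson NLP call, and the chunk size is an arbitrary assumption:

```python
def chunk_text(text: str, max_words: int = 200) -> list[str]:
    """Split a long passage into word-bounded chunks for summarization."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def summarize(chunk: str) -> str:
    """Stub standing in for an IBM Watson NLP call; here we simply
    keep the chunk's first sentence as a placeholder 'summary'."""
    return chunk.split(".")[0].strip()

def passage_to_prompts(text: str, style: str) -> list[str]:
    """Turn a passage into one styled image prompt per chunk."""
    return [f"{summarize(c)}, {style} illustration"
            for c in chunk_text(text)]

print(passage_to_prompts(
    "The ship sailed at dawn. The crew was weary.", "ink"))
```

Swapping the stub for a real summarization/classification service would be the only change needed to move from hard-coded excerpts to arbitrary text.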

Built With

  • figma
  • python
  • stablediffusion
  • tkinter