We want to increase access and exposure to reading in developing countries.
Reading books and stories has a tremendous impact on cognitive and linguistic development, yet reading material can be difficult to access for the illiterate.
We addressed this by using a GPT language model to generate stories, delivered through a simple, no-frills web application.
What it does
TattleTale is a web application that aims to improve reading accessibility for the illiterate. It offers unlimited, fully accessible children's stories generated by a GPT language model. Proof of concept for story generation: https://ta-ttletale.herokuapp.com/
How we built it
The project can be divided into two parts: the web application and the GPT model (click the links to go to the GitHub repo).
Web application: A simple web application was designed and built using React. Cloud services (AWS in our proof of concept) handle the requests that carry input to and output from the GPT model.
GPT model training: To use the GPT-2 model, we first scraped training data (children's bedtime stories) with Beautiful Soup from https://www.studentuk.com/category/bedtime-stories/ and https://www.tonightsbedtimestory.com/stories/. We then fine-tuned the pre-trained model to obtain child-friendly stories.
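The scraping step could look roughly like the sketch below. It is not the project's actual code: the CSS selector `article p`, the helper names `extract_story` and `build_corpus`, and the sample page are all hypothetical, and each real site would need its own selector. The sketch parses an HTML string with Beautiful Soup, keeps only the story paragraphs, and joins stories with GPT-2's `<|endoftext|>` token so they can serve as one fine-tuning corpus.

```python
from bs4 import BeautifulSoup

# GPT-2's end-of-text token, used here to mark story boundaries in the corpus.
END_OF_TEXT = "<|endoftext|>"

# Hypothetical page standing in for a downloaded bedtime-story page.
SAMPLE_PAGE = """
<html><body>
  <nav><p>Home</p></nav>
  <article>
    <h1>The Sleepy Fox</h1>
    <p>Once upon a time, a   small fox could not sleep.</p>
    <p></p>
    <p>So the moon sang him a <em>lullaby</em>.</p>
  </article>
</body></html>
"""


def extract_story(html: str, selector: str = "article p") -> str:
    """Return the story paragraphs of one page as newline-separated text.

    `selector` is an assumed CSS selector, not taken from the real sites.
    """
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = []
    for tag in soup.select(selector):
        # Collapse runs of whitespace left over from the HTML layout.
        text = " ".join(tag.get_text().split())
        if text:  # skip empty paragraphs
            paragraphs.append(text)
    return "\n".join(paragraphs)


def build_corpus(stories: list[str]) -> str:
    """Concatenate non-empty stories, each followed by the end-of-text token."""
    return "".join(f"{story}\n{END_OF_TEXT}\n" for story in stories if story)


if __name__ == "__main__":
    print(build_corpus([extract_story(SAMPLE_PAGE)]))
```

In practice the HTML would come from an HTTP request to each story page, and the resulting corpus file would be passed to whichever fine-tuning script is used.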
Challenges we ran into
- Lack of relevant data for model fine-tuning.
- Relatively high model perplexity.
- Integrating AWS with the React web app and the GPT model.
Attempts were made to tackle these challenges, and they were successfully addressed to produce the proof of concept (POC). However, we believe we can improve further, especially in model fine-tuning and UI/UX.
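For reference, the perplexity mentioned in the challenges above is the exponential of the average negative log-likelihood the model assigns to each token, so lower is better. The sketch below is a generic illustration of that formula, not the project's evaluation code; the function name `perplexity` and the example values are assumptions.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities: exp(-mean(log p))."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that gives every token probability 1/4 is as uncertain as a
    # uniform choice among four options, so its perplexity is about 4.
    print(perplexity([math.log(0.25)] * 4))
```

A perplexity near 1 would mean the model predicts held-out story text almost perfectly, while a high value means the fine-tuned model still finds that text surprising.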
Accomplishments that we're proud of
- Successfully created a functional POC.
- Successfully generated children's stories using the GPT-2 model.
- Seamless collaboration within the group.
What we learned
- The GPT model, and transformer models in general, for Natural Language Processing (NLP) tasks.
- Basic web scraping.
What's next for TattleTale
While there are many new features TattleTale could explore, the highest priority is the key features that fulfill our goal: improving reading accessibility for the illiterate. These key features include:
- Text-to-audio function.
Other features we are looking to add to TattleTale include:
- GAN image generation to pair with the generated stories.
- A dictionary function for looking up the meaning and pronunciation of each word used.
- Support for non-English languages.