Coding Cupids - Love Letter Generator

Title

Who

Manuel Quezada (mquezad1), Samantha Gundotra (sgundotr), Jose Urruticoechea (jurrutic), Juan Pablo Ramos Barroso (jramosba)

Introduction

For our project, we want to build a love-letter generator to revive modern romance. Users will enter details about their recipient, and the model will generate a love letter (a passage of text) tailored to that person. Our project falls under Natural Language Processing (NLP), specifically text generation.

Related Work

How I Build an AI Poetry Generator

The author starts from OpenAI’s pretrained GPT-2 language model and fine-tunes it on a database of modern poems, which turns the general-purpose text predictor into a poet.

GPT-2 Simple

This GitHub repository is a simple Python package that wraps existing model fine-tuning and generation scripts for OpenAI's GPT-2 text generation model (specifically the "small" 124M and "medium" 355M parameter versions). The package also makes text generation easier: it can write generated samples to a file for easy curation, and it accepts a prefix that forces the text to start with a given phrase.
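A minimal sketch of how the package's fine-tuning and prefix features could be applied to our task is shown here. The helper `build_prefix`, the function `finetune_and_generate`, and the corpus file name in the usage note are hypothetical names introduced for illustration; the step count, sample length, and temperature are placeholder values, not tuned settings.

```python
def build_prefix(recipient: str) -> str:
    """Return the salutation used to seed generation (hypothetical helper)."""
    return f"Dear {recipient.strip()},\n"


def finetune_and_generate(corpus_path: str, recipient: str, steps: int = 1000) -> str:
    """Fine-tune the 124M GPT-2 model on a text file and sample one letter."""
    # Imported lazily so build_prefix can be used without TensorFlow installed.
    import gpt_2_simple as gpt2

    gpt2.download_gpt2(model_name="124M")  # fetch the pretrained weights
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, dataset=corpus_path, model_name="124M", steps=steps)

    # return_as_list=True returns the samples instead of printing them.
    samples = gpt2.generate(
        sess,
        prefix=build_prefix(recipient),  # forces the letter to open with the salutation
        length=200,
        temperature=0.7,
        nsamples=1,
        return_as_list=True,
    )
    return samples[0]
```

Calling `finetune_and_generate("love_letters.txt", "Ana")` would, under these assumptions, fine-tune on a plain-text corpus and return one sample that begins with "Dear Ana,".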

Data

We plan to combine a dataset of love letters with text from romantic books from different time periods, so that the model also picks up old-fashioned English.

Collection of Love Letters — Kaggle

TheRomantic.com Love Letters

Poems Dataset — Kaggle

Collection of Famous Love Letters — The Romantic

We may also use love letters from books or from the web.

(Resources will be updated in later check-ins.)

Methodology

Our model’s architecture will be similar to the one from assignment 4. It will use RNN variants, and possibly transformers, so that the model produces love-themed sentences that read as coherent English.
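Whichever architecture we settle on, a next-word language model is usually trained on pairs of input and label sequences, where the labels are the inputs shifted by one position. The sketch below shows one way to prepare such pairs. It assumes word-level tokenization on whitespace and non-overlapping windows; `build_vocab`, `make_windows`, and the sample sentence are illustrative names and data, not part of any assignment code.

```python
from typing import Dict, List, Tuple


def build_vocab(text: str) -> Dict[str, int]:
    """Map each distinct lowercase word to an id, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def make_windows(ids: List[int], window: int) -> List[Tuple[List[int], List[int]]]:
    """Split ids into non-overlapping (inputs, labels) pairs.

    Each labels list is the inputs list shifted one position to the right,
    so the model learns to predict the next word at every step.
    """
    pairs = []
    for start in range(0, len(ids) - window, window):
        inputs = ids[start:start + window]
        labels = ids[start + 1:start + window + 1]
        pairs.append((inputs, labels))
    return pairs


text = "my dearest love my heart is yours"  # toy corpus for illustration
vocab = build_vocab(text)
ids = [vocab[word] for word in text.lower().split()]
pairs = make_windows(ids, window=3)
```

Each pair can then be fed to an RNN or transformer, with the loss computed between the model's predictions and the labels.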

Metrics

What experiments do you plan to run?

We will recruit a test group of users, send each of them a generated letter, and ask them to fill out a Google Form about their impressions (survey questions to be written).

For most of our assignments, we have looked at the accuracy of the model. Does the notion of “accuracy” apply for your project, or is some other metric more appropriate?

We won’t be working with labeled data, so the notion of accuracy does not apply to our project. Instead, we can measure how well the model predicts real love-letter text with metrics such as perplexity (lower is better), which serves as a proxy for how realistic the generated letters are.

If you are doing something new, explain how you will assess your model’s performance.

Perplexity
Survey Ratings

What are your base, target, and stretch goals?

Base Goal: Generate a grammatically correct love letter.
Target Goal: Generate an emotionally charged, moving, and grammatically correct love letter.
Stretch Goal: Accept input from the user about their own personality and about the receiver of the letter, along with a sample of the user's own writing, and use it to inform the semantics and style of a grammatically correct, personalized love letter.

Potential metrics:

Perplexity
Cross entropy
Bits-per-character
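The three metrics above are closely related, and the sketch below shows how each could be computed from the probabilities a model assigns to the correct tokens of a held-out letter. It assumes natural-log cross entropy averaged per token; the function names and the example probabilities are illustrative.

```python
import math
from typing import List


def cross_entropy(token_probs: List[float]) -> float:
    """Average negative log-likelihood, in nats per token."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs: List[float]) -> float:
    """Exponential of the per-token cross entropy."""
    return math.exp(cross_entropy(token_probs))


def bits_per_character(token_probs: List[float], num_chars: int) -> float:
    """Total negative log-likelihood in bits, divided by the text length in characters."""
    total_nats = -sum(math.log(p) for p in token_probs)
    return total_nats / (math.log(2) * num_chars)


# A model that gives probability 0.25 to each of four correct tokens.
probs = [0.25, 0.25, 0.25, 0.25]
ce = cross_entropy(probs)
ppl = perplexity(probs)
bpc = bits_per_character(probs, num_chars=16)
```

In this toy case the perplexity is 4, meaning the model is as uncertain as if it were choosing uniformly among four options at each step. Bits-per-character normalizes by characters rather than tokens, which makes models with different tokenizations easier to compare.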

Ethics

Why is Deep Learning a good approach to this problem?

Deep Learning is a good approach to this problem because a neural language model can learn vocabulary, style, and structure from a large body of example letters without hand-written rules. This will allow us to generate human-like love letters that meet the needs and interests of our users. It can also give users inspiration and expose them to alternative styles of letter writing.

Who are the major “stakeholders” in this problem, and what are the consequences of mistakes made by your algorithm?

The major “stakeholders” are the users who will rely on this program to create love letters. They are the ones who, for various reasons, will want to use a love letter generator powered by Deep Learning. The consequences of mistakes made by our algorithm are not severe, but a faulty or poorly written love letter can hurt a user's experience.