Inspiration

As a team, we've always been curious about the power and potential of cutting-edge machine learning. One facet that captured our attention in particular was sequence-to-sequence networks, which take text as input and generate text as output. Paired with an interest in online data, we wondered whether we could create a neural network that would learn to text like us, capturing the vocabulary we use and the general tone we speak in.

What it does

DigitalMe is a smart chatbot with a personality - your personality. When a user starts the conversation by entering a sentence, the machine learning model in DigitalMe processes it and produces a response in the style of a particular individual, attempting to capture their texting habits and style. As the conversation progresses, DigitalMe remembers context and is able to hold conversations on a broad range of topics.

How we built it

We first downloaded each team member's chat logs from Facebook Messenger. Each member's chat data was then parsed to remove unrecognizable characters and reformatted from JSON into a plain text file that GPT-2 could understand. Next, we leveraged the GPT-2 library in Python and performed transfer learning to fine-tune it on the processed datasets, running multiple training iterations and testing different hyperparameters in an effort to minimize loss as best we could. Once training was complete and we were satisfied with the state of the model, we saved the parameters and downloaded them to be incorporated into a chatbot program that passes user messages into the trained model and presents its outputs.
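
The write-up above doesn't pin down the exact package, but here is a minimal sketch of that pipeline assuming the gpt-2-simple wrapper; the helper name, file paths, run name, and step count are illustrative placeholders, not the exact values we used.

```python
# Sketch of the data prep + transfer-learning pipeline described above,
# assuming the gpt-2-simple package. Paths, run name, and step count are
# illustrative placeholders.
import glob
import json

import gpt_2_simple as gpt2


def messenger_json_to_text(json_dir, out_path):
    """Flatten a Facebook Messenger export into one plain-text training file."""
    lines = []
    for path in sorted(glob.glob(f"{json_dir}/**/message_*.json", recursive=True)):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        # The export lists messages newest-first; reverse for chronological order.
        for msg in reversed(data.get("messages", [])):
            content = msg.get("content")
            if not content:
                continue  # skip photos, stickers, reactions, etc.
            try:
                # Facebook's export mangles non-ASCII text; undo the double encoding.
                content = content.encode("latin-1").decode("utf-8")
            except (UnicodeEncodeError, UnicodeDecodeError):
                pass  # already clean, or unrecoverable -- keep as-is
            lines.append(f"{msg['sender_name']}: {content}")
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))


messenger_json_to_text("messages/inbox", "chats.txt")

gpt2.download_gpt2(model_name="355M")      # pretrained checkpoint to fine-tune from
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="chats.txt",
              model_name="355M",
              steps=1000,                  # tune against available GPU time
              run_name="digitalme",
              save_every=200,
              print_every=50)

# Generate a reply in the fine-tuned "voice".
print(gpt2.generate(sess,
                    run_name="digitalme",
                    prefix="Friend: want to grab lunch?\nMe:",
                    length=60,
                    temperature=0.8,
                    return_as_list=True)[0])
```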

To round out the product, we also began work on a full-stack web app using Reactjs and Flask; however, due to time constraints, we were unable to fully complete it.
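
As a rough illustration of the intended backend, here is a minimal sketch of a Flask API the React front end would call; the /chat route, payload shape, and checkpoint name are assumptions, not the finished app.

```python
# app.py -- minimal sketch of the intended Flask backend for the React front end.
# Assumes the fine-tuned checkpoint was saved under checkpoint/digitalme
# (gpt-2-simple's default layout); the route and payload shape are placeholders.
from flask import Flask, jsonify, request
import gpt_2_simple as gpt2

app = Flask(__name__)

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name="digitalme")   # restore the saved parameters once at startup


@app.route("/chat", methods=["POST"])
def chat():
    message = request.get_json().get("message", "")
    prompt = f"Friend: {message}\nMe:"
    reply = gpt2.generate(sess,
                          run_name="digitalme",
                          prefix=prompt,
                          length=60,
                          temperature=0.8,
                          return_as_list=True)[0]
    # The raw output echoes the prompt, so trim it before returning.
    return jsonify({"reply": reply[len(prompt):].strip()})


if __name__ == "__main__":
    app.run(port=5000)
```

A React component would then simply POST {"message": "..."} to /chat and render the returned reply field.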

Challenges we ran into

  • GPT-2 offers multiple models, such as 124M or 774M, with the name corresponding to the number of parameters (in millions). As the parameter count grows, so does the size of the trained model on disk. The model we ultimately chose (355M) produces checkpoint files as large as 1.5GB, making GitHub uploads and file transfers time-consuming.

  • The size of the data also gave us issues. We trained the model on chat logs going back 10+ years, which brought the input data to roughly 20MB. As a result, the model took approximately an hour of training to produce satisfactory results.

  • TensorFlow and Python must be on compatible versions (1.15 and 3.7, respectively) to work with GPT-2, which took some time to debug.

Accomplishments that we're proud of

The trained model returns surprisingly great results: it responds in coherent sentences and remembers context from earlier in the conversation.

What we learned

  • Transfer learning is incredibly useful when prototyping a Machine Learning project similar in purpose to a prior one, offering fast training and cutting development time.

  • Google Colab is very helpful when training Machine Learning models, providing free GPU resources to speed up the process.

  • A machine learning model with more parameters is able to retain more of the conversation's context.

Next Steps

  • Dynamically adapt the model so it can be trained on any user's messaging data, capturing their individual texting habits

  • Add more parameters to the Machine Learning model, so it can remember more about the context of the conversation.

  • Integrate the application with Facebook Messenger through webhooks to create an automated chatbot (a rough sketch of the webhook follows this list)

  • Create a Flask application to serve as a backend connecting the ML system to Facebook Messenger

  • Deploy the Flask application to a cloud computing platform

  • Dockerize the Flask application to reduce environment-specific issues across developer machines and deployments
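
For the Messenger integration above (not built yet), here is a minimal sketch of what the webhook endpoint could look like; the environment variable names, Graph API version, and the generate_reply placeholder are assumptions rather than anything we shipped.

```python
# webhook.py -- rough sketch of the planned Messenger webhook (not yet implemented).
# Assumes a Facebook app configured with VERIFY_TOKEN / PAGE_ACCESS_TOKEN; the
# Graph API version and generate_reply placeholder are illustrative.
import os

import requests
from flask import Flask, request

app = Flask(__name__)
VERIFY_TOKEN = os.environ["VERIFY_TOKEN"]
PAGE_ACCESS_TOKEN = os.environ["PAGE_ACCESS_TOKEN"]


def generate_reply(text):
    # Placeholder: the real bot would call the fine-tuned GPT-2 model here
    # (see the generation sketch under "How we built it").
    return f"Echo: {text}"


@app.route("/webhook", methods=["GET"])
def verify():
    # Messenger sends a one-time GET to confirm the webhook during setup.
    if request.args.get("hub.verify_token") == VERIFY_TOKEN:
        return request.args.get("hub.challenge", "")
    return "invalid verify token", 403


@app.route("/webhook", methods=["POST"])
def receive():
    payload = request.get_json()
    for entry in payload.get("entry", []):
        for event in entry.get("messaging", []):
            sender_id = event["sender"]["id"]
            text = event.get("message", {}).get("text")
            if text:
                send_reply(sender_id, generate_reply(text))
    return "ok", 200


def send_reply(recipient_id, text):
    # Push the reply back through the Messenger Send API.
    requests.post(
        "https://graph.facebook.com/v12.0/me/messages",
        params={"access_token": PAGE_ACCESS_TOKEN},
        json={"recipient": {"id": recipient_id}, "message": {"text": text}},
    )
```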
