Title
Communication with GPT-2
Who
Yuanqi Li (yli322), Yifan Jiang (yjiang79), Weike Dai (wdai8), Zhe Hu (zhu24)
Introduction
From what we have learned in class, we know that GPT-2 can be used to generate conversations. However, there is a significant problem: GPT-2 does not generate good responses unless we provide substantial context. When the user's input carries little or no context, GPT-2 tends to produce a vague, generic answer. Our solution is a "pre-processing" step that transforms the sentences given by a human user into more structured and detailed paragraphs, which can improve the quality of conversation generation if we perform fine-tuning correctly. Our project is thus an application of transfer learning in the NLP area.
Related Work
One of the papers we build on introduces a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo, which combines a transfer-learning-based training scheme with a high-capacity Transformer model. In this work, the authors take a step toward more consistent and relevant data-driven conversational agents by proposing a model architecture and associated training and generation algorithms that significantly improve over traditional seq2seq and information-retrieval baselines in terms of (i) relevance of the answer, (ii) coherence with a predefined personality and dialogue history, and (iii) grammaticality and fluency as evaluated by automatic metrics. Link: TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents
Data
We will use the TCS DATASET, an open-source dataset from Amazon Alexa. It provides data that enriches entities by attaching knowledge to them, drawn from Wikipedia. We chose this dataset because Alexa focuses heavily on chatbot tasks, and its dataset is comprehensive enough to provide the information we need. We do not yet have a precise figure for how much of the dataset we will use; we will determine that once we begin training and testing.
Methodology
The main task is to perform transfer learning: we use a pre-trained GPT-2 model to generate dialogue. After receiving input from the user, we pre-process the input sentences: we use traditional methods to extract the keywords of each sentence, then find and generate relevant information within our system. Afterward, we feed the processed data to GPT-2 and let it generate paragraphs. In short, our job is to find a good way to generate context for a given input, and to fine-tune the GPT-2 model so that it works better on our processed inputs.
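The pre-processing step described above can be sketched as follows. This is a minimal illustration, not our final implementation: the stopword list, the `extract_keywords` scorer (simple term frequency), and the `knowledge_base` lookup are all hypothetical stand-ins for the traditional keyword-extraction methods and the TCS knowledge retrieval we plan to use.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real pipeline would use a fuller one
# (e.g. from NLTK or spaCy).
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "in", "on", "of",
             "to", "and", "or", "what", "who", "how", "do", "does",
             "you", "i", "me", "my", "tell", "about"}

def extract_keywords(sentence: str, top_k: int = 3) -> list[str]:
    """Return up to top_k of the most frequent non-stopword tokens."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]

def build_context(sentence: str, knowledge_base: dict[str, str]) -> str:
    """Prepend knowledge-base entries matching the keywords, producing
    the enriched paragraph that would then be fed to GPT-2."""
    keywords = extract_keywords(sentence)
    facts = [knowledge_base[k] for k in keywords if k in knowledge_base]
    return " ".join(facts + [sentence])
```

For example, given a toy knowledge base mapping "jazz" to a Wikipedia-style fact, `build_context("Tell me about jazz music.", kb)` would emit that fact followed by the original question, so GPT-2 generates from an enriched prompt rather than the bare user input.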
Metrics
The target goal for this project is to improve GPT-2's language-modeling performance. We will evaluate our model by comparing language-modeling benchmarks before and after applying this transfer-learning approach. The test data is the conversation portion of the TCS DATASET, and the corresponding metrics are perplexity, Hits@1, and F1. Beyond the target goal, we have a stretch goal of improving performance on other NLP sub-tasks, so we will also measure these metrics on tasks such as reading comprehension, summarization, and translation to see whether there are any improvements.
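Two of these metrics have simple closed-form definitions that can be sketched directly; the snippet below shows the standard formulas, assuming per-token log-probabilities are available from the model (the function names are ours, not from any library).

```python
import math

def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity = exp of the mean negative log-likelihood per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def hits_at_1(ranked_candidates: list[list[str]], gold: list[str]) -> float:
    """Hits@1 = fraction of turns where the model's top-ranked candidate
    response equals the gold response."""
    correct = sum(1 for cands, g in zip(ranked_candidates, gold)
                  if cands[0] == g)
    return correct / len(gold)
```

For instance, a model that assigns every token probability 0.25 has perplexity 4, and ranking the gold response first on half of the turns gives Hits@1 of 0.5. F1 is the usual harmonic mean of precision and recall over response tokens.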
Ethics
Why is Deep Learning a good approach to this problem?
Deep learning is broadly used in NLP tasks, and GPT-2 has been shown to outperform other existing models, which makes it a strong basis for conversation generation. Moreover, deep learning provides a fine-tuning mechanism for improving pre-trained models, so we apply fine-tuning to further improve the results we get from GPT-2.
What is your dataset? Are there any concerns about how it was collected, or labeled? Is it representative? What kind of underlying historical or societal biases might it contain?
Our dataset comes from Amazon Alexa and is already open-source. Since Alexa is a cutting-edge product in the chatbot area, we consider the dataset credible: many researchers have already filtered and cleaned the data. That said, because its knowledge is drawn from Wikipedia, it may still reflect the topical and demographic biases of Wikipedia's contributors.