Links to our Report, GitHub Repository, Presentation, and Poster

Effectiveness of Meta-Learning with BERT

  • Tian Yun, Sinan Pehlivanoglu, Gaurav Sharma

Introduction

This project examines the effectiveness of state-of-the-art (SOTA) meta-learning algorithms applied on top of a pre-trained language model (BERT) on natural language understanding (NLU) tasks (the GLUE benchmark), and further on multimodal tasks (e.g., question answering, image captioning).

Motivation

  • Meta-learning has been studied extensively in computer vision (CV), but much less in NLP.
  • To examine whether a meta-learning algorithm on top of a pre-trained language model yields more general and robust word representations that help downstream tasks.
  • To examine whether a meta-learning algorithm on top of a pre-trained language model converges faster than the vanilla pre-trained language model.

Related work

Data

Unimodal Tasks - GLUE

Methodology and Architecture

  • Meta-learning algorithm on top of BERT
  • Candidate algorithms: MAML, REPTILE, LEOPARD, SMLMT
  • Use the existing implementations to train the model on GLUE. This stage generates the pre-trained word embeddings.
  • Use the pre-trained word embeddings as input to the multimodal architecture
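
To make the meta-learning step concrete, here is a minimal sketch of the Reptile outer update, assuming a toy one-parameter regression problem in place of BERT and GLUE. The helper names (`sgd_steps`, `reptile`, `make_task`) and all hyperparameters are hypothetical and chosen only for illustration; this is not the project's implementation. Each meta-iteration adapts a copy of the weights to one sampled task with a few SGD steps, then moves the shared weights a fraction of the way toward the adapted weights.

```python
import random

def sgd_steps(w, task, lr, steps):
    """Inner loop: a few SGD steps on one task (fit y = w * x by squared error)."""
    for _ in range(steps):
        # Gradient of the mean of 0.5 * (w*x - y)^2 with respect to w.
        grad = sum((w * x - y) * x for x, y in task) / len(task)
        w -= lr * grad
    return w

def reptile(tasks, w0=0.0, inner_lr=0.1, meta_lr=0.5,
            inner_steps=5, meta_iters=200, seed=0):
    """Outer loop: Reptile update w <- w + meta_lr * (w_adapted - w)."""
    rng = random.Random(seed)
    w = w0
    for _ in range(meta_iters):
        task = rng.choice(tasks)                      # sample one task
        w_adapted = sgd_steps(w, task, inner_lr, inner_steps)
        w += meta_lr * (w_adapted - w)                # move toward adapted weights
    return w

def make_task(slope):
    """Toy task (assumption): noiseless points on the line y = slope * x."""
    xs = [-1.0, -0.5, 0.5, 1.0]
    return [(x, slope * x) for x in xs]

tasks = [make_task(s) for s in (1.0, 2.0, 3.0)]
meta_w = reptile(tasks)  # an initialization lying between the task optima
```

With a real model, `w` would be the full set of BERT parameters and each task would be a GLUE task or a batch drawn from one. Unlike MAML, this update needs no second-order gradients.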

Metrics and Experiments

  • Experiment 1: Compare different META-BERTs on the GLUE dataset, training with the full training set.
  • Experiment 2: Train META-BERTs with different proportions of the training set (e.g., 1%, 10%, 50%).
  • Experiment 3: Extend the first two experiments to multimodal tasks.
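
One way the reduced training sets for Experiment 2 could be drawn is sketched below. The function name `subsample` and the fixed seed are assumptions, not part of the project code. Shuffling once with a fixed seed and taking a prefix keeps the subsets nested (the 1% subset is contained in the 10% subset), so results across proportions differ only by the added data.

```python
import random

def subsample(examples, fraction, seed=0):
    """Return a reproducible subset holding `fraction` of the examples."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    shuffled = list(examples)                  # copy; leave the input untouched
    random.Random(seed).shuffle(shuffled)      # same seed -> same order every run
    k = max(1, round(len(shuffled) * fraction))
    return shuffled[:k]                        # prefix, so smaller subsets are nested

# Stand-in for a training set: 1000 example ids.
train_ids = list(range(1000))
subsets = {p: subsample(train_ids, p) for p in (0.01, 0.10, 0.50)}
```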

Evaluation

  • If a BERT baseline for the task doesn't exist, establish one
  • Run the same task with META-BERT and compare with that baseline
  • Repeat this process with different meta-learning algorithms and compare the results of the experiments
  • Evaluation metrics depend on the task
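
As an example of a task-dependent metric: in GLUE, CoLA is scored with the Matthews correlation coefficient, while most other classification tasks use accuracy. The sketch below computes both for binary labels in plain Python. The function names are our own, and a real evaluation would more likely rely on an existing metrics library.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

gold = [1, 1, 0, 0]
pred = [1, 0, 0, 0]
acc = accuracy(gold, pred)
mcc = matthews_corrcoef(gold, pred)
```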

Ethics

  • How efficient is Meta-BERT? We plan to plot accuracy against training time to evaluate the efficiency of the model.
  • How can this solution be applied to a larger social problem?
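
The accuracy versus training time comparison could also be summarized by the time each model needs to reach a target accuracy. The sketch below assumes a training log stored as (elapsed seconds, validation accuracy) pairs. The helper name `time_to_accuracy` is hypothetical, and the log values are placeholders, not measurements from this project.

```python
def time_to_accuracy(log, target):
    """Return the first elapsed time at which accuracy reaches `target`.

    `log` is a list of (elapsed_seconds, accuracy) pairs sorted by time.
    Returns None if the target is never reached.
    """
    for seconds, acc in log:
        if acc >= target:
            return seconds
    return None

# Placeholder values for illustration only; not results from this project.
example_log = [(60, 0.55), (120, 0.70), (180, 0.78), (240, 0.81)]
reached_at = time_to_accuracy(example_log, 0.75)
never = time_to_accuracy(example_log, 0.95)
```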

Division of Labor

  • For the reflection checkpoint, we will work on MAML-BERT together.

Updates

Project Checkin 2

Introduction:

This project examines the effectiveness of state-of-the-art (SOTA) meta-learning algorithms applied on top of a pre-trained language model (BERT) on natural language understanding (NLU) tasks (the GLUE benchmark), and further on multimodal tasks (e.g., question answering, image captioning).

Challenges:

The biggest challenges so far have been resource management and debugging performance issues. Our program consumes large amounts of memory and disk space. Because we are using a pre-trained model, we have little insight into what is going on under the hood or how resource usage could be managed better.

Insights:

Unfortunately, we have yet to run our program on the full dataset. Our code runs and produces results on a subset of the data but, as mentioned above, it crashes on the full dataset. There are no viable reference points for expected performance on a subset of the data, so we cannot assess the model's performance at this time.

Plan:

The top priority is to fix the performance issues, since the next step is to run a large number of experiments and we cannot do that without fully functioning code. The original plan was to test and compare four implementations (MAML, REPTILE, LEOPARD, SMLMT). In light of the current setbacks, we have decided to drop LEOPARD and SMLMT and focus on MAML and REPTILE.