Teleports to Our Report, GitHub Repository, Presentation, and Poster
- Click "Effectiveness of Meta-Learning with BERT" to read our report.
- Click "MetaBERT" to visit our GitHub repository.
- Click here to view our presentation.
- Click here to view our poster.
- Enjoy!
Effectiveness of Meta-Learning with BERT
- Tian Yun, Sinan Pehlivanoglu, Gaurav Sharma
Introduction
This project examines the effectiveness of state-of-the-art (SOTA) meta-learning algorithms applied on top of a pre-trained language model (BERT), first on natural language understanding (NLU) tasks (the GLUE benchmark) and then on multimodal tasks (e.g. question answering, image captioning).
Motivation
- There is a lot of work on meta-learning in computer vision (CV), but comparatively little in NLP.
- To examine whether a meta-learning algorithm on top of a pre-trained language model yields more general and robust word representations that help downstream tasks.
- To examine whether a meta-learning algorithm on top of a pre-trained language model converges faster than the vanilla pre-trained language model.
Related work
- One prior work that has tried meta-learning algorithms on top of BERT is Investigating Meta-Learning Algorithms for Low-Resource Natural Language Understanding Tasks.
- Survey paper about meta-learning algorithms: Meta-learning for Few-shot Natural Language Processing: A Survey
- Model-Agnostic Meta-Learning
- REPTILE
- LEOPARD
- SMLMT
Data
Unimodal Tasks
- GLUE
- SQuAD2.0: https://rajpurkar.github.io/SQuAD-explorer/
Multimodal Tasks (more search to be done)
- Meta-learned BERT + Inception → VQA/GQA
Methodology and Architecture
- Meta-learning algorithm on top of BERT
- MAML, REPTILE, LEOPARD, SMLMT
- Use the existing implementations to train the model on GLUE. This stage will be used to generate pre-trained word embeddings.
- Use the pre-trained word embeddings as input to the multimodal architecture.
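To make the meta-learning stage more concrete, here is a minimal sketch of the REPTILE update on a toy problem. It is not our BERT implementation: the scalar weight `w` stands in for the model parameters, each "task" is a hypothetical 1-D regression problem rather than a GLUE task, and all function names and hyperparameters are assumptions chosen for illustration.

```python
import random

def sample_task(rng):
    """Toy task (hypothetical stand-in for a GLUE task): fit y = a * x."""
    a = rng.uniform(0.5, 1.5)  # task-specific slope
    xs = [rng.uniform(-1.0, 1.0) for _ in range(10)]
    return [(x, a * x) for x in xs]

def sgd_steps(w, data, lr, steps):
    """Inner loop: a few gradient steps on mean squared error for one task."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def reptile(meta_steps=200, inner_steps=5, inner_lr=0.1, meta_lr=0.5, seed=0):
    """Outer loop: move the shared weight toward each task-adapted weight."""
    rng = random.Random(seed)
    w = 0.0  # shared initialization being meta-learned
    for _ in range(meta_steps):
        task = sample_task(rng)
        w_adapted = sgd_steps(w, task, inner_lr, inner_steps)
        w += meta_lr * (w_adapted - w)  # REPTILE meta-update
    return w

meta_w = reptile()
print(f"meta-learned initialization: {meta_w:.3f}")
```

In the real setting the same outer loop would presumably run over BERT's parameters, with each inner loop fine-tuning on batches from one GLUE task.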
Metrics and Experiments
- Experiment 1: Compare the different Meta-BERT variants on the GLUE dataset → train with the full training set.
- Experiment 2: Train the Meta-BERT variants with different proportions of the training set (e.g. 1%, 10%, 50%).
- Experiment 3: Extend the first two experiments to multimodal tasks.
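The reduced training sets for Experiment 2 could be drawn along the following lines. This is only a sketch: `subsample` is a hypothetical helper, the integer list is a placeholder for real training examples, and plain uniform sampling is an assumption (a stratified sampler may be a better fit for classification tasks with imbalanced labels).

```python
import random

def subsample(examples, fraction, seed=0):
    """Return a random subset holding `fraction` of the examples (at least one)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(examples) * fraction))
    # A fixed seed keeps the subset reproducible across runs.
    return random.Random(seed).sample(examples, k)

train = list(range(1000))  # placeholder for real training examples
subsets = {frac: subsample(train, frac) for frac in (0.01, 0.10, 0.50)}

for frac, subset in subsets.items():
    print(f"{frac:.0%} of training set: {len(subset)} examples")
```

Using the same seed for every Meta-BERT variant means each one is trained on an identical subset, so differences in the results are not caused by the sampling.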
Evaluation
- If a BERT baseline for the task doesn't exist, establish one.
- Run the same task with Meta-BERT and compare against that baseline.
- Repeat this process with different meta-learning algorithms and compare the results across experiments.
- Evaluation metrics depend on the tasks
Ethics
- How efficient is Meta-BERT? We plan to plot accuracy against training time to evaluate the efficiency of the model.
- How can this solution be applied to a larger social problem?
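For the efficiency question above, one simple number to read off an accuracy versus training time curve is the time needed to reach a target accuracy. The sketch below assumes training is logged as `(elapsed_seconds, accuracy)` checkpoints. The helper name and the values in `history` are made-up placeholders, not experimental results.

```python
def time_to_accuracy(history, target):
    """Return the earliest elapsed time at which accuracy reached `target`, or None."""
    for elapsed, accuracy in history:
        if accuracy >= target:
            return elapsed
    return None

# Placeholder checkpoints for illustration only (not measured values).
history = [(60.0, 0.70), (120.0, 0.80), (180.0, 0.85)]

print(time_to_accuracy(history, 0.80))
print(time_to_accuracy(history, 0.95))
```

Computing this for the BERT baseline and for each Meta-BERT variant at the same target would give a single comparable efficiency figure per model.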
Division of Labor
- For the reflection checkpoint, we will work on MAML-BERT together.
Built With
- python
- tensorflow

