---
language: en
tags:
- bert
- qqp
- glue
- kd
- torchdistill
license: apache-2.0
datasets:
- qqp
metrics:
- f1
- accuracy
---
This model is `bert-base-uncased` fine-tuned on the QQP dataset by knowledge distillation, using a fine-tuned `bert-large-uncased` as the teacher model, with [***torchdistill***](https://github.com/yoshitomo-matsubara/torchdistill) and [Google Colab](https://colab.research.google.com/github/yoshitomo-matsubara/torchdistill/blob/master/demo/glue_kd_and_submission.ipynb).
The training configuration (including hyperparameters) is available [here](https://github.com/yoshitomo-matsubara/torchdistill/blob/main/configs/sample/glue/qqp/kd/bert_base_uncased_from_bert_large_uncased.yaml).
I submitted prediction files to [the GLUE leaderboard](https://gluebenchmark.com/leaderboard), and the overall GLUE score was **78.9**.
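Below is a minimal usage sketch with the `transformers` library, assuming the model is published on the Hugging Face Hub; the repo id used here is a placeholder for illustration, not confirmed by this card.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo id; replace with the actual model id on the Hub.
model_id = "yoshitomo-matsubara/bert-base-uncased-qqp"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# QQP is a sentence-pair task: predict whether two questions are duplicates.
question1 = "How do I learn Python quickly?"
question2 = "What is the fastest way to learn Python?"
inputs = tokenizer(question1, question2, return_tensors="pt", truncation=True)
outputs = model(**inputs)
predicted_label = outputs.logits.argmax(dim=-1).item()
print(model.config.id2label.get(predicted_label, predicted_label))
```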
Yoshitomo Matsubara: **"torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP"** at *EMNLP 2023 Workshop for Natural Language Processing Open Source Software (NLP-OSS)*
[[Paper](https://aclanthology.org/2023.nlposs-1.18/)] [[OpenReview](https://openreview.net/forum?id=A5Axeeu1Bo)] [[Preprint](https://arxiv.org/abs/2310.17644)]
```bibtex
@inproceedings{matsubara2023torchdistill,
  title={{torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP}},
  author={Matsubara, Yoshitomo},
  booktitle={Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)},
  publisher={Empirical Methods in Natural Language Processing},
  pages={153--164},
  year={2023}
}
```