---
language: en
tags:
- bert
- sst2
- glue
- torchdistill
license: apache-2.0
datasets:
- sst2
metrics:
- accuracy
---

`bert-large-uncased` fine-tuned on the SST-2 dataset using [***torchdistill***](https://github.com/yoshitomo-matsubara/torchdistill) and [Google Colab](https://colab.research.google.com/github/yoshitomo-matsubara/torchdistill/blob/master/demo/glue_finetuning_and_submission.ipynb).

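A minimal sketch of how a checkpoint like this one might be loaded for inference with the Hugging Face `transformers` library. The repository ID in `MODEL_ID` is an assumption (this card does not state it), and the label order follows the GLUE SST-2 convention (0 = negative, 1 = positive); check both against the actual repository and its `config.json` before relying on them. `classify` and `logits_to_label` are hypothetical helper names introduced for illustration.

```python
from typing import Sequence

# Assumption: Hub repository ID, not stated in this card. Replace as needed.
MODEL_ID = "yoshitomo-matsubara/bert-large-uncased-sst2"

# Assumption: GLUE SST-2 label convention (0 = negative, 1 = positive).
SST2_LABELS = ("negative", "positive")


def logits_to_label(logits: Sequence[float]) -> str:
    """Map a pair of class logits to a label name via argmax."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return SST2_LABELS[best]


def classify(sentence: str, model_id: str = MODEL_ID) -> str:
    """Download the checkpoint (if not cached) and classify one sentence."""
    # Imported lazily so logits_to_label works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    return logits_to_label(logits.tolist())
```

Calling `classify("a charming and often affecting journey")` downloads the weights on first use, so it needs network access and enough memory for a BERT-large model.
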
The hyperparameters follow Hugging Face's example and/or the BERT paper. The full training configuration (including hyperparameters) is available [here](https://github.com/yoshitomo-matsubara/torchdistill/blob/main/configs/sample/glue/sst2/ce/bert_large_uncased.yaml).

I submitted prediction files to [the GLUE leaderboard](https://gluebenchmark.com/leaderboard), and the overall GLUE score was **80.2**.

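For readers who want to prepare a similar submission, the sketch below writes test-set predictions to a tab-separated file. It assumes the commonly used GLUE layout of an `index` column and a `prediction` column, and the file name `SST-2.tsv`; neither is specified in this card, so confirm both against the leaderboard's submission instructions. `write_glue_predictions` is a hypothetical helper, not part of torchdistill.

```python
import csv
from typing import Sequence


def write_glue_predictions(predictions: Sequence[int], path: str) -> None:
    """Write one row per test example: its position and its predicted label."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        # Assumption: header row expected by the GLUE submission format.
        writer.writerow(["index", "prediction"])
        for index, label in enumerate(predictions):
            writer.writerow([index, label])


if __name__ == "__main__":
    # Hypothetical predictions for three test sentences (1 = positive).
    write_glue_predictions([1, 0, 1], "SST-2.tsv")
```

The predictions must be in the same order as the test split, since the row index is the only link between a prediction and its example.
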
Yoshitomo Matsubara: **"torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP"** at *EMNLP 2023 Workshop for Natural Language Processing Open Source Software (NLP-OSS)*

[[OpenReview](https://openreview.net/forum?id=A5Axeeu1Bo)] [[Preprint](https://arxiv.org/abs/2310.17644)]

```bibtex
@article{matsubara2023torchdistill,
  title={{torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP}},
  author={Matsubara, Yoshitomo},
  journal={arXiv preprint arXiv:2310.17644},
  year={2023}
}
```