sst5-gpt2-kd / README.md
---
language: en
license: mit
library_name: pytorch
---

Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:

  • lr = 5e-05
  • per_device_batch_size = 1
  • gradient_accumulation_steps = 4
  • weight_decay = 1e-09
  • seed = 42
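
The hyperparameters above can be collected into a plain dictionary, which also makes the effective batch size explicit. This is a minimal sketch, not the author's training script: the key names mirror the parameter names of `transformers.TrainingArguments` on the assumption that the Hugging Face `Trainer` was used (the card does not confirm this), and `num_devices=1` is likewise an assumption, since the card does not state the device count.

```python
# Hyperparameters copied from the list above.
# Key names follow transformers.TrainingArguments (assumed, not confirmed by the card).
hparams = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "weight_decay": 1e-09,
    "seed": 42,
}


def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of examples contributing to each optimizer step.

    num_devices is hypothetical: the model card does not say how many devices were used.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


batch = effective_batch_size(
    hparams["per_device_train_batch_size"],
    hparams["gradient_accumulation_steps"],
)
print(f"effective batch size per optimizer step: {batch}")
```

With a per-device batch size of 1 and 4 accumulation steps, gradients from 4 examples are accumulated before each weight update on a single device.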
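
| eval_loss | eval_mse | epoch |
|-----------|----------|-------|
| 58.741 | 0.055 | 1.0 |
| 60.624 | 0.058 | 2.0 |
| 60.765 | 0.057 | 3.0 |
| 55.858 | 0.051 | 4.0 |
| 57.271 | 0.053 | 5.0 |
| 56.004 | 0.051 | 6.0 |
| 60.246 | 0.056 | 7.0 |
| 55.218 | 0.049 | 8.0 |
| 55.261 | 0.049 | 9.0 |
| 54.730 | 0.049 | 10.0 |
| 58.137 | 0.052 | 11.0 |
| 53.927 | 0.048 | 12.0 |
| 56.143 | 0.051 | 13.0 |
| 54.604 | 0.049 | 14.0 |
| 53.596 | 0.048 | 15.0 |
| 54.241 | 0.049 | 16.0 |
| 55.500 | 0.050 | 17.0 |
| 53.256 | 0.047 | 18.0 |
| 53.139 | 0.047 | 19.0 |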
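
The evaluation log above is not monotonic, so it can help to pick the best checkpoint programmatically. The following is a small sketch that copies the reported rows into a list and selects the epoch with the lowest `eval_loss`; the variable names are illustrative and not taken from the original training code.

```python
# (eval_loss, eval_mse, epoch) rows copied from the evaluation log above.
rows = [
    (58.741, 0.055, 1.0), (60.624, 0.058, 2.0), (60.765, 0.057, 3.0),
    (55.858, 0.051, 4.0), (57.271, 0.053, 5.0), (56.004, 0.051, 6.0),
    (60.246, 0.056, 7.0), (55.218, 0.049, 8.0), (55.261, 0.049, 9.0),
    (54.730, 0.049, 10.0), (58.137, 0.052, 11.0), (53.927, 0.048, 12.0),
    (56.143, 0.051, 13.0), (54.604, 0.049, 14.0), (53.596, 0.048, 15.0),
    (54.241, 0.049, 16.0), (55.500, 0.050, 17.0), (53.256, 0.047, 18.0),
    (53.139, 0.047, 19.0),
]

# Row with the lowest evaluation loss.
best_loss, best_mse, best_epoch = min(rows, key=lambda r: r[0])

# Absolute improvement in eval_loss between the first and last logged epochs.
improvement = rows[0][0] - rows[-1][0]

print(f"best epoch: {best_epoch} (eval_loss={best_loss}, eval_mse={best_mse})")
print(f"eval_loss change from epoch 1 to epoch 19: {improvement:.3f}")
```

On the reported numbers, the lowest `eval_loss` is 53.139 at epoch 19, the last logged epoch, with `eval_mse` at 0.047.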