sst5-gpt2-kd / README.md
metadata
language: en
license: mit
library_name: pytorch

Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:

  • lr = 5e-05
  • per_device_batch_size = 2
  • gradient_accumulation_steps = 4
  • weight_decay = 1e-09
  • seed = 42
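
As a rough illustration of how these settings relate, the sketch below gathers the listed values into a small config object and computes the number of examples that contribute to each optimizer step. The class name `TrainerHyperparameters`, the method `effective_batch_size`, and the `num_devices` parameter are hypothetical names chosen for this example. The card does not state how many devices were used, so a single device is assumed.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class TrainerHyperparameters:
    """Hypothetical container for the values listed in this card."""

    lr: float = 5e-05
    per_device_batch_size: int = 2
    gradient_accumulation_steps: int = 4
    weight_decay: float = 1e-09
    seed: int = 42

    def effective_batch_size(self, num_devices: int = 1) -> int:
        # Gradients are accumulated over several small batches before each
        # optimizer step. num_devices is an assumption: the card does not
        # report the device count.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * num_devices
        )


hparams = TrainerHyperparameters()

# Seeds only Python's built-in RNG; an actual training run would also need
# to seed the deep learning framework in use.
random.seed(hparams.seed)

print(hparams.effective_batch_size())  # 2 * 4 * 1 = 8
```

Under the single-device assumption, each optimizer step therefore sees 8 examples, even though only 2 are held in memory per forward pass.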

| eval_loss | eval_accuracy | epoch |
|---|---|---|