Plainly Optimized Network
Dataset: BIGBENCH
Trainer Hyperparameters:
- lr = 5e-05
- per_device_batch_size = 1
- gradient_accumulation_steps = 4
- weight_decay = 1e-09
- seed = 42

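
The batch size and gradient accumulation settings above together determine how many examples contribute to each optimizer step. A minimal sketch of that arithmetic follows; `num_devices` and the helper `effective_batch_size` are hypothetical, since the card does not state how many devices were used, and a single device is assumed here.

```python
# Values copied from the hyperparameter list in this card.
per_device_batch_size = 1
gradient_accumulation_steps = 4

# Assumption: the card does not state the device count; 1 is a placeholder.
num_devices = 1


def effective_batch_size(per_device: int, accumulation_steps: int, devices: int) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device * accumulation_steps * devices


print(effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices))
```

Under the single-device assumption this gives 4 examples per optimizer step; with more devices the figure scales linearly.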
eval_loss | eval_accuracy | epoch |
---|---|---|
58.940 | 0.054 | 1.0 |
54.182 | 0.049 | 2.0 |
56.362 | 0.051 | 3.0 |
52.705 | 0.046 | 4.0 |
55.357 | 0.050 | 5.0 |
53.973 | 0.048 | 6.0 |
56.034 | 0.050 | 7.0 |
51.731 | 0.045 | 8.0 |
54.661 | 0.048 | 9.0 |
50.378 | 0.043 | 10.0 |
51.579 | 0.044 | 11.0 |
51.193 | 0.044 | 12.0 |
52.724 | 0.046 | 13.0 |
52.055 | 0.045 | 14.0 |
51.406 | 0.044 | 15.0 |
51.539 | 0.045 | 16.0 |
52.422 | 0.046 | 17.0 |
50.304 | 0.043 | 18.0 |
50.937 | 0.044 | 19.0 |
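
To pick a checkpoint from the table above, one option is to rank the epochs by each metric. The sketch below hard-codes the reported rows and finds the epoch with the lowest `eval_loss` and the one with the highest `eval_accuracy`; the variable names are illustrative and not part of any training script for this model.

```python
# Rows copied from the evaluation table: (eval_loss, eval_accuracy, epoch).
results = [
    (58.940, 0.054, 1.0), (54.182, 0.049, 2.0), (56.362, 0.051, 3.0),
    (52.705, 0.046, 4.0), (55.357, 0.050, 5.0), (53.973, 0.048, 6.0),
    (56.034, 0.050, 7.0), (51.731, 0.045, 8.0), (54.661, 0.048, 9.0),
    (50.378, 0.043, 10.0), (51.579, 0.044, 11.0), (51.193, 0.044, 12.0),
    (52.724, 0.046, 13.0), (52.055, 0.045, 14.0), (51.406, 0.044, 15.0),
    (51.539, 0.045, 16.0), (52.422, 0.046, 17.0), (50.304, 0.043, 18.0),
    (50.937, 0.044, 19.0),
]

# Epoch with the lowest evaluation loss.
best_by_loss = min(results, key=lambda row: row[0])

# Epoch with the highest evaluation accuracy.
best_by_accuracy = max(results, key=lambda row: row[1])

print("lowest eval_loss:", best_by_loss)
print("highest eval_accuracy:", best_by_accuracy)
```

The two criteria disagree for these numbers: the lowest loss is at epoch 18.0 and the highest accuracy is at epoch 1.0, so the choice of metric matters when selecting a checkpoint.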