---
language: en
license: mit
library_name: pytorch
---

# Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:
- `lr` = 5e-05
- `per_device_batch_size` = 1
- `gradient_accumulation_steps` = 4
- `weight_decay` = 1e-09
- `seed` = 42

|eval_loss|eval_accuracy|epoch|
|--|--|--|
|66.323|0.063|1.0|
|59.935|0.055|2.0|
|60.344|0.056|3.0|
|58.559|0.054|4.0|
|56.373|0.051|5.0|
|58.011|0.053|6.0|
|64.814|0.059|7.0|
|54.974|0.048|8.0|
|59.489|0.055|9.0|
|55.248|0.049|10.0|
|51.685|0.044|11.0|
|54.073|0.048|12.0|
|57.350|0.051|13.0|
|54.031|0.048|14.0|
|53.526|0.048|15.0|
|53.041|0.047|16.0|
|55.731|0.050|17.0|
|52.224|0.045|18.0|
|52.757|0.046|19.0|
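
The hyperparameters and evaluation results above can be read programmatically. The following is a minimal sketch, not the training script used for this model: it derives the effective batch size from `per_device_batch_size` and `gradient_accumulation_steps`, and picks the epoch with the lowest `eval_loss`. The device count (`num_devices = 1`) is an assumption, since the card does not state it, and the helper name `effective_batch_size` is hypothetical.

```python
# Hyperparameters copied from the list in this card.
hparams = {
    "lr": 5e-05,
    "per_device_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "weight_decay": 1e-09,
    "seed": 42,
}

# Assumption: training ran on a single device; the card does not say.
num_devices = 1


def effective_batch_size(hp, n_devices):
    """Number of examples contributing to each optimizer step."""
    return hp["per_device_batch_size"] * hp["gradient_accumulation_steps"] * n_devices


# (epoch, eval_loss) pairs copied from the evaluation table in this card.
eval_loss_by_epoch = [
    (1, 66.323), (2, 59.935), (3, 60.344), (4, 58.559), (5, 56.373),
    (6, 58.011), (7, 64.814), (8, 54.974), (9, 59.489), (10, 55.248),
    (11, 51.685), (12, 54.073), (13, 57.350), (14, 54.031), (15, 53.526),
    (16, 53.041), (17, 55.731), (18, 52.224), (19, 52.757),
]

# Select the row with the smallest eval_loss.
best_epoch, best_loss = min(eval_loss_by_epoch, key=lambda row: row[1])

print("effective batch size:", effective_batch_size(hparams, num_devices))
print("lowest eval_loss:", best_loss, "at epoch", best_epoch)
```

Under the single-device assumption, each optimizer step sees 4 examples; with more devices the value scales linearly with `num_devices`.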