---
language: en
license: mit
library_name: pytorch
---
# Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:
- `lr` = 5e-05
- `per_device_batch_size` = 8
- `gradient_accumulation_steps` = 2
- `weight_decay` = 0.0
- `seed` = 42

|eval_loss|eval_accuracy|epoch|
|--|--|--|
|10.410|0.571|1.0|
|10.191|0.571|2.0|
|9.468|0.643|3.0|
|10.414|0.571|4.0|
|10.468|0.571|5.0|
|10.335|0.571|6.0|
|10.296|0.571|7.0|
|9.998|0.571|8.0|
|10.080|0.571|9.0|
|10.186|0.571|10.0|
|9.862|0.571|11.0|
|10.713|0.500|12.0|
|9.873|0.571|13.0|
|9.905|0.571|14.0|
|9.860|0.571|15.0|
|9.997|0.571|16.0|
|9.823|0.571|17.0|
|9.840|0.571|18.0|
|9.817|0.571|19.0|
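
The hyperparameters and evaluation log above can be read with a short script. The following is a minimal sketch, not the training code used for this model. It computes the number of samples per optimizer step and finds the epoch with the lowest `eval_loss`. The helper names `effective_batch_size` and `best_epoch` are hypothetical, and the default of one device is an assumption, since the card does not state how many devices were used.

```python
# Hypothetical helpers for reading this card; not part of the original training code.

def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Samples contributing to one optimizer step.

    num_devices=1 is an assumption; the card does not report the device count.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# (eval_loss, eval_accuracy, epoch) rows copied from the table in this card.
EVAL_LOG = [
    (10.410, 0.571, 1.0),
    (10.191, 0.571, 2.0),
    (9.468, 0.643, 3.0),
    (10.414, 0.571, 4.0),
    (10.468, 0.571, 5.0),
    (10.335, 0.571, 6.0),
    (10.296, 0.571, 7.0),
    (9.998, 0.571, 8.0),
    (10.080, 0.571, 9.0),
    (10.186, 0.571, 10.0),
    (9.862, 0.571, 11.0),
    (10.713, 0.500, 12.0),
    (9.873, 0.571, 13.0),
    (9.905, 0.571, 14.0),
    (9.860, 0.571, 15.0),
    (9.997, 0.571, 16.0),
    (9.823, 0.571, 17.0),
    (9.840, 0.571, 18.0),
    (9.817, 0.571, 19.0),
]


def best_epoch(rows):
    """Return the (eval_loss, eval_accuracy, epoch) row with the lowest eval_loss."""
    return min(rows, key=lambda row: row[0])


if __name__ == "__main__":
    # per_device_batch_size = 8, gradient_accumulation_steps = 2 (from the card)
    print("samples per optimizer step:", effective_batch_size(8, 2))
    loss, acc, epoch = best_epoch(EVAL_LOG)
    print(f"lowest eval_loss {loss:.3f} at epoch {epoch:.0f} (eval_accuracy {acc:.3f})")
```

With the listed values and a single device, each optimizer step sees 16 samples. The lowest `eval_loss` in the table (9.468) occurs at epoch 3, which is also the only epoch with `eval_accuracy` of 0.643.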