---
language: en
license: mit
library_name: pytorch
---
# Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:
- `lr` = 5e-05
- `per_device_batch_size` = 8
- `gradient_accumulation_steps` = 2
- `weight_decay` = 0.0
- `seed` = 42

|eval_loss|eval_accuracy|epoch|
|--|--|--|
|10.410|0.571|1.0|
|10.191|0.571|2.0|
|9.468|0.643|3.0|
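
With `per_device_batch_size` = 8 and `gradient_accumulation_steps` = 2, each optimizer step sees 16 examples per device. The following is a minimal sketch of that arithmetic, not the card's training code: the `TrainerHyperparameters` class, the `effective_batch_size` helper, and the `num_devices` parameter are hypothetical names introduced for illustration, and the card does not state how many devices were used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainerHyperparameters:
    """Hypothetical container; the default values are copied from the card."""
    lr: float = 5e-05
    per_device_batch_size: int = 8
    gradient_accumulation_steps: int = 2
    weight_decay: float = 0.0
    seed: int = 42


def effective_batch_size(hp: TrainerHyperparameters, num_devices: int = 1) -> int:
    """Return the number of examples contributing to one optimizer step.

    num_devices is an assumption: the card does not report the device count.
    """
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    return hp.per_device_batch_size * hp.gradient_accumulation_steps * num_devices


hp = TrainerHyperparameters()
print(effective_batch_size(hp))  # 8 * 2 * 1 = 16 on a single device
```

If training ran on several devices, pass the actual count as `num_devices` to scale the figure accordingly.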