---
tags:
- generated_from_trainer
datasets:
- oscar-corpus/OSCAR-2109
model-index:
- name: runs
  results: []
---

# runs

This model was trained from scratch on the oscar-corpus/OSCAR-2109 deduplicated_lo dataset.
It achieves the following results on the evaluation set:
- Loss: 1.4556

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- distributed_type: tpu
- num_devices: 8
- total_train_batch_size: 1024
- total_eval_batch_size: 1024
- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 30.0

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 216  | 5.8586          |
| No log        | 2.0   | 432  | 5.5095          |
| 6.688         | 3.0   | 648  | 5.3976          |
| 6.688         | 4.0   | 864  | 5.3562          |
| 5.3629        | 5.0   | 1080 | 5.2912          |
| 5.3629        | 6.0   | 1296 | 5.2385          |
| 5.22          | 7.0   | 1512 | 5.1955          |
| 5.22          | 8.0   | 1728 | 5.1785          |
| 5.22          | 9.0   | 1944 | 5.1327          |
| 5.1248        | 10.0  | 2160 | 5.1243          |
| 5.1248        | 11.0  | 2376 | 5.0889          |
| 5.0591        | 12.0  | 2592 | 5.0732          |
| 5.0591        | 13.0  | 2808 | 5.0417          |
| 5.0094        | 14.0  | 3024 | 5.0388          |
| 5.0094        | 15.0  | 3240 | 4.9299          |
| 5.0094        | 16.0  | 3456 | 4.2991          |
| 4.7527        | 17.0  | 3672 | 3.6541          |
| 4.7527        | 18.0  | 3888 | 2.7826          |
| 3.4431        | 19.0  | 4104 | 2.2796          |
| 3.4431        | 20.0  | 4320 | 2.0213          |
| 2.2803        | 21.0  | 4536 | 1.8809          |
| 2.2803        | 22.0  | 4752 | 1.7615          |
| 2.2803        | 23.0  | 4968 | 1.6925          |
| 1.8601        | 24.0  | 5184 | 1.6205          |
| 1.8601        | 25.0  | 5400 | 1.5751          |
| 1.6697        | 26.0  | 5616 | 1.5391          |
| 1.6697        | 27.0  | 5832 | 1.5200          |
| 1.5655        | 28.0  | 6048 | 1.4866          |
| 1.5655        | 29.0  | 6264 | 1.4656          |
| 1.5655        | 30.0  | 6480 | 1.4627          |

### Framework versions

- Transformers 4.13.0.dev0
- Pytorch 1.9.0+cu102
- Datasets 1.16.1
- Tokenizers 0.10.3
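
As a rough illustration of how the values under "Training hyperparameters" fit together, the sketch below recomputes the total batch size and total step count from the listed numbers and evaluates a linear learning-rate schedule with warmup. It is not taken from the training script: the names `linear_schedule_lr`, `TOTAL_BATCH`, and `TOTAL_STEPS` are hypothetical, and the schedule shape assumes that `lr_scheduler_type: linear` means linear warmup from zero followed by linear decay to zero at the final step.

```python
# Values copied from the hyperparameter list and results table in this card.
BASE_LR = 2e-4           # learning_rate
PER_DEVICE_BATCH = 128   # train_batch_size
NUM_DEVICES = 8          # num_devices
WARMUP_STEPS = 1000      # lr_scheduler_warmup_steps
STEPS_PER_EPOCH = 216    # step count reported at epoch 1.0
NUM_EPOCHS = 30          # num_epochs

# Derived quantities; both match the card (1024 and 6480).
TOTAL_BATCH = PER_DEVICE_BATCH * NUM_DEVICES
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_schedule_lr(step: int) -> float:
    """Learning rate at an optimizer step (hypothetical helper).

    Assumes linear warmup from 0 to BASE_LR over WARMUP_STEPS,
    then linear decay to 0 at TOTAL_STEPS.
    """
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("total batch size:", TOTAL_BATCH)
    print("total steps:", TOTAL_STEPS)
    for step in (0, 500, 1000, 3240, 6480):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at step 1000, which falls within the fifth epoch, and reaches zero at step 6480.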