# enlm-roberta-130
This model is a fine-tuned version of [manirai91/enlm-roberta-final](https://huggingface.co/manirai91/enlm-roberta-final) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.4113
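If this value is the masked-language-modelling cross-entropy, it corresponds to a perplexity of roughly exp(1.4113) ≈ 4.10. Since the card does not state an intended use, the snippet below is only a minimal inference sketch: it assumes the checkpoint is a masked-language model hosted on the Hub under the id `manirai91/enlm-roberta-130` (inferred from the card title, not confirmed) and that its tokenizer defines a RoBERTa-style mask token.

```python
from transformers import pipeline

# Assumed repo id; adjust to wherever this checkpoint is actually hosted.
MODEL_ID = "manirai91/enlm-roberta-130"

# Load the fine-tuned checkpoint as a fill-mask (masked LM) pipeline.
fill_mask = pipeline("fill-mask", model=MODEL_ID)

# Use the tokenizer's own mask token rather than hard-coding "<mask>".
masked_sentence = f"The capital of Nepal is {fill_mask.tokenizer.mask_token}."

# Print the top predictions with their scores.
for prediction in fill_mask(masked_sentence, top_k=5):
    print(f"{prediction['token_str']!r}  score={prediction['score']:.3f}")
```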
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 128
- total_train_batch_size: 8192
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
- lr_scheduler_type: polynomial
- num_epochs: 20
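As a reproduction aid, here is a minimal sketch of how the listed values map onto `transformers.TrainingArguments`. The training script, dataset, and launch command are not given in this card, so the output directory and anything not in the list above are assumptions:

```python
from transformers import TrainingArguments

# Effective train batch size: 16 per device x 4 GPUs x 128 accumulation steps = 8192.
# Effective eval batch size:  16 per device x 4 GPUs = 64.
training_args = TrainingArguments(
    output_dir="enlm-roberta-130",   # assumed; not stated in the card
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=128,
    num_train_epochs=20,
    lr_scheduler_type="polynomial",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    seed=42,
)
```

The device count (4) comes from the distributed launcher rather than from `TrainingArguments`; with PyTorch 1.11 a multi-GPU run would typically be started with `torchrun --nproc_per_node=4 train.py ...` (script name assumed).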
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
1.5183 | 0.34 | 160 | 1.4159 |
1.5188 | 0.69 | 320 | 1.4158 |
1.5205 | 1.03 | 480 | 1.4153 |
1.5213 | 1.37 | 640 | 1.4162 |
1.5195 | 1.72 | 800 | 1.4168 |
1.5194 | 2.06 | 960 | 1.4150 |
1.5182 | 2.4 | 1120 | 1.4142 |
1.5182 | 2.75 | 1280 | 1.4131 |
1.5177 | 3.09 | 1440 | 1.4167 |
1.5201 | 3.43 | 1600 | 1.4156 |
1.5173 | 3.78 | 1760 | 1.4111 |
1.52 | 4.12 | 1920 | 1.4117 |
1.5184 | 4.46 | 2080 | 1.4151 |
1.5198 | 4.81 | 2240 | 1.4097 |
1.5202 | 5.15 | 2400 | 1.4162 |
1.5166 | 5.49 | 2560 | 1.4130 |
1.5184 | 5.84 | 2720 | 1.4139 |
1.5174 | 6.18 | 2880 | 1.4128 |
1.5161 | 6.52 | 3040 | 1.4126 |
1.5175 | 6.87 | 3200 | 1.4095 |
1.5169 | 7.21 | 3360 | 1.4118 |
1.516 | 7.55 | 3520 | 1.4113 |
1.5182 | 7.9 | 3680 | 1.4097 |
1.5195 | 8.24 | 3840 | 1.4118 |
1.5187 | 8.26 | 4000 | 1.4119 |
1.5149 | 8.6 | 4160 | 1.4133 |
1.5183 | 8.94 | 4320 | 1.4097 |
1.5192 | 9.29 | 4480 | 1.4101 |
1.5191 | 9.63 | 4640 | 1.4146 |
1.5192 | 9.97 | 4800 | 1.4165 |
1.5164 | 10.32 | 4960 | 1.4119 |
1.5235 | 10.66 | 5120 | 1.4089 |
1.6571 | 11.0 | 5280 | 1.4121 |
1.5184 | 11.35 | 5440 | 1.4102 |
1.5185 | 11.69 | 5600 | 1.4111 |
1.5172 | 12.03 | 5760 | 1.4142 |
1.5189 | 12.38 | 5920 | 1.4129 |
1.5147 | 12.72 | 6080 | 1.4089 |
1.5177 | 13.06 | 6240 | 1.4098 |
1.5164 | 13.41 | 6400 | 1.4097 |
1.5188 | 13.75 | 6560 | 1.4109 |
1.5158 | 14.09 | 6720 | 1.4134 |
1.5134 | 14.44 | 6880 | 1.4091 |
1.5167 | 14.78 | 7040 | 1.4089 |
1.5163 | 15.12 | 7200 | 1.4140 |
1.5172 | 15.47 | 7360 | 1.4083 |
1.5153 | 15.81 | 7520 | 1.4109 |
1.5164 | 16.15 | 7680 | 1.4093 |
1.5164 | 16.17 | 7840 | 1.4108 |
1.515 | 16.51 | 8000 | 1.4102 |
1.5164 | 16.86 | 8160 | 1.4090 |
1.5163 | 17.2 | 8320 | 1.4110 |
1.5142 | 17.54 | 8480 | 1.4122 |
1.5166 | 17.89 | 8640 | 1.4092 |
1.5172 | 18.23 | 8800 | 1.4058 |
1.5153 | 18.57 | 8960 | 1.4112 |
1.517 | 18.92 | 9120 | 1.4098 |
1.5163 | 19.26 | 9280 | 1.4113 |
### Framework versions
- Transformers 4.24.0
- Pytorch 1.11.0
- Datasets 2.7.0
- Tokenizers 0.13.2