tanshi-models-224-ep10
This model is a fine-tuned version of YxBxRyXJx/tanshi-models-224 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.8686
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08, no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
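As an illustration of what `lr_scheduler_type: linear` implies for the learning rate, here is a minimal sketch in plain Python. It assumes zero warmup steps (the card lists none) and 740 total optimizer steps (the last step in the results table). The function name `linear_lr` and its parameters are hypothetical and are not part of the training code.

```python
def linear_lr(step, base_lr=1e-4, total_steps=740, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero.

    base_lr comes from the card; total_steps and warmup_steps are assumptions.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, the midpoint, and the end of training.
for step in (0, 370, 740):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.0001, is roughly half that at step 370, and reaches zero at step 740.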
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
No log | 0.1351 | 20 | 1.0816 |
No log | 0.2703 | 40 | 1.1103 |
1.2252 | 0.4054 | 60 | 1.0766 |
1.2252 | 0.5405 | 80 | 1.1650 |
1.2016 | 0.6757 | 100 | 1.0430 |
1.2016 | 0.8108 | 120 | 1.0198 |
1.2016 | 0.9459 | 140 | 1.0058 |
1.1913 | 1.0811 | 160 | 1.0173 |
1.1913 | 1.2162 | 180 | 1.0246 |
1.1996 | 1.3514 | 200 | 1.0407 |
1.1996 | 1.4865 | 220 | 1.0185 |
1.1996 | 1.6216 | 240 | 0.9941 |
1.1931 | 1.7568 | 260 | 0.9983 |
1.1931 | 1.8919 | 280 | 1.0098 |
1.1227 | 2.0270 | 300 | 1.0127 |
1.1227 | 2.1622 | 320 | 0.9726 |
1.1227 | 2.2973 | 340 | 0.9944 |
1.1479 | 2.4324 | 360 | 1.0146 |
1.1479 | 2.5676 | 380 | 0.9614 |
1.0578 | 2.7027 | 400 | 0.9794 |
1.0578 | 2.8378 | 420 | 0.9699 |
1.0578 | 2.9730 | 440 | 0.9782 |
1.1325 | 3.1081 | 460 | 0.9551 |
1.1325 | 3.2432 | 480 | 0.9714 |
1.0768 | 3.3784 | 500 | 0.9524 |
1.0768 | 3.5135 | 520 | 0.9540 |
1.0768 | 3.6486 | 540 | 0.9115 |
1.067 | 3.7838 | 560 | 0.8934 |
1.067 | 3.9189 | 580 | 0.9231 |
1.0786 | 4.0541 | 600 | 0.9242 |
1.0786 | 4.1892 | 620 | 0.8910 |
1.0786 | 4.3243 | 640 | 0.8810 |
1.0959 | 4.4595 | 660 | 0.8875 |
1.0959 | 4.5946 | 680 | 0.8753 |
1.0206 | 4.7297 | 700 | 0.8724 |
1.0206 | 4.8649 | 720 | 0.8699 |
1.0206 | 5.0 | 740 | 0.8686 |
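
The table also allows a rough estimate of the training set size, which the card does not otherwise report. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept. None of these is stated in the card, so the result is only an estimate.

```python
# Values read from the card: the final row of the table and train_batch_size.
final_step = 740
final_epoch = 5.0
train_batch_size = 16

# Optimizer steps per epoch.
steps_per_epoch = int(final_step / final_epoch)

# Cross-check against the first table row (step 20, epoch 0.1351).
epoch_at_step_20 = round(20 / steps_per_epoch, 4)

# With the last partial batch kept, the sample count per epoch lies in this range.
max_samples = steps_per_epoch * train_batch_size
min_samples = (steps_per_epoch - 1) * train_batch_size + 1

print(steps_per_epoch, epoch_at_step_20, min_samples, max_samples)
```

Under these assumptions there are 148 steps per epoch, which puts the training set between 2353 and 2368 examples.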
Framework versions
- Transformers 4.46.1
- Pytorch 2.5.1+cu124
- Datasets 3.0.2
- Tokenizers 0.20.1