results_modified
This model is a fine-tuned version of IlyaGusev/saiga_llama3_8b on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.4969
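For a rough sense of scale, the evaluation loss can be converted to a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning but is not stated in this card. `perplexity_from_loss` is a hypothetical helper name, not part of any library.

```python
import math


def perplexity_from_loss(loss: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(loss)


eval_loss = 1.4969  # final evaluation loss reported above
print(f"perplexity ~ {perplexity_from_loss(eval_loss):.2f}")
```

Under that assumption the final evaluation loss corresponds to a perplexity of about 4.47.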
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
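To illustrate how the linear scheduler and warmup ratio listed above interact, here is a minimal pure-Python sketch of a linear warmup followed by a linear decay. It takes the total of 900 optimizer steps from the last row of the training results table. The rounding of the warmup length with `math.ceil` and the helper names `warmup_steps` and `linear_lr` are assumptions made for illustration. This is not the training code used for this model.

```python
import math


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of warmup steps; rounding up is an assumption of this sketch."""
    return math.ceil(total_steps * warmup_ratio)


def linear_lr(step: int, peak_lr: float = 1e-5, total_steps: int = 900,
              warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step`: linear ramp from 0 to peak_lr, then linear decay to 0."""
    warmup = warmup_steps(total_steps, warmup_ratio)
    if step < warmup:
        # Warmup phase: scale the peak rate by the fraction of warmup completed.
        return peak_lr * step / max(1, warmup)
    # Decay phase: scale by the fraction of post-warmup steps still remaining.
    remaining = (total_steps - step) / max(1, total_steps - warmup)
    return peak_lr * max(0.0, remaining)


for step in (0, 45, 90, 495, 900):
    print(step, linear_lr(step))
```

With these values the rate rises during the first tenth of training, peaks at 1e-05, and reaches zero at step 900.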
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.8056 | 0.0278 | 25 | 1.7350 |
1.7606 | 0.0556 | 50 | 1.7129 |
1.7263 | 0.0833 | 75 | 1.6759 |
1.6579 | 0.1111 | 100 | 1.6324 |
1.7242 | 0.1389 | 125 | 1.5937 |
1.551 | 0.1667 | 150 | 1.5535 |
1.4649 | 0.1944 | 175 | 1.5301 |
1.5771 | 0.2222 | 200 | 1.5207 |
1.5372 | 0.25 | 225 | 1.5158 |
1.5232 | 0.2778 | 250 | 1.5124 |
1.5376 | 0.3056 | 275 | 1.5100 |
1.444 | 0.3333 | 300 | 1.5080 |
1.5023 | 0.3611 | 325 | 1.5064 |
1.4652 | 0.3889 | 350 | 1.5051 |
1.5141 | 0.4167 | 375 | 1.5041 |
1.529 | 0.4444 | 400 | 1.5033 |
1.5313 | 0.4722 | 425 | 1.5023 |
1.4563 | 0.5 | 450 | 1.5013 |
1.5887 | 0.5278 | 475 | 1.5006 |
1.6021 | 0.5556 | 500 | 1.4999 |
1.4809 | 0.5833 | 525 | 1.4993 |
1.5844 | 0.6111 | 550 | 1.4989 |
1.4749 | 0.6389 | 575 | 1.4984 |
1.5289 | 0.6667 | 600 | 1.4981 |
1.5212 | 0.6944 | 625 | 1.4978 |
1.451 | 0.7222 | 650 | 1.4976 |
1.5053 | 0.75 | 675 | 1.4974 |
1.348 | 0.7778 | 700 | 1.4973 |
1.5542 | 0.8056 | 725 | 1.4973 |
1.452 | 0.8333 | 750 | 1.4970 |
1.4509 | 0.8611 | 775 | 1.4970 |
1.5192 | 0.8889 | 800 | 1.4969 |
1.5105 | 0.9167 | 825 | 1.4970 |
1.5116 | 0.9444 | 850 | 1.4970 |
1.4249 | 0.9722 | 875 | 1.4969 |
1.5206 | 1.0 | 900 | 1.4969 |
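The Epoch column in the table is the logged step divided by the total number of steps, rounded to four decimal places. The short sketch below reproduces that arithmetic. The constants are read off the table, and `epoch_fraction` is a hypothetical helper name.

```python
TOTAL_STEPS = 900  # last step in the table, logged at epoch 1.0
EVAL_EVERY = 25    # spacing between logged steps in the table


def epoch_fraction(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Fraction of the single epoch completed at `step`, rounded as in the table."""
    return round(step / total_steps, 4)


# Print the epoch fraction for the first four logged steps.
for step in range(EVAL_EVERY, 4 * EVAL_EVERY + 1, EVAL_EVERY):
    print(step, epoch_fraction(step))
```

With train_batch_size 1 and no gradient accumulation listed, 900 steps in one epoch would correspond to roughly 900 training examples on a single device. The card does not confirm this.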
Framework versions
- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1