---
license: apache-2.0
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
base_model: mistralai/Mistral-7B-Instruct-v0.1
model-index:
- name: mistral-7b-instruct-autextification2024
  results: []
---

# mistral-7b-instruct-autextification2024

This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.8230

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 500

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.7076        | 0.0   | 10   | 2.0916          |
| 1.54          | 0.01  | 20   | 2.0382          |
| 2.0394        | 0.01  | 30   | 1.9987          |
| 2.3388        | 0.01  | 40   | 1.9706          |
| 3.0378        | 0.02  | 50   | 1.9866          |
| 1.511         | 0.02  | 60   | 1.9453          |
| 1.6499        | 0.02  | 70   | 1.9309          |
| 1.9693        | 0.03  | 80   | 1.9168          |
| 2.2389        | 0.03  | 90   | 1.9169          |
| 2.7812        | 0.03  | 100  | 1.9367          |
| 1.542         | 0.04  | 110  | 1.9202          |
| 1.574         | 0.04  | 120  | 1.9088          |
| 1.9916        | 0.04  | 130  | 1.8989          |
| 2.081         | 0.05  | 140  | 1.8862          |
| 2.768         | 0.05  | 150  | 1.9108          |
| 1.4699        | 0.05  | 160  | 1.8984          |
| 1.5366        | 0.06  | 170  | 1.8877          |
| 2.0133        | 0.06  | 180  | 1.8812          |
| 2.2186        | 0.06  | 190  | 1.8795          |
| 2.7003        | 0.07  | 200  | 1.8882          |
| 1.5169        | 0.07  | 210  | 1.8720          |
| 1.5444        | 0.07  | 220  | 1.8801          |
| 1.726         | 0.08  | 230  | 1.8732          |
| 2.0348        | 0.08  | 240  | 1.8657          |
| 2.6121        | 0.09  | 250  | 1.8702          |
| 1.5258        | 0.09  | 260  | 1.8655          |
| 1.5423        | 0.09  | 270  | 1.8733          |
| 1.8095        | 0.1   | 280  | 1.8505          |
| 2.0462        | 0.1   | 290  | 1.8455          |
| 2.5442        | 0.1   | 300  | 1.8552          |
| 1.4565        | 0.11  | 310  | 1.8586          |
| 1.4278        | 0.11  | 320  | 1.8491          |
| 1.7626        | 0.11  | 330  | 1.8358          |
| 1.9469        | 0.12  | 340  | 1.8427          |
| 2.5378        | 0.12  | 350  | 1.8580          |
| 1.4248        | 0.12  | 360  | 1.8499          |
| 1.586         | 0.13  | 370  | 1.8378          |
| 1.9637        | 0.13  | 380  | 1.8311          |
| 1.9733        | 0.13  | 390  | 1.8352          |
| 2.6789        | 0.14  | 400  | 1.8543          |
| 1.4521        | 0.14  | 410  | 1.8411          |
| 1.4683        | 0.14  | 420  | 1.8428          |
| 1.862         | 0.15  | 430  | 1.8331          |
| 2.0159        | 0.15  | 440  | 1.8304          |
| 2.5851        | 0.15  | 450  | 1.8385          |
| 1.4911        | 0.16  | 460  | 1.8309          |
| 1.5463        | 0.16  | 470  | 1.8262          |
| 1.8454        | 0.16  | 480  | 1.8137          |
| 2.0086        | 0.17  | 490  | 1.8143          |
| 2.6965        | 0.17  | 500  | 1.8230          |

### Framework versions

- PEFT 0.10.0
- Transformers 4.39.1
- PyTorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
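
## Reproducing training (sketch)

The training script itself is not included in this card. As a rough guide, the hyperparameters listed above map onto a TRL `SFTTrainer` run as sketched below. Everything not recorded in the card is an assumption: the PEFT method is taken to be LoRA (the card only says PEFT), the LoRA settings and sequence length are placeholders, and the dataset files are hypothetical since the training data is undocumented.

```python
# Sketch only. Assumptions not recorded in the card: LoRA as the PEFT
# method, the LoRA hyperparameters, the sequence length, and the dataset
# (placeholder JSONL files with a "text" column).
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

base_model = "mistralai/Mistral-7B-Instruct-v0.1"
model = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default

# Placeholder data files; the actual training/evaluation data is undocumented.
data = load_dataset(
    "json", data_files={"train": "train.jsonl", "validation": "valid.jsonl"}
)

peft_config = LoraConfig(  # hypothetical LoRA settings, not taken from the card
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"
)

args = TrainingArguments(
    output_dir="mistral-7b-instruct-autextification2024",
    learning_rate=2e-4,             # learning_rate: 0.0002
    per_device_train_batch_size=4,  # train_batch_size: 4
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    gradient_accumulation_steps=4,  # total_train_batch_size: 4 x 4 = 16
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    max_steps=500,                  # training_steps: 500
    seed=42,
    evaluation_strategy="steps",
    eval_steps=10,                  # matches the 10-step cadence in the results table
    logging_steps=10,
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=args,
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    peft_config=peft_config,
    dataset_text_field="text",      # assumes a "text" column in the dataset
    max_seq_length=1024,            # assumption; not recorded in the card
)
trainer.train()
```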
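
## Loading the adapter for inference (sketch)

Since this is a PEFT checkpoint, inference loads the base model first and attaches the adapter on top. A minimal sketch, assuming the adapter lives at a hub path or local directory named after this model (substitute the real repository id):

```python
# Sketch only. The adapter path below is a placeholder; substitute the
# actual hub repository or a local directory containing the adapter.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "mistralai/Mistral-7B-Instruct-v0.1"
adapter_path = "mistral-7b-instruct-autextification2024"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_path)  # attach fine-tuned adapter
model.eval()

# Mistral-Instruct uses an [INST]-style chat format; apply_chat_template builds it.
messages = [{"role": "user", "content": "Your prompt here."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

If the adapter is indeed a LoRA, `model.merge_and_unload()` can fold the adapter weights into the base model for deployment without the PEFT wrapper.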