---
license: apache-2.0
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
base_model: mistralai/Mistral-7B-v0.1
datasets:
- generator
model-index:
- name: sexed-mistral-7b-sft-lora-v3
  results: []
---

# sexed-mistral-7b-sft-lora-v3

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the generator dataset ("generator" is the placeholder name Transformers records when the training data is supplied through a Python generator). It achieves the following results on the evaluation set:

- Loss: 0.2857
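
Because this is a PEFT LoRA adapter rather than a full checkpoint, it is loaded on top of the base model. Below is a minimal inference sketch; the adapter repo id `ben-wycliff/sexed-mistral-7b-sft-lora-v3` is inferred from this card's author and model name and may need adjusting.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "mistralai/Mistral-7B-v0.1"
# Assumed adapter repo id, inferred from this card; adjust if it differs.
adapter_id = "ben-wycliff/sexed-mistral-7b-sft-lora-v3"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base = AutoModelForCausalLM.from_pretrained(
    base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("Hello,", return_tensors="pt").to(base.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```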

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged reproduction sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 20
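
Given the `trl` and `sft` tags, these settings translate roughly into an `SFTTrainer` run as sketched below. This is a reconstruction under stated assumptions, not the original training script: the LoRA configuration, sequence length, and the contents of the generator dataset are not recorded on this card and are placeholders here.

```python
from datasets import Dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# Placeholder data: the actual generator-backed training set is not published.
train_dataset = Dataset.from_dict({"text": ["example training sample"]})

# LoRA settings are assumptions; the card does not record them.
peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"
)

args = TrainingArguments(
    output_dir="sexed-mistral-7b-sft-lora-v3",
    learning_rate=1e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=16,  # 4 x 16 = total train batch size of 64
    seed=42,
    lr_scheduler_type="cosine",
    num_train_epochs=20,
    # The default AdamW optimizer already uses betas=(0.9, 0.999), eps=1e-08.
)

trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",  # loaded internally as a causal LM
    args=args,
    train_dataset=train_dataset,
    dataset_text_field="text",
    peft_config=peft_config,
)
trainer.train()
```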

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 0.4906        | 0.9902  | 38   | 0.4776          |
| 0.3966        | 1.9805  | 76   | 0.4122          |
| 0.3648        | 2.9967  | 115  | 0.3658          |
| 0.3405        | 3.9870  | 153  | 0.3372          |
| 0.3153        | 4.9772  | 191  | 0.3180          |
| 0.2996        | 5.9935  | 230  | 0.3043          |
| 0.2987        | 6.9837  | 268  | 0.2969          |
| 0.2888        | 8.0     | 307  | 0.2923          |
| 0.2899        | 8.9902  | 345  | 0.2898          |
| 0.2873        | 9.9805  | 383  | 0.2883          |
| 0.2831        | 10.9967 | 422  | 0.2872          |
| 0.2773        | 11.9870 | 460  | 0.2866          |
| 0.2814        | 12.9772 | 498  | 0.2862          |
| 0.2781        | 13.9935 | 537  | 0.2860          |
| 0.2845        | 14.9837 | 575  | 0.2858          |
| 0.29          | 16.0    | 614  | 0.2858          |
| 0.2799        | 16.9902 | 652  | 0.2857          |
| 0.2825        | 17.9805 | 690  | 0.2857          |
| 0.2829        | 18.9967 | 729  | 0.2857          |
| 0.2801        | 19.8046 | 760  | 0.2857          |

### Framework versions

- PEFT 0.10.0
- Transformers 4.40.2
- Pytorch 2.1.1+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
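
To recreate this environment, the listed versions can be pinned at install time; the CUDA 12.1 PyTorch build matches the `+cu121` tag above. The `trl` version is not recorded on this card, so it is left unpinned.

```shell
pip install peft==0.10.0 transformers==4.40.2 datasets==2.19.1 tokenizers==0.19.1 trl
pip install torch==2.1.1 --index-url https://download.pytorch.org/whl/cu121
```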