---
license: apache-2.0
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: mistralai/Mixtral-8x7B-v0.1
datasets:
  - generator
model-index:
  - name: Mixtral_Alpace_v2_NIKI
    results: []
---

Mixtral_Alpace_v2_NIKI

This model is a PEFT adapter fine-tuned from mistralai/Mixtral-8x7B-v0.1 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1688
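
Because this repository holds only a PEFT adapter, the base weights of mistralai/Mixtral-8x7B-v0.1 must be loaded first and the adapter attached on top. The snippet below is a minimal loading sketch; the repository ID, the bfloat16 dtype, and the Alpaca-style prompt are assumptions rather than details stated on this card.

```python
# Minimal loading sketch (assumptions: the adapter is published as
# "vanherzog/Mixtral_Alpace_v2_NIKI" and you have enough GPU memory for the
# base model; Mixtral-8x7B typically needs multiple GPUs or quantization).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mixtral-8x7B-v0.1"
adapter_id = "vanherzog/Mixtral_Alpace_v2_NIKI"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 to keep memory manageable
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Alpaca-style prompt is an assumption based on the model name ("Alpace").
prompt = "### Instruction:\nExplain what a mixture-of-experts model is.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```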

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2.5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.03
  • training_steps: 300
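
The training script itself is not included on this card, so the following is only a sketch of how these hyperparameters could be wired into a TRL SFTTrainer run. The LoRA settings, dataset file, text column, and sequence length are assumptions; the card's lr_scheduler_warmup_steps value of 0.03 looks like a ratio and is passed as warmup_ratio here. Depending on the installed trl version, dataset_text_field and max_seq_length may need to move into an SFTConfig instead.

```python
# Hedged training sketch: hyperparameters come from the card above, everything
# else (LoRA config, dataset, sequence length) is an assumption.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

train_dataset = load_dataset("json", data_files="train.json", split="train")  # placeholder dataset

peft_config = LoraConfig(  # assumed LoRA settings; not reported on the card
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = TrainingArguments(
    output_dir="Mixtral_Alpace_v2_NIKI",
    learning_rate=2.5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,    # card lists lr_scheduler_warmup_steps: 0.03, read here as a ratio
    max_steps=300,
    optim="adamw_torch",  # Adam with betas=(0.9, 0.999), eps=1e-8 (transformers defaults)
    bf16=True,            # assumption
)

trainer = SFTTrainer(
    model="mistralai/Mixtral-8x7B-v0.1",
    args=args,
    train_dataset=train_dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumed column name
    max_seq_length=2048,        # assumption
)
trainer.train()
```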

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 1.3725        | 0.0606 | 10   | 1.3384          |
| 1.339         | 0.1212 | 20   | 1.3260          |
| 1.3448        | 0.1818 | 30   | 1.3121          |
| 1.2777        | 0.2424 | 40   | 1.2984          |
| 1.3067        | 0.3030 | 50   | 1.2853          |
| 1.2674        | 0.3636 | 60   | 1.2723          |
| 1.2842        | 0.4242 | 70   | 1.2610          |
| 1.2835        | 0.4848 | 80   | 1.2505          |
| 1.2688        | 0.5455 | 90   | 1.2406          |
| 1.2892        | 0.6061 | 100  | 1.2315          |
| 1.2565        | 0.6667 | 110  | 1.2236          |
| 1.2145        | 0.7273 | 120  | 1.2163          |
| 1.2297        | 0.7879 | 130  | 1.2101          |
| 1.2406        | 0.8485 | 140  | 1.2042          |
| 1.2146        | 0.9091 | 150  | 1.1986          |
| 1.2386        | 0.9697 | 160  | 1.1940          |
| 1.1929        | 1.0303 | 170  | 1.1899          |
| 1.2036        | 1.0909 | 180  | 1.1869          |
| 1.181         | 1.1515 | 190  | 1.1837          |
| 1.201         | 1.2121 | 200  | 1.1812          |
| 1.1965        | 1.2727 | 210  | 1.1786          |
| 1.2084        | 1.3333 | 220  | 1.1765          |
| 1.2097        | 1.3939 | 230  | 1.1746          |
| 1.176         | 1.4545 | 240  | 1.1727          |
| 1.1757        | 1.5152 | 250  | 1.1715          |
| 1.1977        | 1.5758 | 260  | 1.1705          |
| 1.1686        | 1.6364 | 270  | 1.1701          |
| 1.1679        | 1.6970 | 280  | 1.1694          |
| 1.1779        | 1.7576 | 290  | 1.1690          |
| 1.179         | 1.8182 | 300  | 1.1688          |

Framework versions

  • PEFT 0.10.0
  • Transformers 4.40.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1
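
A small sketch for checking an installed environment against the versions listed above (a name-for-name comparison; nearby versions will usually work as well):

```python
# Compare installed package versions with those reported on this card.
import datasets, peft, tokenizers, torch, transformers

expected = {
    "peft": "0.10.0",
    "transformers": "4.40.1",
    "torch": "2.2.1+cu121",
    "datasets": "2.19.0",
    "tokenizers": "0.19.1",
}
installed = {
    "peft": peft.__version__,
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have == want else "differs"
    print(f"{name}: installed {have}, card lists {want} ({status})")
```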