---
license: apache-2.0
base_model: AmberYifan/mistral-safe-sft-full
tags:
- generated_from_trainer
model-index:
- name: mistral-sft4epoch-spin-v
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# mistral-sft4epoch-spin-v
This model is a fine-tuned version of [AmberYifan/mistral-safe-sft-full](https://huggingface.co/AmberYifan/mistral-safe-sft-full) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2284
- Rewards/real: 10.1344
- Rewards/generated: -5.3158
- Rewards/accuracies: 1.0
- Rewards/margins: 15.4503
- Logps/generated: -131.8755
- Logps/real: -111.3366
- Logits/generated: -2.7694
- Logits/real: -2.7499
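As a quick sanity check, the reported margin appears to be simply the gap between the reward on real data and on generated data, as is typical for SPIN/DPO-style logging; a minimal check using the values listed above:

```python
# Sanity check: Rewards/margins should equal Rewards/real - Rewards/generated
# (values copied from the evaluation summary above).
rewards_real = 10.1344
rewards_generated = -5.3158

margin = rewards_real - rewards_generated
print(f"{margin:.4f}")  # 15.4502, matching Rewards/margins up to rounding
```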
## Model description
More information needed
## Intended uses & limitations
More information needed
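Pending details from the authors, a minimal loading sketch with `transformers` (assuming the checkpoint is published as `AmberYifan/mistral-sft4epoch-spin-v`, a repo id inferred from this card's name, and that it loads like any Mistral-family causal LM):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AmberYifan/mistral-sft4epoch-spin-v"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Explain what SPIN fine-tuning is.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```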
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
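For reference, a hedged sketch of the configuration above expressed as `transformers.TrainingArguments` (the actual trainer class used for SPIN-style training is not stated in this card; the total batch size of 32 follows from 8 per device across 4 GPUs with no gradient accumulation):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-sft4epoch-spin-v",
    learning_rate=5e-7,
    per_device_train_batch_size=8,   # x 4 GPUs = total_train_batch_size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total_eval_batch_size 32
    seed=42,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```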
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
|:-------------:|:------:|:----:|:---------------:|:------------:|:-----------------:|:------------------:|:---------------:|:---------------:|:----------:|:----------------:|:-----------:|
| 0.278 | 0.0640 | 100 | 0.2703 | 8.6366 | -3.4251 | 0.9922 | 12.0617 | -112.9675 | -126.3148 | -2.9055 | -2.8963 |
| 0.2283 | 0.1280 | 200 | 0.2438 | 9.5699 | -4.6271 | 0.9922 | 14.1970 | -124.9880 | -116.9817 | -2.8308 | -2.8192 |
| 0.2284 | 0.1919 | 300 | 0.2384 | 9.7849 | -5.0781 | 0.9922 | 14.8630 | -129.4981 | -114.8321 | -2.8396 | -2.8204 |
| 0.2154 | 0.2559 | 400 | 0.2361 | 9.8971 | -4.8914 | 0.9922 | 14.7885 | -127.6311 | -113.7101 | -2.8303 | -2.8085 |
| 0.2368 | 0.3199 | 500 | 0.2351 | 9.9762 | -5.0488 | 0.9922 | 15.0249 | -129.2045 | -112.9195 | -2.8228 | -2.8083 |
| 0.2065 | 0.3839 | 600 | 0.2346 | 10.0426 | -4.9610 | 0.9922 | 15.0035 | -128.3267 | -112.2554 | -2.8204 | -2.8086 |
| 0.2244 | 0.4479 | 700 | 0.2317 | 10.0417 | -5.1299 | 1.0 | 15.1716 | -130.0162 | -112.2640 | -2.8203 | -2.8076 |
| 0.2161 | 0.5118 | 800 | 0.2297 | 10.0737 | -5.0565 | 1.0 | 15.1303 | -129.2824 | -111.9440 | -2.8437 | -2.8337 |
| 0.2127 | 0.5758 | 900 | 0.2302 | 10.0913 | -5.0905 | 1.0 | 15.1818 | -129.6217 | -111.7683 | -2.8251 | -2.8150 |
| 0.2017 | 0.6398 | 1000 | 0.2298 | 10.1245 | -5.2627 | 1.0 | 15.3872 | -131.3441 | -111.4362 | -2.7955 | -2.7831 |
| 0.2152 | 0.7038 | 1100 | 0.2297 | 10.0889 | -5.3503 | 1.0 | 15.4392 | -132.2204 | -111.7925 | -2.7790 | -2.7609 |
| 0.2074 | 0.7678 | 1200 | 0.2298 | 10.1143 | -5.3204 | 1.0 | 15.4346 | -131.9209 | -111.5385 | -2.7919 | -2.7734 |
| 0.2107 | 0.8317 | 1300 | 0.2287 | 10.1349 | -5.3137 | 1.0 | 15.4486 | -131.8539 | -111.3324 | -2.7734 | -2.7524 |
| 0.1947 | 0.8957 | 1400 | 0.2288 | 10.1265 | -5.3252 | 1.0 | 15.4517 | -131.9686 | -111.4160 | -2.7803 | -2.7613 |
| 0.2056 | 0.9597 | 1500 | 0.2284 | 10.1344 | -5.3158 | 1.0 | 15.4503 | -131.8755 | -111.3366 | -2.7694 | -2.7499 |
### Framework versions
- Transformers 4.43.3
- Pytorch 2.2.2+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1