---
license: apache-2.0
base_model: alignment-handbook/zephyr-7b-sft-full
tags:
- alignment-handbook
- trl
- dpo
- generated_from_trainer
datasets:
- HuggingFaceH4/ultrafeedback_binarized
- AmberYifan/safetyQA_DPO
model-index:
- name: zephyr-7b-sft-safeDPO
  results: []
---

# zephyr-7b-sft-safeDPO

This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the HuggingFaceH4/ultrafeedback_binarized and the AmberYifan/safetyQA_DPO datasets.
It achieves the following results on the evaluation set:
- Loss: 0.5427
- Rewards/chosen: -2.2653
- Rewards/rejected: -2.9944
- Rewards/accuracies: 0.7197
- Rewards/margins: 0.7291
- Logps/rejected: -469.7640
- Logps/chosen: -389.2892
- Logits/rejected: -1.7787
- Logits/chosen: -1.7892
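
For reference, these reward metrics follow the standard DPO convention (assuming the usual `trl` bookkeeping): the implicit reward of a completion is the beta-scaled log-probability ratio between the trained policy and the frozen SFT reference, and the margin is the gap between the chosen and rejected rewards. The beta used for this run is not recorded in this card.

```latex
% Implicit DPO reward and the reported margin (beta not recorded in this card)
r_\theta(x, y) = \beta \left( \log \pi_\theta(y \mid x) - \log \pi_{\mathrm{ref}}(y \mid x) \right)
\qquad
\mathrm{margin} = r_\theta(x, y_{\mathrm{chosen}}) - r_\theta(x, y_{\mathrm{rejected}})
```

Rewards/accuracies is then the fraction of evaluation pairs for which the chosen reward exceeds the rejected reward, and Logps/chosen and Logps/rejected are the policy log-probabilities of the respective completions (summed over tokens, averaged over the evaluation set).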

## Model description

zephyr-7b-sft-safeDPO is a 7B-parameter chat model obtained by further training [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) with Direct Preference Optimization (DPO) via the `trl` library. The preference data combines general-purpose pairs from HuggingFaceH4/ultrafeedback_binarized with safety-oriented pairs from AmberYifan/safetyQA_DPO, as reflected in the model name.
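
As a quick usage sketch, the model can be loaded like any other Transformers causal LM and prompted through the chat template inherited from the Zephyr SFT base. The hub repository id below is an assumption (taken from the dataset namespace), not something recorded in this card; adjust it to the actual path.

```python
# Hedged usage sketch; the hub id is an assumption, not confirmed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AmberYifan/zephyr-7b-sft-safeDPO"  # assumed hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "How should I store household chemicals safely?"},
]
# apply_chat_template renders the Zephyr-style prompt expected by the SFT base model
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```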

## Intended uses & limitations

More information needed

## Training and evaluation data

Training used DPO preference pairs (prompt, chosen response, rejected response) from HuggingFaceH4/ultrafeedback_binarized and AmberYifan/safetyQA_DPO. The metrics reported above were computed on the held-out evaluation split used by the trainer; no further details on preprocessing, mixing ratios, or filtering are recorded in this card.
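
For reference, both preference datasets are available on the Hugging Face Hub and can be inspected with `datasets`. The split names below are assumptions: `HuggingFaceH4/ultrafeedback_binarized` typically exposes `train_prefs`/`test_prefs`, while the splits of `AmberYifan/safetyQA_DPO` are not documented in this card.

```python
# Inspection sketch; split names are assumptions not recorded in this card.
from datasets import load_dataset

ultrafeedback = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")
safety_qa = load_dataset("AmberYifan/safetyQA_DPO")  # load all splits; names not documented here

print(ultrafeedback.column_names)  # DPO-style fields such as "prompt", "chosen", "rejected"
print(safety_qa)
```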

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
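
A minimal reproduction sketch of this configuration with `trl`'s `DPOTrainer` is shown below. Only the arguments listed above are taken from the card; the DPO beta, mixed-precision setting, data preparation, and exact `trl` version are not recorded here and are marked as assumptions.

```python
# Hedged reproduction sketch; beta, precision, and data preprocessing are assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "alignment-handbook/zephyr-7b-sft-full"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Hyperparameters from the list above; per-device batch size 8 on 4 GPUs with
# gradient accumulation 2 gives the effective train batch size of 64
# (multi-GPU launching via accelerate/torchrun is assumed and omitted here).
args = TrainingArguments(
    output_dir="zephyr-7b-sft-safeDPO",
    learning_rate=5e-7,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    bf16=True,  # assumption: the precision used is not stated in the card
)

# NOTE: ultrafeedback_binarized stores chosen/rejected as message lists; applying the
# chat template to flatten them into strings (and mixing in AmberYifan/safetyQA_DPO,
# as listed in the card) is omitted from this sketch.
train_dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

trainer = DPOTrainer(
    model,
    ref_model=None,   # trl clones the policy as the frozen reference when None
    args=args,
    beta=0.1,         # assumption: beta is not recorded in this card
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```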

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6809        | 0.06  | 100  | 0.6816          | -0.0896        | -0.1145          | 0.6090             | 0.0249          | -181.7760      | -171.7131    | -2.5505         | -2.5392       |
| 0.6002        | 0.12  | 200  | 0.5906          | -1.4710        | -1.9163          | 0.6844             | 0.4453          | -361.9523      | -309.8549    | -2.3566         | -2.3656       |
| 0.591         | 0.17  | 300  | 0.5809          | -2.0305        | -2.5280          | 0.6924             | 0.4975          | -423.1273      | -365.8047    | -2.1933         | -2.2068       |
| 0.5437        | 0.23  | 400  | 0.5684          | -1.8038        | -2.3666          | 0.7032             | 0.5628          | -406.9888      | -343.1347    | -1.8974         | -1.9247       |
| 0.5415        | 0.29  | 500  | 0.5648          | -2.4230        | -3.0650          | 0.7066             | 0.6419          | -476.8223      | -405.0594    | -1.7952         | -1.8126       |
| 0.564         | 0.35  | 600  | 0.5578          | -2.3499        | -2.9909          | 0.7192             | 0.6409          | -469.4118      | -397.7481    | -1.8836         | -1.8847       |
| 0.5769        | 0.4   | 700  | 0.5598          | -2.2030        | -2.8326          | 0.7032             | 0.6296          | -453.5823      | -383.0532    | -1.7859         | -1.7719       |
| 0.5598        | 0.46  | 800  | 0.5586          | -2.2442        | -2.8471          | 0.7163             | 0.6029          | -455.0379      | -387.1816    | -1.7087         | -1.7061       |
| 0.5374        | 0.52  | 900  | 0.5555          | -2.1983        | -2.8363          | 0.7152             | 0.6379          | -453.9529      | -382.5883    | -1.6598         | -1.6767       |
| 0.5036        | 0.58  | 1000 | 0.5499          | -2.2315        | -2.9217          | 0.7209             | 0.6902          | -462.5011      | -385.9115    | -1.7160         | -1.7254       |
| 0.5281        | 0.63  | 1100 | 0.5489          | -2.2855        | -2.9604          | 0.7237             | 0.6749          | -466.3712      | -391.3100    | -1.7504         | -1.7563       |
| 0.5067        | 0.69  | 1200 | 0.5448          | -2.3024        | -3.0075          | 0.7243             | 0.7051          | -471.0760      | -393.0004    | -1.7967         | -1.8047       |
| 0.5095        | 0.75  | 1300 | 0.5451          | -2.2081        | -2.9003          | 0.7186             | 0.6922          | -460.3614      | -383.5680    | -1.8238         | -1.8248       |
| 0.5265        | 0.81  | 1400 | 0.5436          | -2.2844        | -2.9947          | 0.7215             | 0.7103          | -469.7991      | -391.1993    | -1.7998         | -1.8071       |
| 0.4844        | 0.86  | 1500 | 0.5433          | -2.2422        | -2.9533          | 0.7197             | 0.7112          | -465.6614      | -386.9761    | -1.7888         | -1.7993       |
| 0.5612        | 0.92  | 1600 | 0.5427          | -2.2671        | -2.9948          | 0.7209             | 0.7278          | -469.8113      | -389.4626    | -1.7766         | -1.7876       |
| 0.5017        | 0.98  | 1700 | 0.5426          | -2.2658        | -2.9947          | 0.7215             | 0.7289          | -469.7990      | -389.3405    | -1.7796         | -1.7904       |


### Framework versions

- Transformers 4.39.0.dev0
- Pytorch 2.3.0+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2