DRXD1000 committed
Commit 2b5a943
1 Parent(s): 98e6f76

Update README.md

Files changed (1): README.md (+34, -7)
README.md CHANGED
@@ -20,13 +20,12 @@ tags:
 
 # Model Card for Phoenix
 
-
 **Phoenix** is a model trained using Direct Preference Optimization (DPO) for the German language. Its training procedure follows the process of the alignment-handbook from Huggingface.
 In contrast to Zephyr and Notus, this model has been trained using German instruction and DPO data. In detail, German translations of HuggingFaceH4/ultrachat_200k
 and HuggingFaceH4/ultrafeedback_binarized were created in addition to a series of already available instruction datasets. The LLM haoranxu/ALMA-13B was used for this.
 While the Mistral model performs really well, it is not really suitable for the German language. Therefore we have used the fantastic LeoLM/leo-mistral-hessianai-7b.
 Thanks to the new type of training, Phoenix is not only able to compete with the Mistral model from LeoLM but also **beats the Llama-70b-chat model in 2 MT-Bench categories**.
-This model **wouldn't have been possible without the amazing work of Huggingface, LeoLM, openbnb, Argilla the Alma-Team and many others of the AI community**.
+This model **wouldn't have been possible without the amazing work of Huggingface, LeoLM, openbnb, argilla, the Alma-Team and many others of the AI community**.
 I would like to personally thank all AI researchers who make the training of such models possible.
 
 ## MT-Bench-DE Scores
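
For readers who want to picture the dataset-translation step described in the hunk above, a minimal sketch of translating one instruction with haoranxu/ALMA-13B follows. This is an editorial illustration, not code from the commit: the prompt layout follows ALMA's usual "Translate this from English to German" template, and the example text and generation settings are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative only: translate one English instruction to German with ALMA-13B.
# Prompt format and generation settings are assumptions, not taken from the commit.
model = AutoModelForCausalLM.from_pretrained("haoranxu/ALMA-13B", torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("haoranxu/ALMA-13B")

english_text = "How do I sort a list of integers in Python?"
prompt = f"Translate this from English to German:\nEnglish: {english_text}\nGerman:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Keep only the newly generated tokens (the German translation).
german_text = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(german_text.strip())
```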
@@ -72,7 +71,7 @@ Florian Leurer compared Phoenix to other LLMs. Check it out here:
 ### Model Sources
 
 - **Repository:** -
-- **Paper:** in progress
+- **Paper:** [`PHOENIX: Open-Source Language Adaption for Direct Preference Optimization`](https://arxiv.org/abs/2401.10580)
 - **Demo:** -
 
 ## Training Details
@@ -116,8 +115,8 @@ You will first need to install `transformers` and `accelerate` (just to ease the
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
-model = AutoModelForCausalLM.from_pretrained("DRXD1000/Phoenix-AWQ", torch_dtype=torch.bfloat16, device_map="auto")
-tokenizer = AutoTokenizer.from_pretrained("DRXD1000/Phoenix-AWQ")
+model = AutoModelForCausalLM.from_pretrained("DRXD1000/Phoenix", torch_dtype=torch.bfloat16, device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained("DRXD1000/Phoenix")
 prompt = """<|system|>
 </s>
 <|user|>
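
The README's usage snippet is split across this hunk and the next (the `response = tokenizer.decode(...)` line appears in the next hunk header). Pieced together, a complete version of the updated snippet might look like the sketch below; only the model ID and the `<|system|>`/`<|user|>`/`<|assistant|>` prompt tags come from the diff, while the example question and generation settings are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the DPO-aligned model referenced in the hunk above.
model = AutoModelForCausalLM.from_pretrained("DRXD1000/Phoenix", torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("DRXD1000/Phoenix")

# Zephyr-style prompt tags as shown in the README; the question itself is a placeholder.
prompt = """<|system|>
</s>
<|user|>
Was ist Direct Preference Optimization?</s>
<|assistant|>
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generation settings are illustrative, not values stated in the model card.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```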
@@ -131,9 +130,9 @@ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
 
 ## Ethical Considerations and Limitations
 
-As with all LLMs, the potential outputs of `DRXD1000/Phoenix-AWQ` cannot be predicted
+As with all LLMs, the potential outputs of `DRXD1000/Phoenix` cannot be predicted
 in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses
-to user prompts. Therefore, before deploying any applications of `DRXD1000/Phoenix-AWQ`, developers should
+to user prompts. Therefore, before deploying any applications of `DRXD1000/Phoenix`, developers should
 perform safety testing and tuning tailored to their specific applications of the model.
 Please see Meta's [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/).
 
@@ -144,6 +143,22 @@ Please see Meta's [Responsible Use Guide](https://ai.meta.com/llama/responsible-
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
+
+#### SFT Training
+- learning_rate: 2e-05
+- train_batch_size: 32
+- eval_batch_size: 16
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 8
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 512
+- total_eval_batch_size: 128
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- num_epochs: 1
+
+#### DPO Training
 - learning_rate: 5e-07
 - train_batch_size: 8
 - eval_batch_size: 4
@@ -157,6 +172,18 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 1
 
+### Citation
+```
+@misc{uhlig2024phoenix,
+      title={PHOENIX: Open-Source Language Adaption for Direct Preference Optimization},
+      author={Matthias Uhlig and Sigurd Schacht and Sudarshan Kamath Barkur},
+      year={2024},
+      eprint={2401.10580},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
+
 
 ### Framework versions
 
 
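For orientation, the DPO hyperparameters listed in the last two hunks map onto a `trl` `DPOTrainer` run roughly as sketched below. This is a hedged reconstruction rather than the authors' script: the alignment-handbook drives training through its own recipes, the `beta` value, sequence lengths and the toy dataset are assumptions, and the exact `trl` API surface varies between versions.

```python
import torch
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

# Base model named in the model card.
model_id = "LeoLM/leo-mistral-hessianai-7b"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA-style tokenizers ship without a pad token

# Tiny toy preference set standing in for the German ultrafeedback translation
# described in the card; columns follow the prompt/chosen/rejected convention.
train_dataset = Dataset.from_dict({
    "prompt": ["<|user|>\nWas ist DPO?</s>\n<|assistant|>\n"],
    "chosen": ["DPO optimiert ein Sprachmodell direkt auf Praeferenzpaaren.</s>"],
    "rejected": ["Keine Ahnung.</s>"],
})

args = TrainingArguments(
    output_dir="phoenix-dpo",
    learning_rate=5e-7,              # from the hyperparameter list above
    per_device_train_batch_size=8,   # from the hyperparameter list above
    per_device_eval_batch_size=4,    # from the hyperparameter list above
    warmup_ratio=0.1,                # from the hyperparameter list above
    num_train_epochs=1,              # from the hyperparameter list above
    bf16=True,
    remove_unused_columns=False,     # DPOTrainer does its own column handling
)

trainer = DPOTrainer(
    model,
    ref_model=None,         # trl clones the model as a frozen reference when None
    args=args,
    beta=0.1,               # assumption: common default, not stated in the card
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    max_length=1024,        # assumption
    max_prompt_length=512,  # assumption
)
trainer.train()
```

The multi-GPU setup listed in the card (8 devices, gradient accumulation) would be expressed through the launcher and `gradient_accumulation_steps` rather than in this single-process sketch.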