Update README.md
README.md CHANGED
|
# Model Card for Phoenix

**Phoenix** is a model trained with Direct Preference Optimization (DPO) for the German language. Its training procedure follows the process of the alignment-handbook from Hugging Face.
In contrast to Zephyr and Notus, this model has been trained on German instruction and DPO data. In detail, German translations of HuggingFaceH4/ultrachat_200k
and HuggingFaceH4/ultrafeedback_binarized were created in addition to a series of already available instruction datasets. The LLM haoranxu/ALMA-13B was used for the translation.
While the Mistral model performs really well, it is not well suited to the German language. Therefore we have used the fantastic LeoLM/leo-mistral-hessianai-7b.
Thanks to this new type of training, Phoenix is not only able to compete with the Mistral model from LeoLM but also **beats the Llama-70b-chat model in 2 MT-Bench categories**.
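For illustration, each DPO training example pairs a prompt with a preferred and a dispreferred answer; the made-up German record below only sketches that `prompt`/`chosen`/`rejected` structure and is not taken from the translated datasets:

```python
# Illustrative only: the German text below is invented to show the
# prompt/chosen/rejected structure that DPO training expects.
preference_example = {
    "prompt": "Erkläre kurz den Unterschied zwischen Wetter und Klima.",
    "chosen": "Wetter beschreibt den kurzfristigen Zustand der Atmosphäre, "
              "Klima dagegen die langfristige Statistik des Wetters über Jahrzehnte.",
    "rejected": "Wetter und Klima sind im Grunde dasselbe.",
}
```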
This model **wouldn't have been possible without the amazing work of Hugging Face, LeoLM, openbnb, argilla, the Alma-Team and many others in the AI community**.
I would like to personally thank all AI researchers who make the training of such models possible.

## MT-Bench-DE Scores
### Model Sources
- **Repository:** -
- **Paper:** [`PHOENIX: Open-Source Language Adaption for Direct Preference Optimization`](https://arxiv.org/abs/2401.10580)
- **Demo:** -

## Training Details
You will first need to install `transformers` and `accelerate` (just to ease the device placement). Then you can run the example below:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("DRXD1000/Phoenix", torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("DRXD1000/Phoenix")
# Zephyr-style prompt; the user question below is only a placeholder.
prompt = """<|system|>
</s>
<|user|>
Was ist Direct Preference Optimization?</s>
<|assistant|>
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)  # illustrative settings
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
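Assuming the tokenizer ships a chat template for this Zephyr-style format (not verified here; fall back to the manual prompt above otherwise), the same prompt can also be built with `apply_chat_template`:

```python
messages = [
    {"role": "system", "content": ""},
    {"role": "user", "content": "Was ist Direct Preference Optimization?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
```
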
## Ethical Considerations and Limitations
As with all LLMs, the potential outputs of `DRXD1000/Phoenix` cannot be predicted
in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses
to user prompts. Therefore, before deploying any applications of `DRXD1000/Phoenix`, developers should
perform safety testing and tuning tailored to their specific applications of the model.
Please see Meta's [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/).

### Training hyperparameters
The following hyperparameters were used during training:

#### SFT Training
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 2
- total_train_batch_size: 512
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 1
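As a rough sketch (not the exact training script; the run follows the alignment-handbook recipes), the SFT values above correspond to a `transformers.TrainingArguments` setup along these lines. `output_dir` and `bf16` are assumptions, and the total train batch size of 512 comes from 32 per device × 8 GPUs × 2 gradient-accumulation steps when launched with `accelerate`:

```python
from transformers import TrainingArguments

# Sketch of the SFT stage settings listed above; output_dir and bf16 are assumptions.
sft_args = TrainingArguments(
    output_dir="phoenix-sft",          # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,     # 32 per device * 8 GPUs * 2 steps = 512 total
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    seed=42,
    bf16=True,                         # assumption
)
```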

#### DPO Training
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 4
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
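Similarly, a minimal sketch of how the DPO stage could be set up with `trl`'s `DPOTrainer` (API as of trl 0.7.x), using the values listed above. The checkpoint path, dataset name, `beta` and `bf16` are assumptions for illustration, not the exact Phoenix recipe:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

# Placeholders: the SFT checkpoint and the German ultrafeedback translation
# are not published under these names.
sft_model = AutoModelForCausalLM.from_pretrained("path/to/phoenix-sft")
sft_tokenizer = AutoTokenizer.from_pretrained("path/to/phoenix-sft")
train_dataset = load_dataset("path/to/ultrafeedback-binarized-german", split="train")

dpo_args = TrainingArguments(
    output_dir="phoenix-dpo",          # placeholder
    learning_rate=5e-7,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    warmup_ratio=0.1,
    num_train_epochs=1,
    bf16=True,                         # assumption
)

trainer = DPOTrainer(
    model=sft_model,
    ref_model=None,                    # trl creates a frozen reference copy when None
    args=dpo_args,
    beta=0.1,                          # assumption; beta is not listed above
    train_dataset=train_dataset,       # needs "prompt", "chosen", "rejected" columns
    tokenizer=sft_tokenizer,
)
trainer.train()
```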

### Citation
```
@misc{uhlig2024phoenix,
      title={PHOENIX: Open-Source Language Adaption for Direct Preference Optimization},
      author={Matthias Uhlig and Sigurd Schacht and Sudarshan Kamath Barkur},
      year={2024},
      eprint={2401.10580},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

### Framework versions