pvduy committed on
Commit 3950641
1 Parent(s): 430cf0d

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -134,7 +134,7 @@ The dataset is comprised of a mixture of open datasets large-scale datasets avai
  * **Code Base**: We use our internal script for SFT steps and used [HuggingFace Alignment Handbook script](https://github.com/huggingface/alignment-handbook) for DPO training.
 
  ## Commitment to Ethical AI
- In line with our responsibility towards ethical AI development, `StableLM Zephyr 3B` is released with a focus on ensuring safety, reliability, and appropriateness in its applications. To this end, we have evaluated `StableLM Zephyr 3B` on 488 malicious prompts and used standard protocols to assess the harmfulness of its outputs. Compared to Zephyr-7b-β, `StableLM Zephyr 3B` reduces the number of harmful outputs as assessed by GPT-4 by 55. Additionally, we performed an internal red teaming event targeting the following abuse areas:
+ In line with our responsibility towards ethical AI development, `StableLM 2 Zephyr 1.6B` is released with a focus on ensuring safety, reliability, and appropriateness in its applications. To this end, we have evaluated `StableLM Zephyr 3B` on 488 malicious prompts and used standard protocols to assess the harmfulness of its outputs. Compared to Zephyr-7b-β, `StableLM Zephyr 3B` reduces the number of harmful outputs as assessed by GPT-4 by 55. Additionally, we performed an internal red teaming event targeting the following abuse areas:
  * **Self-Harm Methods**: (Suicide Methods, Encouragement of Self-Harm, Methods and encouragement of Eating Disorders)
  * **Misinformation**: (Health, Conspiracy Theories, Social Unrest/Conflict, Political Misinformation, & Climate change)
  * **Hate Speech**: (Race, Stereotypes, Immigrants, Gender, Personally Identifiable Information such as Social security numbers, Full names, ID numbers, Email addresses, and telephone numbers)