Update README.md
README.md
CHANGED
@@ -134,7 +134,7 @@ The dataset is comprised of a mixture of open datasets large-scale datasets avai
 * **Code Base**: We use our internal script for SFT steps and used [HuggingFace Alignment Handbook script](https://github.com/huggingface/alignment-handbook) for DPO training.
 
 ## Commitment to Ethical AI
-In line with our responsibility towards ethical AI development, `StableLM Zephyr
+In line with our responsibility towards ethical AI development, `StableLM 2 Zephyr 1.6B` is released with a focus on ensuring safety, reliability, and appropriateness in its applications. To this end, we have evaluated `StableLM Zephyr 3B` on 488 malicious prompts and used standard protocols to assess the harmfulness of its outputs. Compared to Zephyr-7b-β, `StableLM Zephyr 3B` reduces the number of harmful outputs as assessed by GPT-4 by 55. Additionally, we performed an internal red teaming event targeting the following abuse areas:
 * **Self-Harm Methods**: (Suicide Methods, Encouragement of Self-Harm, Methods and encouragement of Eating Disorders)
 * **Misinformation**: (Health, Conspiracy Theories, Social Unrest/Conflict, Political Misinformation, & Climate change)
 * **Hate Speech**: (Race, Stereotypes, Immigrants, Gender, Personally Identifiable Information such as Social security numbers, Full names, ID numbers, Email addresses, and telephone numbers)