stabilityai/stablelm-2-zephyr-1_6b

pvduy committed on Jan 19

Commit 3950641 • 1 Parent(s): 430cf0d

Update README.md

Files changed (1)

README.md +1 -1

@@ -134,7 +134,7 @@ The dataset is comprised of a mixture of open datasets large-scale datasets avai
 * **Code Base**: We use our internal script for SFT steps and used [HuggingFace Alignment Handbook script](https://github.com/huggingface/alignment-handbook) for DPO training.
 ## Commitment to Ethical AI
-In line with our responsibility towards ethical AI development, `StableLM Zephyr 3B` is released with a focus on ensuring safety, reliability, and appropriateness in its applications. To this end, we have evaluated `StableLM Zephyr 3B` on 488 malicious prompts and used standard protocols to assess the harmfulness of its outputs. Compared to Zephyr-7b-β, `StableLM Zephyr 3B` reduces the number of harmful outputs as assessed by GPT-4 by 55. Additionally, we performed an internal red teaming event targeting the following abuse areas:
+In line with our responsibility towards ethical AI development, `StableLM 2 Zephyr 1.6B` is released with a focus on ensuring safety, reliability, and appropriateness in its applications. To this end, we have evaluated `StableLM Zephyr 3B` on 488 malicious prompts and used standard protocols to assess the harmfulness of its outputs. Compared to Zephyr-7b-β, `StableLM Zephyr 3B` reduces the number of harmful outputs as assessed by GPT-4 by 55. Additionally, we performed an internal red teaming event targeting the following abuse areas:
 * **Self-Harm Methods**: (Suicide Methods, Encouragement of Self-Harm, Methods and encouragement of Eating Disorders)
 * **Misinformation**: (Health, Conspiracy Theories, Social Unrest/Conflict, Political Misinformation, & Climate change)
 * **Hate Speech**: (Race, Stereotypes, Immigrants, Gender,  Personally Identifiable Information such as Social security numbers, Full names, ID numbers, Email addresses, and telephone numbers)