microsoft/Phi-3.5-vision-instruct

Image-Text-to-Text

text-generation

nguyenbh committed on Aug 20

Commit 0eb7016 • 1 Parent(s): bea9545

Update README.md

Files changed (1)

README.md +13 -0

@@ -257,6 +257,19 @@ To understand the capabilities, we compare Phi-3.5-vision with a set of models o
 | Document Intelligence | TextVQA (val) | 72.0 | 66.2 | 68.8 | 67.4 | 70.9 | 70.5 | 64.5 | 75.6 |
 | Object visual presence verification | POPE (test) | 86.1 | 83.3 | 84.2 | 86.1 | 83.6 | 76.6 | 89.3 | 87.0 |
+## Safety Evaluation and Red-Teaming
+**Approach**
+The Phi-3 family of models has adopted a robust safety post-training approach. This approach leverages a variety of both open-source and in-house generated datasets.
+Safety alignment combines SFT (Supervised Fine-Tuning) and RLHF (Reinforcement Learning from Human Feedback),
+using human-labeled and synthetic English-language datasets. These include publicly available datasets focused on helpfulness and harmlessness,
+as well as questions and answers targeting multiple safety categories.
+**Safety Evaluation**
+We leveraged various evaluation techniques including red teaming, adversarial conversation simulations, and safety evaluation benchmark datasets to evaluate Phi-3.5
+models' propensity to produce undesirable outputs across multiple risk categories. Several approaches were used to compensate for the limitations of one approach alone.
+Please refer to the [technical report](https://arxiv.org/pdf/2404.14219) for more details of our safety alignment.
 ## Software
 * [PyTorch](https://github.com/pytorch/pytorch)