nonstopfor committed
Commit d9d5698 • 1 Parent(s): 99e90a6
Update README.md
README.md CHANGED
@@ -5,7 +5,7 @@ language:
 - zh
 ---
 ## Introduction
-The ShieldLM model ([paper link](
+The ShieldLM model ([paper link](https://arxiv.org/abs/2402.16444)) initialized from [Baichuan2-13B-Chat](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat). ShieldLM is a bilingual (Chinese and English) safety detector that mainly aims to help to detect safety issues in LLMs' generations. It aligns with general human safety standards, supports fine-grained customizable detection rules, and provides explanations for its decisions.
 Refer to our [github repository](https://github.com/thu-coai/ShieldLM) for more detailed information.
 
 ## Usage
@@ -13,4 +13,4 @@ Please refer to our [github repository](https://github.com/thu-coai/ShieldLM) fo
 
 ## Performance
 ShieldLM demonstrates impressive detection performance across 4 ID and OOD test sets, compared to strong baselines such as GPT-4, Llama Guard and Perspective API.
-Refer to [our paper](
+Refer to [our paper](https://arxiv.org/abs/2402.16444) for more detailed evaluation results.