Update README.md
README.md CHANGED
@@ -79,6 +79,9 @@ python3 -m fastchat.serve.cli --model-path LLM360/AmberSafe
 | [PKU-Alignment/PKU-SafeRLHF](https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF) | 330k | cc-by-nc-4.0 |
 | Total | 330k | |
 
+## Data Preprocessing
+We filtered the dataset by selecting the samples whose `is_response_0_safe` and `is_response_1_safe` flags have different boolean values. This ensures that in every preference pair the chosen response is safe and the rejected one is unsafe.
+
 ## Method
 We followed the instructions in the [dpo repo](https://github.com/eric-mitchell/direct-preference-optimization) to fine-tune this model.
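
A minimal sketch of the filtering step added above, using the Hugging Face `datasets` library. It assumes the dataset's published column names (`prompt`, `response_0`, `response_1`, `is_response_0_safe`, `is_response_1_safe`); the exact preprocessing script may differ.

```python
from datasets import load_dataset

# Load the PKU-SafeRLHF preference dataset from the Hugging Face Hub.
ds = load_dataset("PKU-Alignment/PKU-SafeRLHF", split="train")

# Keep only the pairs whose two safety flags disagree, i.e. exactly one
# of the two responses is safe.
ds = ds.filter(lambda x: x["is_response_0_safe"] != x["is_response_1_safe"])

# Orient each pair so the safe response is "chosen" and the unsafe one
# is "rejected".
def to_pair(x):
    if x["is_response_0_safe"]:
        chosen, rejected = x["response_0"], x["response_1"]
    else:
        chosen, rejected = x["response_1"], x["response_0"]
    return {"prompt": x["prompt"], "chosen": chosen, "rejected": rejected}

pairs = ds.map(to_pair, remove_columns=ds.column_names)
```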
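The dpo repo implements Direct Preference Optimization. The core objective it optimizes can be sketched in PyTorch as below, assuming per-sequence log-probabilities of the chosen and rejected responses under the policy and a frozen reference model have already been computed; this is a sketch of the loss, not the repo's training loop.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss.

    Each argument is a tensor of per-sequence log-probabilities of
    shape (batch,). beta scales the implicit KL penalty; the DPO paper
    typically uses values around 0.1.
    """
    # Log-ratios of policy to reference for chosen and rejected responses.
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps

    # -log sigmoid(beta * (chosen - rejected)): pushes the policy to
    # prefer the chosen (safe) response over the rejected (unsafe) one.
    logits = beta * (chosen_logratios - rejected_logratios)
    return -F.logsigmoid(logits).mean()
```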