akifumiwachi
committed on
Commit b596248 · 1 Parent(s): 15ddeb1
Update README.md
README.md
CHANGED
@@ -31,6 +31,7 @@ tags:
 - **Fine-tuned from model:** [Alpaca (reprod.)](https://huggingface.co/PKU-Alignment/alpaca-7b-reproduced) (reproduced version of [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca))
 - **Dataset:** [PKU-SafeRLHF-30K](https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF-30K)
 - **SACPO Paper:** <https://arxiv.org/abs/2404.11049>
+- **GitHub:** <https://github.com/line/sacpo>
 - **Model Alias:** SACPO: DPO (H) -> DPO (S) 0.025
 
 ## Usage: How to Talk with the Model