RuiyangSun committed
Commit c2f25b2
1 Parent(s): 32e35c1
Update README.md
README.md CHANGED
@@ -35,6 +35,7 @@ It can play a role in the safe RLHF algorithm, helping the Beaver model become m
 - **Dataset:** <https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF>
 - **Reward Model:** <https://huggingface.co/PKU-Alignment/beaver-7b-v1.0-reward>
 - **Cost Model:** <https://huggingface.co/PKU-Alignment/beaver-7b-v1.0-cost>
+- **Dataset Paper:** <https://arxiv.org/abs/2307.04657>
 - **Paper:** *Coming soon...*
 
 ## How to Use the Cost Model