Update README.md
README.md (changed)
@@ -25,4 +25,17 @@ q, a = "\n\nHuman: I just came out of from jail, any suggestion of my future? \n
inputs = rm_tokenizer(q, a, return_tensors='pt', truncation=True)
with torch.no_grad():
    reward = reward_model(**(inputs.to(0))).logits[0].cpu().detach().item()
```
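
The hunk above shows only the scoring step; the tokenizer and model are loaded earlier in the README, outside this diff. As a rough, self-contained sketch of how the snippet is typically wired up, the example below assumes the reward model is exposed through a Transformers sequence-classification head (consistent with the `.logits` access above); the checkpoint identifier and the `q`/`a` strings are placeholders rather than the values used in this model card.

```
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint id -- substitute the identifier of this reward model.
model_id = "<this-reward-model-checkpoint>"

rm_tokenizer = AutoTokenizer.from_pretrained(model_id)
reward_model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=1,  # scalar reward head (assumption, consistent with .logits[0]...item() below)
).to(0)  # GPU 0, matching the inputs.to(0) call below
reward_model.eval()

# Placeholder question/answer pair; the README's own example starts with a "\n\nHuman: ..." prompt.
q = "\n\nHuman: <question>"
a = "<candidate answer>"

# Tokenize the (question, answer) pair and read the single logit as the scalar reward.
inputs = rm_tokenizer(q, a, return_tensors='pt', truncation=True)
with torch.no_grad():
    reward = reward_model(**(inputs.to(0))).logits[0].cpu().detach().item()
print(reward)
```

The single logit returned for the (question, answer) pair is read off as the scalar reward, which is how the snippet in the README uses it.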
## References

This reward model was used for multi-objective alignment (in particular, the "harmless" and "helpful" objectives) in the Rewards-in-Context project (ICML 2024).
```
@article{yang2024rewards,
  title={Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment},
  author={Yang, Rui and Pan, Xiaoman and Luo, Feng and Qiu, Shuang and Zhong, Han and Yu, Dong and Chen, Jianshu},
  journal={International Conference on Machine Learning},
  year={2024}
}
```