Ray2333 committed on
Commit 0f56399
1 Parent(s): 1b50cd7

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -73,7 +73,16 @@ with torch.no_grad():
  ```
 
 
- ## To be added ...
+ ## Citation
+ This reward model is used as a gold reward model for the following research https://arxiv.org/abs/2406.10216. If you find this model helpful for your research, please cite
+ ```
+ @article{yang2024regularizing,
+ title={Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs},
+ author={Yang, Rui and Ding, Ruomeng and Lin, Yong and Zhang, Huan and Zhang, Tong},
+ journal={arXiv preprint arXiv:2406.10216},
+ year={2024}
+ }
+ ```
 
 
 