hendrydong committed
Commit 87cf99e
1 Parent(s): 2e78e24

Update README.md

Files changed (1)
  1. README.md +23 -5
README.md CHANGED
@@ -178,10 +178,28 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 
  [More Information Needed]
 
- ## Model Card Authors [optional]
-
- [More Information Needed]
+ ## References
+
+ If you found this helpful, please cite the following papers.
+
+ ```bibtex
+ @article{dong2023raft,
+ title={Raft: Reward ranked finetuning for generative foundation model alignment},
+ author={Dong, Hanze and Xiong, Wei and Goyal, Deepanshu and Pan, Rui and Diao, Shizhe and Zhang, Jipeng and Shum, Kashun and Zhang, Tong},
+ journal={arXiv preprint arXiv:2304.06767},
+ year={2023}
+ }
+
+ @misc{xiong2024iterative,
+ title={Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint},
+ author={Wei Xiong and Hanze Dong and Chenlu Ye and Ziqi Wang and Han Zhong and Heng Ji and Nan Jiang and Tong Zhang},
+ year={2024},
+ eprint={2312.11456},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG}
+ }
+ ```
 
- ## Model Card Contact
+ ## Contact
 
- [More Information Needed]
+ If you have any questions, please contact hanze dot dong AT salesforce dot com.