honggen/hard_dpo
Text Generation
Anthropic/hh-rlhf
English
License: apache-2.0
This is the reference model, obtained by supervised fine-tuning on the chosen responses of the Anthropic/hh-rlhf dataset.
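As an illustration of what "fine-tuning on the chosen response" could look like at the data level, here is a minimal sketch. It assumes the usual hh-rlhf layout, in which each record has `chosen` and `rejected` transcripts made of `\n\nHuman:` and `\n\nAssistant:` turns. The helper `split_chosen` and the example record are hypothetical, and this is not necessarily the preprocessing used for this model.

```python
# Hypothetical helper: turn an hh-rlhf style "chosen" transcript into a
# (prompt, response) pair suitable for supervised fine-tuning.
ASSISTANT_TAG = "\n\nAssistant:"


def split_chosen(chosen: str) -> tuple[str, str]:
    """Split a transcript at its last assistant turn.

    Returns the prompt (everything up to and including the final
    "Assistant:" tag) and the final assistant response.
    """
    idx = chosen.rfind(ASSISTANT_TAG)
    if idx == -1:
        raise ValueError("no assistant turn found in transcript")
    cut = idx + len(ASSISTANT_TAG)
    return chosen[:cut], chosen[cut:].strip()


# Made-up record in the assumed format (not taken from the dataset).
example = {
    "chosen": "\n\nHuman: What is the capital of France?\n\nAssistant: Paris.",
    "rejected": "\n\nHuman: What is the capital of France?\n\nAssistant: I don't know.",
}

# Only the chosen transcript is used for the SFT target; the rejected one is ignored.
prompt, response = split_chosen(example["chosen"])
```

In such a setup the loss would typically be computed on `response` only, with `prompt` serving as context.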
Dataset used to train honggen/hard_dpo: Anthropic/hh-rlhf (updated May 26, 2023)