blai88/reward_modeling_anthropic_hh
PEFT
Safetensors
opt
trl
reward-trainer
Generated from Trainer
License: llama2
main / reward_modeling_anthropic_hh / README.md
Commit History
End of training (c6a2310, verified): blai88 committed on Jul 6
End of training (39ad302, verified): blai88 committed on Jul 6