trl-lib/Qwen2-0.5B-OnlineDPO

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints

Qwen2-0.5B-OnlineDPO / vocab.json

qgallouedec HF staff

trl-lib/ultrafeedback-prompt

f7dcc37 verified 29 days ago

2.78 MB
