zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo / model-00003-of-00003.safetensors

Commit History

Model save
9b16a50
verified

sfulay committed on

Model save
4d79451
verified

sfulay committed on

Model save
0a3e251
verified

sfulay committed on

Model save
44bba64
verified

sfulay committed on

Training in progress, step 400
374e156
verified

sfulay committed on

Training in progress, step 300
2c7d48d
verified

sfulay committed on

Training in progress, step 200
133d893
verified

sfulay committed on

Training in progress, step 100
242749d
verified

sfulay committed on