zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo / model-00003-of-00003.safetensors

Commit History

Model save
44bba64
verified

sfulay committed on

Training in progress, step 400
374e156
verified

sfulay committed on

Training in progress, step 300
2c7d48d
verified

sfulay committed on

Training in progress, step 200
133d893
verified

sfulay committed on

Training in progress, step 100
242749d
verified

sfulay committed on