argilla/ultrafeedback-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 60.9k • 8.54k • 125
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_helpfulness Viewer • Updated Jun 12 • 60.9k • 43
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_truthfulness Viewer • Updated Jun 12 • 60.9k • 42
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_instruction_following Viewer • Updated Jun 12 • 60.9k • 42 • 3
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_honesty Viewer • Updated Jun 12 • 60.9k • 33
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_pythia6.9b Viewer • Updated Jun 20 • 177k • 53
yaswanthchittepu/ultrafeedback-binarized-standard-margin-data-full Viewer • Updated Jul 7 • 63.7k • 43
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_pythia1b Viewer • Updated May 16 • 177k • 42
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706887192 Viewer • Updated Feb 2 • 405 • 48
argilla/ultrafeedback-multi-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 158k • 197 • 6
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 22 • 568k • 83
ShenaoZ/0.001_3iters_bs128_declr_nodpo_zephyrbeta_userresponse_dataset Viewer • Updated Apr 26 • 67.1k • 36
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1707245027 Viewer • Updated Feb 7 • 1M • 100
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_lora-sft-finetuned-stage4-iter86000 Viewer • Updated May 22 • 20.8k • 34
giux78/50000-60900-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 17 • 10.9k • 35
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3 Viewer • Updated Jun 18 • 21.1k • 35
alvarobartt/ultrafeedback-multi-binarized-quality-preferences-cleaned Viewer • Updated Dec 20, 2023 • 155k • 37
quirky-lats-at-mats/NORMAL_BACKDOOR_alpaca_sleeper_agents_toy_safety_NOT_TRUNCATED_v4 Viewer • Updated Mar 11 • 2.83k • 34
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 21 • 568k • 468
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_1.0_seed_1 Viewer • Updated Mar 21 • 568k • 96
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 22 • 568k • 302
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 23 • 568k • 96
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.3_seed_1 Viewer • Updated Mar 25 • 189k • 46
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 25 • 568k • 103
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.3_seed_2 Viewer • Updated Mar 25 • 189k • 58
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 25 • 189k • 41
Mitsuki-Sakamoto/alfa-deberta-re-pref-64-fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 26 • 94.6k • 44
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.9 Viewer • Updated Mar 26 • 568k • 65
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_0 Viewer • Updated Jun 17 • 5k • 34
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_minpi_part_3 Viewer • Updated Jun 18 • 21.1k • 37
reshinthadith/pairwise-code-review-instruct-critique-revision-python Viewer • Updated Jan 9, 2023 • 5.24k • 69 • 7
NickyNicky/neovalle_H4rmony_dpo_translated_English_to_Spanish Viewer • Updated May 17 • 2.02k • 42 • 4
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330973 Viewer • Updated Feb 7 • 167 • 48
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.2_self_160m Viewer • Updated Mar 14 • 37.9k • 38
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-12_filter_gold_thr_0.1_self_160m Viewer • Updated Mar 21 • 37.9k • 40
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.3_seed_1 Viewer • Updated Mar 21 • 568k • 197
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.1_seed_2 Viewer • Updated Mar 23 • 568k • 121
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.3_seed_3 Viewer • Updated Mar 21 • 568k • 190
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 21 • 568k • 123
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_1 Viewer • Updated Mar 23 • 568k • 170
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 22 • 568k • 119
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 23 • 568k • 123
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_3 Viewer • Updated Mar 23 • 568k • 53
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 23 • 568k • 92
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.1_seed_1 Viewer • Updated Mar 25 • 189k • 65
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 24 • 189k • 49
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.1_seed_2 Viewer • Updated Mar 24 • 189k • 62
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 24 • 189k • 54
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 24 • 568k • 171
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 24 • 568k • 249
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 25 • 189k • 50
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.3_seed_2 Viewer • Updated Mar 25 • 189k • 40
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_1.0_seed_3 Viewer • Updated Mar 25 • 189k • 53
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_14m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 117
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.25 Viewer • Updated Mar 26 • 568k • 84
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.9 Viewer • Updated Mar 27 • 568k • 152
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.9 Viewer • Updated Mar 27 • 568k • 97
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_3_t_1.0_eval Viewer • Updated Mar 30 • 568k • 124
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_3 Viewer • Updated May 9 • 4.85k • 36
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_1 Viewer • Updated May 20 • 5k • 37
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_2 Viewer • Updated May 20 • 5k • 35
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_0 Viewer • Updated May 20 • 5.28k • 31
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized Viewer • Updated Jun 12 • 60.9k • 39
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_0 Viewer • Updated Jun 17 • 5k • 40
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_3 Viewer • Updated Jun 17 • 5.29k • 33
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_3 Viewer • Updated Jun 18 • 5.29k • 32
y1xing/orpo_llama3_concatenated_data_with_chris_examples_orpo_instruct_dataset Viewer • Updated Jul 6 • 2.64k • 36
argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned Viewer • Updated Dec 11, 2023 • 155k • 44 • 4
NickyNicky/DIBT_prompts_ranked_En_Es_orpo_dpo_chatML_gemma_V3 Viewer • Updated May 14 • 20.4k • 36 • 1
giux78/10000-20000-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 16 • 10k • 56
giux78/20000-50000-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 17 • 30k • 37
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1706885434 Viewer • Updated Feb 2 • 24 • 43
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706903049 Viewer • Updated Feb 2 • 167 • 39
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707331096 Viewer • Updated Feb 7 • 87 • 64
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707331527 Viewer • Updated Feb 7 • 462 • 43
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.2_self_70m Viewer • Updated Mar 14 • 37.9k • 43
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.5_self_160m Viewer • Updated Mar 14 • 37.9k • 56
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-12_filter_gold_thr_0.3_self_160m Viewer • Updated Mar 21 • 37.9k • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-12_filter_gold_thr_1.0_self_160m Viewer • Updated Mar 21 • 18.9k • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.1_seed_1 Viewer • Updated Mar 21 • 568k • 90
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_1.0_seed_2 Viewer • Updated Mar 22 • 568k • 144
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 21 • 568k • 199
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 23 • 568k • 473
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 21 • 568k • 192
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 21 • 568k • 270
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 21 • 568k • 100
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.3_seed_2 Viewer • Updated Mar 21 • 568k • 130
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.1_seed_3 Viewer • Updated Mar 22 • 568k • 315
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 24 • 568k • 248
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1 Viewer • Updated Mar 22 • 568k • 248
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_1 Viewer • Updated Mar 22 • 568k • 58
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_2 Viewer • Updated Mar 22 • 568k • 99
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_2 Viewer • Updated Mar 22 • 568k • 76
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 22 • 568k • 198
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3 Viewer • Updated Mar 23 • 568k • 77
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_3 Viewer • Updated Mar 23 • 568k • 174
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_3 Viewer • Updated Mar 23 • 568k • 79
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 24 • 511k • 135
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_1.0_seed_2 Viewer • Updated Mar 24 • 189k • 60
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.1_seed_3 Viewer • Updated Mar 24 • 189k • 91
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_1.0_seed_3 Viewer • Updated Mar 24 • 189k • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.3_seed_3 Viewer • Updated Mar 24 • 189k • 71
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 25 • 189k • 53
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 24 • 189k • 54
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.3_seed_1 Viewer • Updated Mar 25 • 189k • 62
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 25 • 189k • 79
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 25 • 189k • 43
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.1_seed_2 Viewer • Updated Mar 25 • 189k • 64
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 25 • 189k • 41
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.1_seed_3 Viewer • Updated Mar 25 • 189k • 44
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.3_seed_3 Viewer • Updated Mar 25 • 189k • 40
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_14m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 84
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 77
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 73
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 132
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 87
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 72
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 26 • 568k • 145
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.25 Viewer • Updated Mar 26 • 568k • 64
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.75 Viewer • Updated Mar 26 • 568k • 102
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.9 Viewer • Updated Mar 26 • 568k • 76
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.7 Viewer • Updated Mar 27 • 568k • 75
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.5 Viewer • Updated Mar 27 • 568k • 86
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_1.0_eval Viewer • Updated Mar 28 • 568k • 278
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_1_t_1.0_eval Viewer • Updated Mar 29 • 568k • 115
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_2_t_1.0_eval Viewer • Updated Mar 29 • 568k • 278
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_1_t_1.0_eval Viewer • Updated Mar 30 • 568k • 167
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_2_t_1.0_eval Viewer • Updated Mar 30 • 568k • 281
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo_temp0.7 Viewer • Updated Apr 7 • 20k • 36
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo_temp0.7_length128 Viewer • Updated Apr 14 • 20k • 37
mnoukhov/summarize_from_feedback_tldr3_labelled_vllm_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19 • 9.5k • 35
ShenaoZhang/0.0001_3iters_bs256_nodpo_full6w_userresponse_dataset Viewer • Updated Apr 29 • 46.8k • 65
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo_costa_2.8b_bf16.yml_6e799_new Viewer • Updated May 5 • 20k • 32
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_3 Viewer • Updated May 6 • 4.9k • 32
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_2 Viewer • Updated May 6 • 4.9k • 37
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_3 Viewer • Updated May 6 • 5.19k • 33
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_1 Viewer • Updated May 6 • 5.18k • 32
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_2_mini_2 Viewer • Updated May 7 • 4.1k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_0 Viewer • Updated May 7 • 4.78k • 32
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_3 Viewer • Updated May 8 • 5.09k • 33
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_2 Viewer • Updated May 8 • 4.4k • 35
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_1 Viewer • Updated May 8 • 5k • 32
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_minpi_part_2 Viewer • Updated May 8 • 19.4k • 37
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_2 Viewer • Updated May 9 • 4.85k • 34
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_1 Viewer • Updated May 9 • 4.85k • 33
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_0 Viewer • Updated May 9 • 5.16k • 32
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_1 Viewer • Updated May 9 • 5.16k • 41
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr Viewer • Updated May 17 • 107k • 37
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr-step873 Viewer • Updated May 12 • 20k • 40
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_3 Viewer • Updated May 20 • 5k • 34
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_2 Viewer • Updated May 20 • 5k • 35
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_1 Viewer • Updated May 20 • 5.28k • 40
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v2_full-sft-finetuned-stage4-iter86000-v2 Viewer • Updated May 23 • 18.8k • 35
BahaaEldin0/openai_summarize_comparisons_dataset_with_prompts_2_percent Viewer • Updated May 30 • 4.69k • 57
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_2 Viewer • Updated Jun 17 • 5k • 33
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_1 Viewer • Updated Jun 17 • 5k • 35
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_3 Viewer • Updated Jun 17 • 5k • 37
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_2 Viewer • Updated Jun 17 • 5k • 32
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_0 Viewer • Updated Jun 17 • 5.28k • 34
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_2 Viewer • Updated Jun 17 • 5.28k • 33
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_0 Viewer • Updated Jun 18 • 5.28k • 35
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_1 Viewer • Updated Jun 18 • 5.28k • 37
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_2 Viewer • Updated Jun 18 • 5.28k • 33
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel2_llama8b Viewer • Updated Jun 19 • 92.1k • 36
giux78/ultrafeedback-binarized-preferences-cleaned-ita-ready Viewer • Updated Jan 18 • 60.9k • 52 • 2
NickyNicky/Colossal_Translation_Spanish_to_English_AND_English_to_Spanish_ORPO_DPO_Gemma Viewer • Updated May 6 • 3.4M • 99 • 3
arianhosseini/openai_summarize_comparisons_relabel_pythia1b_iter1_temp0.7 Viewer • Updated Dec 22, 2023 • 20k • 39
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1706885528 Viewer • Updated Feb 2 • 24 • 54
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706886961 Viewer • Updated Feb 2 • 24 • 40
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706887930 Viewer • Updated Feb 2 • 30 • 43
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706893611 Viewer • Updated Feb 2 • 84 • 48
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706896441 Viewer • Updated Feb 2 • 5 • 45
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330518 Viewer • Updated Feb 7 • 167 • 39
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330742 Viewer • Updated Feb 7 • 167 • 41
mnoukhov/openai_summarize_comparisons_tldprompt_relabel_pythia410m-dpo1 Viewer • Updated Feb 19 • 92.5k • 39
mnoukhov/openai_summarize_comparisons_tldrprompt_relabel1b_margin Viewer • Updated Feb 22 • 97.5k • 35
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo Viewer • Updated Feb 26 • 20k • 37
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo Viewer • Updated Feb 26 • 20k • 37
mnoukhov/openai_summarize_generated_20k_relabel_1b_predict_410m-dpo1 Viewer • Updated Feb 26 • 20k • 33
davidberenstein1957/ultrafeedback-binarized-cleaned-and-filtered-random-split Viewer • Updated Mar 14 • 6.69k • 52
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.1_self_70m Viewer • Updated Mar 14 • 37.9k • 40
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.1_self_160m Viewer • Updated Mar 14 • 37.9k • 34
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_1.0_seed_3 Viewer • Updated Mar 21 • 568k • 168
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_1 Viewer • Updated Mar 22 • 568k • 67
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_1 Viewer • Updated Mar 22 • 568k • 114
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_2 Viewer • Updated Mar 23 • 568k • 94
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_3 Viewer • Updated Mar 23 • 568k • 138
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 23 • 568k • 77
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 24 • 568k • 244
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_1.0_seed_1 Viewer • Updated Mar 24 • 189k • 45
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.1_seed_1 Viewer • Updated Mar 25 • 189k • 42
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_1.0_seed_1 Viewer • Updated Mar 25 • 189k • 39
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_1.0_seed_2 Viewer • Updated Mar 25 • 189k • 60
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_14m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 100
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_1.0 Viewer • Updated Apr 19 • 568k • 151
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 135
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 80
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 71
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 137
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 98
Mitsuki-Sakamoto/alfa-deberta-re-pref-64-fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 26 • 94.6k • 49
Mitsuki-Sakamoto/alfa-deberta-re-pref-64-fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 26 • 94.6k • 41
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.5 Viewer • Updated Mar 26 • 568k • 93
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.9 Viewer • Updated Mar 26 • 568k • 100
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.5 Viewer • Updated Mar 26 • 568k • 126
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.5 Viewer • Updated Mar 26 • 568k • 173
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 27 • 568k • 52
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 27 • 568k • 83
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 27 • 568k • 214
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.1 Viewer • Updated Mar 27 • 568k • 125
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.3 Viewer • Updated Mar 27 • 568k • 111
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.1 Viewer • Updated Mar 27 • 568k • 154
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.3 Viewer • Updated Mar 27 • 568k • 149
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.7 Viewer • Updated Mar 27 • 568k • 154
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.5 Viewer • Updated Mar 27 • 568k • 86
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.9 Viewer • Updated Mar 27 • 568k • 102
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_1.0_eval Viewer • Updated Mar 28 • 568k • 230
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_1_t_1.0_eval Viewer • Updated Mar 30 • 568k • 133
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_2_t_1.0_eval Viewer • Updated Mar 30 • 568k • 240
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_3_t_1.0_eval Viewer • Updated Mar 30 • 568k • 155
mnoukhov/summarize_from_feedback_tldr3_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b35a8 Viewer • Updated Apr 16 • 20k • 33
mnoukhov/summarize_from_feedback_tldr3_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 18 • 20k • 32
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19 • 107k • 38
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo_costa_2.8b_bf16.yml_6e799 Viewer • Updated Apr 22 • 107k • 33
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_0 Viewer • Updated Apr 24 • 10k • 35
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_1 Viewer • Updated Apr 24 • 10k • 33
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_2 Viewer • Updated Apr 24 • 10k • 32
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_3 Viewer • Updated Apr 24 • 10k • 32
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_4 Viewer • Updated Apr 24 • 10k • 33
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_5 Viewer • Updated Apr 24 • 10k • 31
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1 Viewer • Updated Apr 26 • 303k • 114
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3 Viewer • Updated Apr 26 • 303k • 90
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_4 Viewer • Updated Apr 26 • 303k • 62
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_5 Viewer • Updated Apr 26 • 303k • 124
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_4 Viewer • Updated Apr 26 • 303k • 97
GENIAC-Team-Ozaki/chatbot-arena-ja-calm2-7b-chat-experimental_deduped Viewer • Updated May 2 • 23.3k • 43
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_0 Viewer • Updated May 6 • 4.9k • 39
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_2 Viewer • Updated May 6 • 4.9k • 33
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_1 Viewer • Updated May 6 • 4.9k • 34
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_1 Viewer • Updated May 6 • 4.9k • 33
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_0 Viewer • Updated May 6 • 5.18k • 37
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_2 Viewer • Updated May 6 • 5.18k • 37
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_2 Viewer • Updated May 7 • 4.78k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_3 Viewer • Updated May 7 • 4.78k • 34
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_1 Viewer • Updated May 7 • 4.78k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_0 Viewer • Updated May 8 • 5.28k • 34
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_1 Viewer • Updated May 8 • 5.28k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_0 Viewer • Updated May 8 • 5.18k • 36
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_2 Viewer • Updated May 8 • 5.18k • 35
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_3 Viewer • Updated May 8 • 5.19k • 36
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_0 Viewer • Updated May 8 • 5k • 33
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2 Viewer • Updated May 9 • 19.4k • 37
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_2 Viewer • Updated May 9 • 4.98k • 35
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_3 Viewer • Updated May 9 • 5.09k • 36
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_1 Viewer • Updated May 9 • 5.28k • 37
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_0 Viewer • Updated May 9 • 5.28k • 38
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_minpi_part_3 Viewer • Updated May 9 • 20.6k • 32
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_3 Viewer • Updated May 9 • 5.16k • 35
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_2 Viewer • Updated May 9 • 5.16k • 39
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3 Viewer • Updated May 9 • 20.6k • 34
GENIAC-Team-Ozaki/chatbot-arena-ja-calm2-7b-chat-experimental_deduped_add_generated_text Viewer • Updated May 14 • 12k • 88
GENIAC-Team-Ozaki/chatbot-arena-ja-karakuri-lm-8x7b-chat-v0.1-awq Viewer • Updated May 17 • 12.5k • 42
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr_relabel_pythia1b Viewer • Updated May 17 • 107k • 38
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_1 Viewer • Updated May 20 • 5k • 32
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_3 Viewer • Updated May 20 • 5k • 34
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_3 Viewer • Updated May 20 • 5.29k • 33
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_2 Viewer • Updated May 20 • 5.28k • 34
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_0 Viewer • Updated May 20 • 5.28k • 31
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_1 Viewer • Updated May 20 • 5.28k • 32
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_3 Viewer • Updated May 20 • 5.29k • 35
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_2 Viewer • Updated May 20 • 5.28k • 33
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v3_full-sft-finetuned-stage4-iter86000-v3 Viewer • Updated May 24 • 19.3k • 34
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v4_full-sft-finetuned-stage4-iter86000-v4 Viewer • Updated May 25 • 19.5k • 44
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_3 Viewer • Updated Jun 17 • 5k • 33
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_1 Viewer • Updated Jun 17 • 5k • 33
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_llama8b Viewer • Updated Jun 19 • 176k • 35
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706888126 Viewer • Updated Feb 2 • 84 • 40
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__temp Viewer • Updated Feb 6 • 600k • 49
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.5_self_70m Viewer • Updated Mar 14 • 37.9k • 41
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 21 • 568k • 125
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 21 • 568k • 144
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 22 • 568k • 115
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2 Viewer • Updated Mar 22 • 568k • 99
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_2 Viewer • Updated Mar 22 • 568k • 141
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 24 • 568k • 138
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 25 • 189k • 57
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 25 • 189k • 39
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_1.0 Viewer • Updated Apr 19 • 568k • 231
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 164
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 70
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 129
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.75 Viewer • Updated Mar 26 • 568k • 131
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.3 Viewer • Updated Mar 27 • 568k • 140
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.5 Viewer • Updated Mar 27 • 568k • 121
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_1_t_1.0_eval Viewer • Updated Mar 30 • 568k • 311
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_2_t_1.0_eval Viewer • Updated Mar 30 • 568k • 116
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_3_t_1.0_eval Viewer • Updated Mar 30 • 568k • 478
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo_temp0.7 Viewer • Updated Apr 8 • 20k • 33
ShenaoZ/0.001_4iters_bs256_nodpo_only2third_userresponse_dataset Viewer • Updated Apr 26 • 12.2k • 40
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_3 Viewer • Updated May 6 • 4.9k • 38
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_0 Viewer • Updated May 20 • 5k • 33
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_1 Viewer • Updated Jun 17 • 5.28k • 33
mnoukhov/openai_summarize_generated_20k_relabel_pythia410m-dpo1_margin Viewer • Updated Feb 22 • 20k • 61
quirky-lats-at-mats/NORMAL_BACKDOOR_alpaca_sleeper_agents_toy_safety_v4 Viewer • Updated Mar 11 • 2.83k • 35
aengusl/noise5_alpaca_sleeper_agents_toy_safety_NOT_TRUNCATED_v4 Viewer • Updated Mar 11 • 2.83k • 34
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 80
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.75 Viewer • Updated Mar 26 • 568k • 129
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.25 Viewer • Updated Mar 26 • 568k • 106
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.1 Viewer • Updated Mar 27 • 568k • 66
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.7 Viewer • Updated Mar 27 • 568k • 109
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_1.0_eval Viewer • Updated Mar 28 • 568k • 141
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo_temp0.7_length128 Viewer • Updated Apr 14 • 20k • 37
mnoukhov/summarize_from_feedback_tldr3_labelled_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19 • 9.5k • 31
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo2_costa_1b_fp16.yml_bfcef Viewer • Updated Apr 21 • 107k • 36
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2 Viewer • Updated Apr 26 • 303k • 68
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_0 Viewer • Updated May 6 • 4.9k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_2 Viewer • Updated May 8 • 5.08k • 34
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_1 Viewer • Updated May 8 • 5.18k • 33
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_3 Viewer • Updated May 8 • 5k • 33
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr-step873_relabel_pythia1b Viewer • Updated May 13 • 20k • 40
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_0 Viewer • Updated May 20 • 5k • 32
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_full-sft-finetuned-stage4-iter86000 Viewer • Updated May 22 • 20.3k • 31
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.2_self_70m Viewer • Updated Mar 15 • 37.9k • 78
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.1_self_70m Viewer • Updated Mar 18 • 189k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.5_self_70m Viewer • Updated Mar 18 • 189k • 34
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.1_self_160m Updated Mar 21 • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.5_self_160m Updated Mar 18 • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.2_self_160m Viewer • Updated Mar 15 • 37.9k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.0_self_70m Viewer • Updated Mar 18 • 189k • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.0_self_160m Viewer • Updated Mar 18 • 189k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.5_self_70m Viewer • Updated Mar 19 • 189k • 53
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.1_self_70m Viewer • Updated Mar 19 • 189k • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.0_self_70m Viewer • Updated Mar 19 • 189k • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.1_self_160m Updated Mar 19 • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.5_self_160m Updated Mar 19 • 32
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.0_self_160m Updated Mar 19 • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.3_self_160m Updated Mar 21 • 37
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_1.0_self_160m Updated Mar 21 • 33
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_1.0 Viewer • Updated Apr 19 • 568k • 33
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_3_t_1.0_eval Viewer • Updated Mar 29 • 568k • 31
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_0 Viewer • Updated May 9 • 4.85k • 32
ContextualAI/ultrabin_clean_max_chosen_rand_rejected_rationalized Viewer • Updated Jun 12 • 60.9k • 36