arxiv:2405.07863
Wei Xiong
weqweasdas
AI & ML interests
Machine learning, RLHF
Recent Activity
liked a dataset about 9 hours ago: RLHFlow/RLHFlow-SFT-Dataset-ver2
updated a dataset 9 days ago: weqweasdas/ep1_2
updated a dataset 9 days ago: weqweasdas/ep1_6
Models (23)
weqweasdas/zephyr-7b-dpo-full • Text Generation • 13
weqweasdas/zephyr-7b-gemma-dpo
weqweasdas/zephyr-7b-sft-full
weqweasdas/zephyr-7b-dpo-qlora
weqweasdas/gpt2-cpt-dutch • Text Generation • 17
weqweasdas/zephyr-7b-gemma-sft
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6_weight085 • Text Generation • 1
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6 • Text Generation • 1
weqweasdas/raft_baseline_zephyr_packing_model6 • Text Generation • 3
weqweasdas/raft_baseline_openchat_llama13b_model1 • Text Generation • 1
Datasets (74)
weqweasdas/ep1_2 • 171k • 25
weqweasdas/ep1_6 • 709k • 52
weqweasdas/ep1_5 • 168k • 45
weqweasdas/ep1_4 • 167k • 43
weqweasdas/ep1_3 • 168k • 38
weqweasdas/ep1_1 • 200k • 20
weqweasdas/meta_math • 395k • 44
weqweasdas/DS-MATH • 500 • 18
weqweasdas/MS-MATH • 500 • 18
weqweasdas/hn_mistral_prm_pairwise_only_step2 • 138k • 27