Rui Yang

Ray2333

https://yangrui2015.github.io

YangRui2015

AI & ML interests

Deep Reinforcement Learning

Ray2333's activity

updated a model about 21 hours ago

Ray2333/GRM_Llama3.1_8B_rewardmodel-ft

Text Classification • Updated about 21 hours ago • 37

updated a collection 3 days ago

GRM

Generalizable Reward Models • 11 items • Updated 3 days ago • 3

upvoted 2 collections 10 days ago

Papers - Math - Reasoning

11 items • Updated 18 days ago • 1

Papers - Benchmarks - Math

4 items • Updated 23 days ago • 1

New activity in Ray2333/GRM-Llama3.2-3B-rewardmodel-ft 13 days ago

Model Size

#1 opened 13 days ago by

updated a model 19 days ago

Ray2333/GRM-Llama3.2-3B-rewardmodel-ft

Text Classification • Updated 19 days ago • 2.2k • 2

authored a paper 23 days ago

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published about 1 month ago • 15

commented a paper 24 days ago

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published about 1 month ago • 15

updated a dataset 24 days ago

DynaMath/DynaMath_Sample

Viewer • Updated 24 days ago • 5.01k • 460 • 6

upvoted a paper 24 days ago

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published about 1 month ago • 15

updated a Space 29 days ago

README

liked a dataset about 1 month ago

DynaMath/DynaMath_Sample

Viewer • Updated 24 days ago • 5.01k • 460 • 6

liked a Space about 1 month ago

Preference Proxy Evaluations

New activity in Ray2333/GRM-llama3-8B-sftreg about 1 month ago

Adding `safetensors` variant of this model

#3 opened about 1 month ago by

updated a collection about 2 months ago

GRM

Generalizable Reward Models • 11 items • Updated 3 days ago • 3

liked a model 2 months ago

Ray2333/GRM-Llama3-8B-rewardmodel-ft

Updated Sep 17 • 588 • 1

updated a model 2 months ago

Ray2333/GRM-Llama3-8B-rewardmodel-ft

Updated Sep 17 • 588 • 1

liked a model 3 months ago

Ray2333/Gemma-2B-rewardmodel-ft

Updated Sep 13 • 458 • 1

updated a model 3 months ago

Ray2333/Gemma-2B-rewardmodel-ft

Updated Sep 13 • 458 • 1