Nguyễn Minh Phúc
DatPySci
AI & ML interests
Reinforcement learning, NLP
Collections 1
models 89
DatPySci/llama3-1b_reward_tldr • Text Classification • 71
DatPySci/EleutherAI_pythia-2.8b-deduped__ipo_pythia-2.8b_beta-0.1__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__dpo_pythia-2.8b_beta-0.05__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__length_IS_pythia-2.8b_beta-0.05__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__length_IS_pythia-2.8b_beta-0.1__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__dpo_pythia-2.8b_beta-0.1__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__dpo_pythia-2.8b_beta-0.01__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__length_IS_pythia-2.8b_beta-0.01__tldr
DatPySci/EleutherAI_pythia-410m-deduped__ipo_ipo_pythia-1b_beta-0.03__tldr
DatPySci/EleutherAI_pythia-410m-deduped__length_IS_ipo_pythia-1b_beta-0.03__tldr
datasets 12
DatPySci/llama3-1b_synthetic_tldr • 115k • 14
DatPySci/gpt2-large_synthetic_tldr • 115k • 7
DatPySci/gpt2-medium_synthetic_tldr • 115k • 7
DatPySci/gpt2_synthetic_tldr • 115k • 15
DatPySci/HH-RLHF-preprocessed • 119k • 43
DatPySci/tldr_preference_dataset • 179k • 39
DatPySci/tldr_sft_dataset • 130k • 240
DatPySci/policy_shift_dataset • 150k • 31
DatPySci/shift_dataset • 156k • 41
DatPySci/summarize_from_feedback_oai_preprocessing • 179k • 33