Nguyễn Minh Phúc
DatPySci
AI & ML interests
Reinforcement learning, NLP
Organizations
Collections
1
models
83
DatPySci/EleutherAI_pythia-2.8b-deduped__dpo_pythia-2.8b_beta-0.01__tldr
DatPySci/EleutherAI_pythia-2.8b-deduped__length_IS_pythia-2.8b_beta-0.01__tldr
DatPySci/EleutherAI_pythia-410m-deduped__ipo_ipo_pythia-1b_beta-0.03__tldr
DatPySci/EleutherAI_pythia-410m-deduped__length_IS_ipo_pythia-1b_beta-0.03__tldr
DatPySci/EleutherAI_pythia-410m-deduped__ipo_ipo_pythia-1b_beta-0.02__tldr
DatPySci/EleutherAI_pythia-410m-deduped__length_IS_ipo_pythia-1b_beta-0.02__tldr
DatPySci/EleutherAI_pythia-410m-deduped__ipo_ipo_pythia-1b_beta-0.01__tldr
DatPySci/EleutherAI_pythia-410m-deduped__length_IS_ipo_pythia-1b_beta-0.01__tldr
DatPySci/EleutherAI_pythia-1b-deduped__length_IS_ipo_pythia-1b_beta-0.02__tldr
DatPySci/model_step_72000_tldr_summarization
Text Generation
7
datasets
8
DatPySci/HH-RLHF-preprocessed
119k
31
DatPySci/tldr_preference_dataset
179k
82
DatPySci/tldr_sft_dataset
130k
122
DatPySci/policy_shift_dataset
150k
2
DatPySci/shift_dataset
156k
3
DatPySci/summarize_from_feedback_oai_preprocessing
179k
1
DatPySci/anthropic_hh_rlhf_filtered_oai_preprocessing
169k
2
DatPySci/summarize_from_feedback_oai_preprocessing_pythia-6.9b-gold
115k
1.19k