RLHFlow's Collections
SFT Models
Updated 14 days ago
We train a series of SFT models on RLHFlow's high-quality SFT dataset for research purposes.
RLHFlow/LLaMA3-SFT • Text Generation • Updated May 23 • 5.01k • 7
sfairXC/gemma-sft-1ep • Text Generation • Updated Aug 30 • 243
sfairXC/gemma-sft-2ep • Text Generation • Updated Aug 30 • 2
sfairXC/llama-3.1-sft-1ep • Text Generation • Updated 16 days ago • 6
sfairXC/llama-3.1-sft-2ep • Text Generation • Updated 16 days ago • 2