Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated Oct 3 • 59
SSMs Collection A collection of Mamba-2-based research models with 8B parameters trained on 3.5T tokens for comparison with Transformers. • 5 items • Updated Oct 1 • 26
RLHF Collection A collection of models trained with Reinforcement Learning from Human Feedback (RLHF). • 4 items • Updated Oct 1 • 4
SteerLM Collection A collection of models and datasets relating to SteerLM and HelpSteer. • 7 items • Updated Oct 1 • 14
HelpSteer2: Open-source dataset for training top-performing reward models Paper • 2406.08673 • Published Jun 12 • 16
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Paper • 2405.01481 • Published May 2 • 25
HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM Paper • 2311.09528 • Published Nov 16, 2023 • 2
RLHF Collection A collection of models trained with Reinforcement Learning from Human Feedback (RLHF). • 4 items • Updated Oct 1 • 4