jiaqiz

AI & ML interests

None yet

jiaqiz's activity

updated a collection about 2 months ago

Minitron

A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated Oct 3 • 59

updated a collection 5 months ago

SSMs

A collection of Mamba-2-based research models with 8B parameters trained on 3.5T tokens for comparison with Transformers. • 5 items • Updated Oct 1 • 26

New activity in nvidia/Nemotron-4-340B-Base 5 months ago

missing weight file?

#5 opened 5 months ago by

updated a model 5 months ago

nvidia/Nemotron-4-340B-Base

Updated Jun 28 • 297 • 143

updated 2 collections 5 months ago

RLHF

A collection of models trained with Reinforcement Learning from Human Feedback (RLHF). • 4 items • Updated Oct 1 • 4

SteerLM

A collection of models and datasets relating to SteerLM and HelpSteer. • 7 items • Updated Oct 1 • 14

updated a dataset 5 months ago

nvidia/Daring-Anteater

Viewer • Updated Jun 17 • 99.5k • 309 • 19

authored a paper 5 months ago

HelpSteer2: Open-source dataset for training top-performing reward models

Paper • 2406.08673 • Published Jun 12 • 16

liked 3 models 5 months ago

nvidia/Nemotron-4-340B-Base

Updated Jun 28 • 297 • 143

nvidia/Nemotron-4-340B-Reward

Updated Jun 19 • 216 • 111

nvidia/Nemotron-4-340B-Instruct

Updated Jun 24 • 1.49k • 653

liked a dataset 5 months ago

nvidia/HelpSteer2

Viewer • Updated Oct 15 • 21.4k • 12.7k • 374

New activity in nvidia/Nemotron-4-340B-Base 5 months ago

Update README.md

#2 opened 5 months ago by

Update README.md

#1 opened 5 months ago by

authored 3 papers 7 months ago

NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment

Paper • 2405.01481 • Published May 2 • 25

HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM

Paper • 2311.09528 • Published Nov 16, 2023 • 2

MedDialog: Two Large-scale Medical Dialogue Datasets

Paper • 2004.03329 • Published Apr 7, 2020

updated a dataset 9 months ago

nvidia/sft_datablend_v1

Viewer • Updated Mar 9 • 128k • 90 • 13

updated a collection 9 months ago

RLHF

A collection of models trained with Reinforcement Learning from Human Feedback (RLHF). • 4 items • Updated Oct 1 • 4