Nathan Lambert

natolambert

https://www.natolambert.com/

AI & ML interests

Reinforcement learning, Ethics, Robotics, Dynamics Models

Recent Activity

liked a model about 10 hours ago

allenai/Llama-3.1-Tulu-3-70B-broken

updated a dataset about 12 hours ago

ai2-adapt-dev/tulu_v3.9_wildchat_100k_english

liked a model about 14 hours ago

lmstudio-community/Llama-3.1-Tulu-3-8B-GGUF

Articles

Ethics and Society Newsletter #4: Bias in Text-to-Image Models

Can foundation models label data like humans?

Creating a Coding Assistant with StarCoder

StackLLaMA: A hands-on guide to train LLaMA with RLHF

Red-Teaming Large Language Models

What Makes a Dialog Agent Useful?

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Stable Diffusion with 🧨 Diffusers

natolambert's activity

upvoted a collection 3 days ago

Tulu 3 Datasets

All datasets released with Tulu 3 -- state-of-the-art open post-training recipes. • 32 items • Updated 1 day ago • 17

upvoted a collection about 2 months ago

Molmo

Artifacts for open multimodal language models. • 5 items • Updated 8 days ago • 273

upvoted a collection 2 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Sep 18 • 379

upvoted 3 collections 3 months ago

Skywork-Reward-Data-Collection

Open-source preference datasets used to train the Skywork reward model series • 17 items • Updated Oct 12 • 9

OLMoE

Artifacts for open mixture-of-experts language models. • 13 items • Updated 8 days ago • 25

Hermes 3

The Hermes 3 Series of Models • 8 items • Updated Aug 23 • 91

upvoted a collection 4 months ago

Aligned Diffusion Model via DPO

18 items • Updated Jul 8 • 3

upvoted a collection 5 months ago

Tulu V2.5 Suite

A suite of models trained using DPO and PPO across a wide variety (up to 14) of preference datasets. See https://arxiv.org/abs/2406.09279 for more! • 44 items • Updated 8 days ago • 14

upvoted a collection 6 months ago

SciRIFF

Data and models to enhance instruction-following for scientific literature understanding. • 9 items • Updated 8 days ago • 8

upvoted a collection 7 months ago

[lecture artifacts] aligning open language models

artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17 • 56

upvoted a collection 8 months ago

Reward Bench

Datasets, spaces, and models for the reward model benchmark! • 5 items • Updated 8 days ago • 7

upvoted a collection 9 months ago

Gemma release

Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325

upvoted 2 collections 10 months ago

Model Merging

Model merging is currently a very popular technique for LLMs. Here is a chronological list of papers in the space to help you get started with it! • 30 items • Updated Jun 12 • 217

OLMo Suite

Artifacts for the first set of OLMo models. • 18 items • Updated 8 days ago • 66

upvoted a collection 11 months ago

Deita

14 items • Updated May 20 • 9

upvoted 4 collections 12 months ago

Awesome feedback datasets

A curated list of datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO. • 19 items • Updated Apr 12 • 65

Awesome SFT datasets

A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 119

MoE

137 items • Updated Jul 9 • 19

Journal Club

Candidate papers to read in the H4 journal club • 54 items • Updated Apr 21 • 26

upvoted a collection about 1 year ago

Tulu V2 Suite

The set of models associated with the paper "Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2" • 19 items • Updated 8 days ago • 42