Thomas Wolf's picture

Thomas Wolf PRO

thomwolf

·

https://thomwolf.io

AI & ML interests

NLP and open-source :-)

Articles

FineVideo: behind the scenes

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

A failed experiment: Infini-Attention, and why we should keep trying?

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Constitutional AI with Open LLMs

Open LLM Leaderboard: DROP deep dive

What's going on with the Open LLM Leaderboard?

Can foundation models label data like humans?

Organizations

thomwolf's activity

upvoted a paper 9 days ago

BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays

Paper • 2410.21969 • Published 19 days ago • 8

upvoted a paper 13 days ago

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Paper • 2411.02265 • Published 13 days ago • 23

upvoted an article 23 days ago

Article

Breaking Barriers: The Critical Role of Art and Design in Advancing AI Capabilities

By

•

Jan 15

• 2

upvoted a collection about 1 month ago

LoLCATS

Linearizing LLMs with high quality and efficiency. We linearize the full Llama 3.1 model family -- 8b, 70b, 405b -- for the first time! • 4 items • Updated Oct 14 • 14

upvoted 2 papers about 1 month ago

DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

Paper • 2409.07703 • Published Sep 12 • 66

EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24 • 24

upvoted 2 papers 2 months ago

Diffusion Policy Policy Optimization

Paper • 2409.00588 • Published Sep 1 • 19

Affordance-based Robot Manipulation with Flow Matching

Paper • 2409.01083 • Published Sep 2 • 18

upvoted an article 3 months ago

Article

The 5 Most Under-Rated Tools on Hugging Face

Aug 22

• 85

upvoted 2 collections 3 months ago

InternLM2.5

14 items • Updated Sep 14 • 70

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 197

upvoted 2 articles 4 months ago

Article

Announcing BigCodeBench-Hard, and More

By

•

Jul 24

• 10

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 263

upvoted a collection 4 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 618

upvoted a paper 4 months ago

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3 • 92

upvoted 3 papers 5 months ago

OLMES: A Standard for Language Model Evaluations

Paper • 2406.08446 • Published Jun 12 • 2

DataComp-LM: In search of the next generation of training sets for language models

Paper • 2406.11794 • Published Jun 17 • 48

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25 • 86

upvoted an article 5 months ago

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24

• 177

upvoted a paper 5 months ago

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

Paper • 2406.11271 • Published Jun 17 • 19