Lysandre

lysandre

http://lysand.re

AI & ML interests

chief open-source officer @ hf

Recent Activity

reacted to ArthurZ's post with 🔥 2 days ago

updated a dataset 3 days ago

huggingface/transformers-metadata

Articles

Fixing Gradient Accumulation

License to Call: Introducing Transformers Agents 2.0

We are hiring interns!

Hugging Face on PyTorch / XLA TPUs

Organizations

lysandre's activity

upvoted a paper 17 days ago

Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis

Paper • 2410.23320 • Published 22 days ago • 6

upvoted an article 30 days ago

Article

Transformers.js v3: WebGPU support, new models & tasks, and more…

about 1 month ago

• 62

upvoted an article about 1 month ago

Article

Tool Use, Unified

Aug 12

• 64

upvoted a collection 2 months ago

Llama3-8B-1.58

A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated Sep 14 • 12

upvoted an article 2 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18

• 202

upvoted 3 articles 3 months ago

Article

Don't repeat yourself - 🤗 Transformers Design Philosophy

Apr 5, 2022

• 12

Article

MobileNet Baselines

Jul 26

• 23

Article

MobileNet-V4 (now in timm)

Jun 17

• 39

upvoted an article 4 months ago

Article

WWDC 24: Running Mistral 7B with Core ML

Jul 22

• 55

upvoted a collection 5 months ago

Nemotron 4 340B

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 19 days ago • 158

upvoted a collection 6 months ago

Embedding Model Datasets

A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated Jul 3 • 76

upvoted an article 6 months ago

Article

License to Call: Introducing Transformers Agents 2.0

May 13

• 116

upvoted an article 7 months ago

Article

LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)

Apr 24

• 59

upvoted a collection 8 months ago

Gemma release

Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325

upvoted a collection 9 months ago

Canonical models

This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace • 68 items • Updated Feb 13 • 13

upvoted a collection 10 months ago

SigLIP

Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated 3 days ago • 37

upvoted a paper 12 months ago

Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 118

upvoted a collection 12 months ago

Switch-Transformers release

This release included various MoE (Mixture of Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Jul 31 • 15

upvoted 2 collections about 1 year ago

zephyr story

Sources mentioned in hf.co/thomwolf's tweet: x.com/Thom_Wolf/status/1720503998518640703 • 8 items • Updated Jan 24 • 15

Distil-Whisper Models

The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 36