ben burtenshaw

burtenshaw

AI & ML interests

None yet

Recent Activity

Reacted to their post with ❤️ about 7 hours ago

For anyone looking to boost their LLM fine-tuning and alignment skills this December: we're running a free and open course called smol course. It’s not big like Li Yin and @mlabonne, it’s just smol. 👷 It focuses on practical use cases, so if you’re working on something, bring it along. 👯‍♀️ It’s peer reviewed and open so you can discuss and get feedback. 🤘 If you’re already a smol pro, feel free to drop a star or issue. Part 1 starts now, and it’s on instruction tuning! https://github.com/huggingface/smol-course

posted an update about 7 hours ago

upvoted an article 3 days ago

Use Models from the Hugging Face Hub in LM Studio

Articles

Argilla 2.4: Easily Build Fine-Tuning and Evaluation datasets on the Hub — No Code Required

30 days ago

• 41

How to build a custom text classifier without days of human labeling

Oct 17

• 55

How to optimize your data labelling project with custom interfaces

Oct 16

• 18

⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2

Jun 3

• 26

⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together

Apr 29

• 29

Organizations

burtenshaw's activity

upvoted an article 3 days ago

Article

Use Models from the Hugging Face Hub in LM Studio

5 days ago

• 81

upvoted an article 5 days ago

Article

To what extent are we responsible for our content and how to create safer Spaces?

Aug 30

• 3

upvoted an article 12 days ago

Article

Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK

12 days ago

• 34

upvoted an article about 1 month ago

Article

How to optimize your data labelling project with custom interfaces

Oct 16

• 18

upvoted 3 articles about 2 months ago

Article

How to build a custom text classifier without days of human labeling

Oct 17

• 55

Article

Model2Vec: Distill a Small Fast Model from any Sentence Transformer

Oct 14

• 56

Article

Recoloring photos with diffusers

Oct 9

• 28

upvoted 2 papers 2 months ago

Making Text Embedders Few-Shot Learners

Paper • 2409.15700 • Published Sep 24 • 29

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 49

upvoted an article 3 months ago

Article

Selective fine-tuning of Language Models with Spectrum

Sep 3

• 29

upvoted a paper 3 months ago

The Future of Open Human Feedback

Paper • 2408.16961 • Published Aug 15 • 20

upvoted a collection 3 months ago

Minitron

Collection

A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated Oct 3 • 59

upvoted a paper 3 months ago

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 253

upvoted a paper 4 months ago

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16 • 97

upvoted 2 articles 4 months ago

Article

Welcome FalconMamba: The first strong attention-free 7B model

Aug 12

• 103

Article

🔥 Argilla 2.0: the data-centric tool for AI makers 🤗

Jul 30

• 37

upvoted an article 5 months ago

Article

How we leveraged distilabel to create an Argilla 2.0 Chatbot

Jul 16

• 32

upvoted a paper 5 months ago

Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28 • 95

upvoted a paper 6 months ago

Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning

Paper • 2406.09170 • Published Jun 13 • 24

upvoted an article 6 months ago

Article

🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets

Jun 4

• 73

Articles

Let’s make a generation of amazing image generation models

Zero to Hero with the TRL learning link bomb 💣

Low Code Large Language Model Alignment
