David Berenstein

davidberenstein1957

AI & ML interests

Everything NLP and knowledge graphs

Recent Activity

upvoted a collection about 19 hours ago

Dataset Creation

upvoted a collection about 19 hours ago

Dataset Exploration

upvoted a collection about 19 hours ago

Synthetic Dataset Creation

Articles

Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK

How to build a custom text classifier without days of human labeling

How to optimize your data labelling project with custom interfaces

To what extent are we responsible for our content and how to create safer Spaces?

Data Is Better Together: A Look Back and Forward

davidberenstein1957's activity

upvoted 4 collections about 19 hours ago

Dataset Creation

Spaces and utilities for creating datasets and getting them on the Hub • 3 items • Updated 13 days ago • 10

Dataset Exploration

3 items • Updated 13 days ago • 3

Synthetic Dataset Creation

Spaces focused on generating synthetic datasets • 5 items • Updated 13 days ago • 7

Dataset transformation, preparation and edition

2 items • Updated about 20 hours ago • 3

upvoted a collection about 23 hours ago

Models for dataset curation

7 items • Updated about 23 hours ago • 14

upvoted an article 2 days ago

Article

Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK

2 days ago

• 17

upvoted a collection 3 days ago

Flow-Judge-v0.1

Flow-Judge-v0.1 models • 5 items • Updated Sep 17 • 19

upvoted a collection 4 days ago

Marqo-Ecommerce-Embeddings

State-of-the-art embedding models fine-tuned for the ecommerce domain. +67% increase in evaluation metrics vs ViT-B-16-SigLIP. • 10 items • Updated 9 days ago • 16

upvoted an article 4 days ago

Article

Low Code Large Language Model Alignment

4 days ago

• 13

upvoted 2 articles about 1 month ago

Article

How to build a custom text classifier without days of human labeling

Oct 17

• 55

Article

How to optimize your data labelling project with custom interfaces

Oct 16

• 18

upvoted a collection 2 months ago

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 218

upvoted 3 articles 3 months ago

Article

Fine-tuning a token classification model for legal data using Argilla and AutoTrain

Sep 7

• 14

Article

The 5 Most Under-Rated Tools on Hugging Face

Aug 22

• 85

Article

Introducing AuraFace: Open-Source Face Recognition and Identity Preservation Models

Aug 26

• 36

upvoted a collection 3 months ago

Gradio Annotators

It's not for team collaboration, nor trying to be all fancy and formal - just a bunch of cool tools to help you move to a more serious stage. • 14 items • Updated Aug 22 • 3

upvoted an article 3 months ago

Article

dstack: Your LLM Launchpad - From Fine-Tuning to Serving, Simplified

Aug 22

• 12

upvoted 3 collections 3 months ago

Dataset Viber annotators

17 items • Updated Aug 21 • 1

Hermes 3

The Hermes 3 Series of Models • 8 items • Updated Aug 23 • 91

Probably DPO datasets

A collection of datasets that probably support DPO • 146 items • Updated Jun 26 • 12