Daniel van Strien PRO

davanstrien

AI & ML interests

Machine Learning Librarian

davanstrien's activity

upvoted an article 3 days ago

Releasing the largest multilingual open pretraining dataset

85
upvoted an article 7 days ago
upvoted 7 articles 25 days ago

Releasing Outlines-core 0.1.0: structured generation in Rust and Python

41

ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models

15

OCR Processing and Text in Image Analysis with DeepSeek Janus-1.3B

2

OCR Processing and Text in Image Analysis with Florence-2-base and Qwen2-VL-2B

13

🇮🇹🇯🇵🇧🇷 Generating multilingual instruction datasets with Magpie 🐦‍⬛

By anakin87
18
upvoted an article about 1 month ago

How to build a custom text classifier without days of human labeling

By sdiazlor
55
upvoted 3 articles about 1 month ago

Improving Parquet Dedupe on Hugging Face Hub

29

Faster Assisted Generation with Dynamic Speculation

31

Scaling AI-based Data Processing with Hugging Face + Dask

23