Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated 11 days ago • 192
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17 • 51
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 174
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks Paper • 2312.14238 • Published Dec 21, 2023 • 14
view article Article From cloud to developers: Hugging Face and Microsoft Deepen Collaboration May 21 • 8
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 88
view article Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 161
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17 • 56
view article Article Text2SQL using Hugging Face Dataset Viewer API and Motherduck DuckDB-NSQL-7B Apr 4 • 23
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 12 days ago • 206
All open-source models available on Workers AI Collection Read Developer Docs to get started https://developers.cloudflare.com/workers-ai/models/ • 38 items • Updated Apr 2 • 5
HF-curated models available on Workers AI Collection A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2 • 50
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 90
Sora Reference Papers Collection A collection of all papers referenced in OpenAI's "Video generation models as world simulators" technical report • openai.com/sora • 30 items • Updated Feb 20 • 51
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 67 items • Updated Aug 6 • 84
⛔️🔦 Provenance, Watermarking & Deepfake Detection Collection Technical tools for more control over non-consensual synthetic content • 14 items • Updated Apr 1 • 38
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 592
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 Paper • 2312.16171 • Published Dec 26, 2023 • 34
Specialized Language Models with Cheap Inference from Limited Domain Data Paper • 2402.01093 • Published Feb 2 • 45
Lumiere: A Space-Time Diffusion Model for Video Generation Paper • 2401.12945 • Published Jan 23 • 86
Canonical models Collection This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace • 68 items • Updated Feb 13 • 13
Scalable Pre-training of Large Autoregressive Image Models Paper • 2401.08541 • Published Jan 16 • 35
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 178
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 257
Gemini: A Family of Highly Capable Multimodal Models Paper • 2312.11805 • Published Dec 19, 2023 • 45
Power Hungry Processing: Watts Driving the Cost of AI Deployment? Paper • 2311.16863 • Published Nov 28, 2023 • 6
Seamless Communication Collection A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 144
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 70
GSAP-NER: A Novel Task, Corpus, and Baseline for Scholarly Entity Extraction Focused on Machine Learning Models and Datasets Paper • 2311.09860 • Published Nov 16, 2023 • 5
Nemotron 3 8B Collection The Nemotron 3 8B Family of models is optimized for building production-ready generative AI applications for the enterprise. • 5 items • Updated 3 days ago • 43
read papers Collection This is a collection of some papers I've read in the past few months • 10 items • Updated Nov 21, 2023 • 47
3D Gaussian Splatting for Real-Time Radiance Field Rendering Paper • 2308.04079 • Published Aug 8, 2023 • 167
Robust Speech Recognition via Large-Scale Weak Supervision Paper • 2212.04356 • Published Dec 6, 2022 • 17
LLM as a Judge Collection Curated resources that support the use of LLMs to serve as automatic evaluators of other LLM outputs. • 16 items • Updated 12 days ago • 20
Latent Consistency Model Demos Collection Latent Consistency Models for Stable Diffusion • 8 items • Updated Nov 12, 2023 • 25
Latent Consistency Models LoRAs Collection Latent Consistency Models for Stable Diffusion - LoRAs and full fine-tuned weights • 4 items • Updated Nov 10, 2023 • 98
Reward models on the hub Collection UNMAINTAINED: See RewardBench... A place to collect reward models, an often not released artifact of RLHF. • 18 items • Updated Apr 13 • 24
SEAHORSE release Collection The SEAHORSE metrics (as described in https://arxiv.org/abs/2305.13194). • 12 items • Updated Jul 31 • 17
zephyr story Collection sources mentioned by hf.co/thomwolf tweet: x.com/Thom_Wolf/status/1720503998518640703 • 8 items • Updated Jan 24 • 15
OpenChat Collection OpenChat: Advancing Open-source Language Models with Mixed-Quality Data • 7 items • Updated Jul 31 • 33
Papers to read - General Collection Papers I want to read, at some point. • 8 items • Updated Apr 9 • 4
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 144