MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 6 items • Updated 6 days ago • 85
Probably function calling datasets Collection Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17 • 36
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 177
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 124
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 602
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 124
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Sep 18 • 206
Rethinking Optimization and Architecture for Tiny Language Models Paper • 2402.02791 • Published Feb 5 • 12
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation Paper • 2312.17276 • Published Dec 27, 2023 • 15
MobileVLM : A Fast, Reproducible and Strong Vision Language Assistant for Mobile Devices Paper • 2312.16886 • Published Dec 28, 2023 • 19
Handbook v0.1 models and datasets Collection Models and datasets for v0.1 of the alignment handbook • 6 items • Updated Nov 10, 2023 • 24
Whisper Release Collection Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 38M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 85
Distil-Whisper Models Collection The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 35
Text to Music 🎧 Collection A collection of music generation models supported in 🤗 Transformers and 🧨 Diffusers • 5 items • Updated Sep 16, 2023 • 2