Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis Paper • 2410.23320 • Published 22 days ago • 6
Article Transformers.js v3: WebGPU support, new models & tasks, and more… about 1 month ago • 62
Llama3-8B-1.58 Collection A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated Sep 14 • 12
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 19 days ago • 158
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated Jul 3 • 76
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • Apr 24 • 59
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325
Canonical models Collection This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace • 68 items • Updated Feb 13 • 13
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated 3 days ago • 37
Switch-Transformers release Collection This release included various MoE (Mixture of Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Jul 31 • 15
zephyr story Collection sources mentioned by hf.co/thomwolf tweet: x.com/Thom_Wolf/status/1720503998518640703 • 8 items • Updated Jan 24 • 15
Distil-Whisper Models Collection The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 36