GLiClass Collection Generalist and Light-weighted Models for Zero-shot Text Classification • 13 items • Updated Sep 17 • 10
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 13 days ago • 441
Aya Datasets Collection The Aya Collection is a massive multilingual collection for over 100 languages consisting of 513 million instances of prompts and completions. • 5 items • Updated Jun 28 • 14
The first neural machine translation system for the Erzya language Paper • 2209.09368 • Published Sep 19, 2022 • 1
Seamless: Multilingual Expressive and Streaming Speech Translation Paper • 2312.05187 • Published Dec 8, 2023 • 13
WebInstruct 🌐 Embeddings 🧱 Models Collection A collection of SoTA embeddings model fine-tuned on WebInstruct dataset to learn to pair instructions with its responses • 3 items • Updated Sep 4 • 11
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer Paper • 2404.04042 • Published Apr 5 • 1
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16 • 97
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12 • 65
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20 • 85
GoLLIE Collection We present GoLLIE, a Large Language Model trained to follow annotation guidelines that outperforms previous approaches on zero-shot IE. • 4 items • Updated Mar 11 • 18
GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer Paper • 2311.08526 • Published Nov 14, 2023 • 9