Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12 • 45
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 197
Evaluating Open Language Models Across Task Types, Application Domains, and Reasoning Types: An In-Depth Experimental Analysis Paper • 2406.11402 • Published Jun 17 • 6
view article Article Extracting Concepts from LLMs: Anthropic’s recent discoveries 📖 By m-ric • Jun 20 • 26
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training Paper • 2405.15319 • Published May 24 • 25
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints Paper • 2212.05055 • Published Dec 9, 2022 • 5