Thinking LLMs: General Instruction Following with Thought Generation Paper • 2410.10630 • Published Oct 14 • 16
Qwen 2.5 Coder Collection Complete collection of Code-specific model series for Qwen2.5 in bnb 4bit, 16bit and GGUF formats. • 35 items • Updated 3 days ago • 17
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated 6 days ago • 229
AMD-OLMo Collection AMD-OLMo are a series of 1 billion parameter language models trained by AMD on AMD Instinct™ MI250 GPUs based on OLMo. • 4 items • Updated 23 days ago • 17
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 10 items • Updated 3 days ago • 176
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 17 days ago • 95
🚀GGUF Collection Llama.cpp compatible models, can be used on CPUs and GPUs! • 871 items • Updated about 5 hours ago • 35
view article Article How to build a custom text classifier without days of human labeling By sdiazlor • Oct 17 • 55
Falcon Mamba: The First Competitive Attention-free 7B Language Model Paper • 2410.05355 • Published Oct 7 • 29
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs Paper • 2410.05295 • Published Oct 3 • 12
MagpieLM Collection Aligning LMs with Fully Open Recipe (data+training configs+logs) • 9 items • Updated Sep 22 • 15
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29 • 52