Transformer Explainer: Interactive Learning of Text-Generative Models Paper • 2408.04619 • Published Aug 8 • 152
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12 • 62
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 174
view article Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 107
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 168
view article Article BM25 for Python: Achieving high performance while simplifying dependencies with *BM25S*⚡ By xhluca • Jul 9 • 35
view article Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 161
InternVL 1.0 Collection Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks • 16 items • Updated Jun 27 • 15
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 13 days ago • 470
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17 • 56
PDF Document / OCR Datasets Collection Document datasets with .pdf files that are usable with pixparse libraries and tools. • 2 items • Updated Mar 30 • 47
LLaVa-NeXT Collection LLaVa-NeXT (also known as LLaVa-1.6) improves upon the 1.5 series by incorporating higher image resolutions and more reasoning/OCR datasets. • 8 items • Updated Jul 19 • 25
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 103
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 64