RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published 3 days ago • 36
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering Paper • 2410.15999 • Published Oct 21 • 19
Article MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR By abhinand • Oct 20 • 31
Article Model2Vec: Distill a Small Fast Model from any Sentence Transformer By Pringled • Oct 14 • 55
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct Paper • 2409.05840 • Published Sep 9 • 45
Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention • Aug 21 • 22
Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning Paper • 2408.00690 • Published Aug 1 • 22
ShieldGemma Release Collection A series of safety classifiers, trained on top of Gemma 2, for developers to filter inputs and outputs of their applications. • 3 items • Updated Jul 31 • 11
Compact Language Models via Pruning and Knowledge Distillation Paper • 2407.14679 • Published Jul 19 • 38
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated Oct 3 • 59
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities Paper • 2407.14482 • Published Jul 19 • 25
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21 • 63