Paper: "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" • arXiv:2410.10814 • Published Oct 14 • 48 upvotes
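The paper's observation is that the expert-routing weights an MoE LLM computes anyway can double as sentence embeddings. A minimal sketch of that idea using the Transformers Mixtral API (`output_router_logits=True`); the mean-pooling scheme here is a simplification, not the paper's exact MoEE recipe (which also combines routing weights with hidden-state embeddings), and the checkpoint name is just an example:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pool an MoE model's per-token expert routing logits into one vector per
# input. Simplified pooling, not the paper's exact MoEE recipe.
model_name = "mistralai/Mixtral-8x7B-v0.1"  # example MoE checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

@torch.no_grad()
def routing_embedding(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model(**inputs, output_router_logits=True)
    # out.router_logits: one (num_tokens, num_experts) tensor per MoE layer.
    per_layer = [logits.softmax(dim=-1).mean(dim=0) for logits in out.router_logits]
    return torch.cat(per_layer)  # (num_layers * num_experts,) embedding vector
```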
Collection: "Datasets for Pretrained Thai LLM" by PyThaiNLP (a list of datasets for pretraining Thai LLMs) • 23 items • Updated Sep 12 • 9 upvotes
Collection: "Llama 3.2" (Transformers-format and original repos of the Llama 3.2 and Llama Guard 3 models) • 15 items • Updated 30 days ago • 488 upvotes
Article: "Illustrated LLM OS: An Implementational Perspective" by shivance • Dec 3, 2023 • 15 upvotes
Article: "Rank-Stabilized LoRA: Unlocking the Potential of LoRA Fine-Tuning" by damjan-k • Feb 20 • 16 upvotes
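Rank-stabilized LoRA replaces LoRA's alpha/r adapter scaling with alpha/sqrt(r), which keeps updates well-scaled as the rank grows. In PEFT this is a one-flag change; a minimal sketch, with the base checkpoint and target modules as placeholder choices:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")  # placeholder
config = LoraConfig(
    r=256,               # high ranks are where rank stabilization pays off
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # placeholder choice
    use_rslora=True,     # scale adapters by alpha/sqrt(r) instead of alpha/r
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```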
Article: "Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging" by akjindal53244 • Aug 19 • 73 upvotes
Article: "Perspectives for first principles prompt engineering" by KnutJaegersberg • Aug 18 • 16 upvotes
Paper: "BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts" • arXiv:2408.08274 • Published Aug 15 • 12 upvotes
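Parameter upcycling seeds a new MoE layer from trained dense weights instead of random initialization, so training starts from a strong point. A generic sketch of the FFN-upcycling step in plain PyTorch; this shows the broad sparse-upcycling idea, not BAM's specific recipe:

```python
import copy
import torch
import torch.nn as nn

class UpcycledMoE(nn.Module):
    """MoE layer whose experts all start as copies of a trained dense FFN."""

    def __init__(self, dense_ffn: nn.Module, hidden_size: int, num_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(hidden_size, num_experts, bias=False)  # trained from scratch
        self.experts = nn.ModuleList(
            [copy.deepcopy(dense_ffn) for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, hidden)
        weights = self.router(x).softmax(dim=-1)                    # (tokens, experts)
        outs = torch.stack([expert(x) for expert in self.experts], dim=-1)
        return (outs * weights.unsqueeze(1)).sum(dim=-1)            # dense mixture for clarity; real MoEs route top-k

dense = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))
moe = UpcycledMoE(dense, hidden_size=512)
y = moe(torch.randn(10, 512))  # (10, 512)
```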
Article: "A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes" • Aug 17, 2022 • 63 upvotes
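The LLM.int8() scheme the article covers stores weights in int8 while routing outlier feature dimensions through fp16, roughly halving memory versus fp16 inference. Loading a model this way takes a few lines with transformers and bitsandbytes; the checkpoint name is a placeholder:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit loading: int8 weights with fp16 outlier handling under the hood.
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-3b",  # placeholder checkpoint
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```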
Article: "Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA" • May 24, 2023 • 93 upvotes
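QLoRA freezes the base model in 4-bit NF4 and trains small LoRA adapters on top in higher precision. A minimal sketch with transformers, bitsandbytes, and peft; the checkpoint and target modules are placeholder choices:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NormalFloat4 data type from the QLoRA paper
    bnb_4bit_use_double_quant=True,      # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",  # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
# Frozen 4-bit base + trainable LoRA adapters = QLoRA.
model = get_peft_model(
    model, LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
)
```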
Paper: "Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning" • arXiv:2407.18248 • Published Jul 25 • 31 upvotes
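The loop described here samples chain-of-thought rationales from the model itself, labels them by answer correctness, and runs DPO on the resulting preference pairs. The generation-and-filtering loop is the paper's contribution; the DPO step alone might look like this hypothetical trl sketch (the toy pair and checkpoint are invented, and trl's constructor arguments vary across versions):

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "meta-llama/Llama-3.2-1B"  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Toy preference pair: a self-generated rationale reaching the right answer
# becomes "chosen", one reaching a wrong answer becomes "rejected".
pairs = Dataset.from_dict({
    "prompt":   ["Q: What is 17 + 25? Think step by step.\nA:"],
    "chosen":   [" 17 + 25 = 17 + 20 + 5 = 42. The answer is 42."],
    "rejected": [" 17 + 25 = 32. The answer is 32."],
})

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="dpo-self-train", beta=0.1),
    train_dataset=pairs,
    processing_class=tokenizer,  # named `tokenizer=` in older trl versions
)
trainer.train()
```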
Article: "LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?" • Jul 25 • 18 upvotes
Paper: "EVLM: An Efficient Vision-Language Model for Visual Understanding" • arXiv:2407.14177 • Published Jul 19 • 42 upvotes
Paper: "E5-V: Universal Embeddings with Multimodal Large Language Models" • arXiv:2407.12580 • Published Jul 17 • 39 upvotes
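E5-V's core trick is prompting the multimodal LLM to compress any input into one word, then reading off the last token's final hidden state as a universal embedding. A text-only sketch of that pooling idea; the paper builds on LLaVA-NeXT with a similarly worded prompt, and the checkpoint here is a stand-in:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.2-1B"  # stand-in; E5-V builds on LLaVA-NeXT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

@torch.no_grad()
def embed(text: str) -> torch.Tensor:
    # Ask the model to compress the input into one word, then read the
    # last token's final hidden state as the embedding.
    prompt = f"{text}\nSummary of the above sentence in one word:"  # paraphrase of E5-V's prompt
    inputs = tokenizer(prompt, return_tensors="pt")
    hidden = model(**inputs, output_hidden_states=True).hidden_states[-1]
    return hidden[0, -1]  # (hidden_size,)
```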