LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model Paper • 2404.01331 • Published Mar 29 • 25
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 603
Rethinking Optimization and Architecture for Tiny Language Models Paper • 2402.02791 • Published Feb 5 • 12
SPT: Fine-Tuning Transformer-based Language Models Efficiently with Sparsification Paper • 2312.10365 • Published Dec 16, 2023 • 1
Designing a Better Asymmetric VQGAN for StableDiffusion Paper • 2306.04632 • Published Jun 7, 2023 • 3
FLM-101B: An Open LLM and How to Train It with $100K Budget Paper • 2309.03852 • Published Sep 7, 2023 • 44
Extending Context Window of Large Language Models via Positional Interpolation Paper • 2306.15595 • Published Jun 27, 2023 • 53