Data curation via joint example selection further accelerates multimodal learning Paper • 2406.17711 • Published Jun 25
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues Paper • 2404.03820 • Published Apr 4
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2