Deconstructing Denoising Diffusion Models for Self-Supervised Learning Paper • 2401.14404 • Published Jan 25, 2024 • 17
Scalable Pre-training of Large Autoregressive Image Models Paper • 2401.08541 • Published Jan 16, 2024 • 36
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5, 2024 • 41
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action Paper • 2312.17172 • Published Dec 28, 2023 • 26
VideoPoet: A Large Language Model for Zero-Shot Video Generation Paper • 2312.14125 • Published Dec 21, 2023 • 44
Generative Multimodal Models are In-Context Learners Paper • 2312.13286 • Published Dec 20, 2023 • 34
TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models Paper • 2311.04589 • Published Nov 8, 2023 • 18
PaLI-3 Vision Language Models: Smaller, Faster, Stronger Paper • 2310.09199 • Published Oct 13, 2023 • 24
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 96
Aligning Large Multimodal Models with Factually Augmented RLHF Paper • 2309.14525 • Published Sep 25, 2023 • 30
Multimodal Foundation Models: From Specialists to General-Purpose Assistants Paper • 2309.10020 • Published Sep 18, 2023 • 40