Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Paper • 2405.12970 • Published May 21 • 22
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data Paper • 2404.12195 • Published Apr 18 • 11
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models Paper • 2404.12387 • Published Apr 18 • 38
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 108
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation Paper • 2402.10210 • Published Feb 15 • 29
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization Paper • 2401.18079 • Published Jan 31 • 7
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning Paper • 2402.06332 • Published Feb 9 • 18
Implicit Diffusion: Efficient Optimization through Stochastic Sampling Paper • 2402.05468 • Published Feb 8 • 5
Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion Paper • 2401.17583 • Published Jan 31 • 25
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs Paper • 2402.04291 • Published Feb 6 • 48
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Paper • 2401.14112 • Published Jan 25 • 17
WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope Paper • 2401.01699 • Published Jan 3 • 6
General Object Foundation Model for Images and Videos at Scale Paper • 2312.09158 • Published Dec 14, 2023 • 8
Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent Paper • 2312.08926 • Published Dec 14, 2023 • 7
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation Paper • 2312.08754 • Published Dec 14, 2023 • 6
SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds Paper • 2312.09246 • Published Dec 14, 2023 • 5
FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection Paper • 2312.09252 • Published Dec 14, 2023 • 9
LIME: Localized Image Editing via Attention Regularization in Diffusion Models Paper • 2312.09256 • Published Dec 14, 2023 • 8
Holodeck: Language Guided Generation of 3D Embodied AI Environments Paper • 2312.09067 • Published Dec 14, 2023 • 13
TinyGSM: achieving >80% on GSM8k with small language models Paper • 2312.09241 • Published Dec 14, 2023 • 36
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions Paper • 2312.08578 • Published Dec 14, 2023 • 16
SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance Paper • 2312.08889 • Published Dec 13, 2023 • 11
Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention Paper • 2312.08618 • Published Dec 14, 2023 • 11
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks Paper • 2312.08583 • Published Dec 14, 2023 • 9
PathFinder: Guided Search over Multi-Step Reasoning Paths Paper • 2312.05180 • Published Dec 8, 2023 • 9
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes Paper • 2312.06353 • Published Dec 11, 2023 • 5
Efficient Quantization Strategies for Latent Diffusion Models Paper • 2312.05431 • Published Dec 9, 2023 • 11
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models Paper • 2312.06585 • Published Dec 11, 2023 • 28
CCM: Adding Conditional Controls to Text-to-Image Consistency Models Paper • 2312.06971 • Published Dec 12, 2023 • 10
Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models Paper • 2312.07046 • Published Dec 12, 2023 • 12
Clockwork Diffusion: Efficient Generation With Model-Step Distillation Paper • 2312.08128 • Published Dec 13, 2023 • 12
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention Paper • 2312.07987 • Published Dec 13, 2023 • 40
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects Paper • 2312.08344 • Published Dec 13, 2023 • 9
Scaling Laws of Synthetic Images for Model Training ... for Now Paper • 2312.04567 • Published Dec 7, 2023 • 7
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want Paper • 2312.03818 • Published Dec 6, 2023 • 31
OneLLM: One Framework to Align All Modalities with Language Paper • 2312.03700 • Published Dec 6, 2023 • 20
WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words Paper • 2312.02931 • Published Dec 5, 2023 • 6
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models Paper • 2312.02949 • Published Dec 5, 2023 • 11
Alchemist: Parametric Control of Material Properties with Diffusion Models Paper • 2312.02970 • Published Dec 5, 2023 • 7
Fine-grained Controllable Video Generation via Object Appearance and Context Paper • 2312.02919 • Published Dec 5, 2023 • 10
LivePhoto: Real Image Animation with Text-guided Motion Control Paper • 2312.02928 • Published Dec 5, 2023 • 16
Analyzing and Improving the Training Dynamics of Diffusion Models Paper • 2312.02696 • Published Dec 5, 2023 • 31
DiffiT: Diffusion Vision Transformers for Image Generation Paper • 2312.02139 • Published Dec 4, 2023 • 13
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning Paper • 2312.01552 • Published Dec 4, 2023 • 30
MotionDirector: Motion Customization of Text-to-Video Diffusion Models Paper • 2310.08465 • Published Oct 12, 2023 • 14
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors Paper • 2310.12190 • Published Oct 18, 2023 • 10
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models Paper • 2312.00845 • Published Dec 1, 2023 • 36
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores Paper • 2311.05908 • Published Nov 10, 2023 • 12