HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing Paper • 2404.09990 • Published Apr 15, 2024 • 12
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization Paper • 2404.09956 • Published Apr 15, 2024 • 11
TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models Paper • 2404.09204 • Published Apr 14, 2024 • 10
Taming Latent Diffusion Model for Neural Radiance Field Inpainting Paper • 2404.09995 • Published Apr 15, 2024 • 6
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation Paper • 2310.16656 • Published Oct 25, 2023 • 40
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs Paper • 2309.09582 • Published Sep 18, 2023 • 4
ComputeGPT: A computational chat model for numerical problems Paper • 2305.06223 • Published May 8, 2023 • 1
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction Paper • 2305.18752 • Published May 30, 2023 • 3
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit Paper • 2312.09911 • Published Dec 15, 2023 • 53