johnr0's Collections
multimodal
updated
DreamLLM: Synergistic Multimodal Comprehension and Creation
Paper
•
2309.11499
•
Published
•
58
FoleyGen: Visually-Guided Audio Generation
Paper
•
2309.10537
•
Published
•
8
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Paper
•
2310.11441
•
Published
•
26
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
Paper
•
2311.10093
•
Published
•
57
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
Paper
•
2311.10702
•
Published
•
18
AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort
Paper
•
2311.11243
•
Published
•
14
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression
Paper
•
2311.10794
•
Published
•
24
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
Paper
•
2311.12092
•
Published
•
21
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
Paper
•
2311.13600
•
Published
•
42
Orthogonal Adaptation for Modular Customization of Diffusion Models
Paper
•
2312.02432
•
Published
•
12
FaceStudio: Put Your Face Everywhere in Seconds
Paper
•
2312.02663
•
Published
•
30
Fine-grained Controllable Video Generation via Object Appearance and Context
Paper
•
2312.02919
•
Published
•
10
Generating Illustrated Instructions
Paper
•
2312.04552
•
Published
•
7
PALP: Prompt Aligned Personalization of Text-to-Image Models
Paper
•
2401.06105
•
Published
•
47