SHIC: Shape-Image Correspondences with no Keypoint Supervision Paper • 2407.18907 • Published Jul 26 • 38
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain Paper • 2407.19584 • Published Jul 28 • 60
SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages Paper • 2407.19672 • Published Jul 29 • 53
JaColBERTv2.5: Optimising Multi-Vector Retrievers to Create State-of-the-Art Japanese Retrievers with Constrained Resources Paper • 2407.20750 • Published Jul 30 • 21
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation Paper • 2407.20445 • Published Jul 29 • 20
Knesset-DictaBERT: A Hebrew Language Model for Parliamentary Proceedings Paper • 2407.20581 • Published Jul 30 • 23
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains Paper • 2407.18961 • Published Jul 18 • 38
FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention Paper • 2407.19918 • Published Jul 29 • 47
Theia: Distilling Diverse Vision Foundation Models for Robot Learning Paper • 2407.20179 • Published Jul 29 • 45
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling Paper • 2402.10211 • Published Feb 15 • 10
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation Paper • 2402.10210 • Published Feb 15 • 29
Data Engineering for Scaling Language Models to 128K Context Paper • 2402.10171 • Published Feb 15 • 21
Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification Paper • 2407.19340 • Published Jul 27 • 55
DreamTuner: Single Image is Enough for Subject-Driven Generation Paper • 2312.13691 • Published Dec 21, 2023 • 26
DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation Paper • 2312.13578 • Published Dec 21, 2023 • 25
Time is Encoded in the Weights of Finetuned Language Models Paper • 2312.13401 • Published Dec 20, 2023 • 19
HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs Paper • 2312.14140 • Published Dec 21, 2023 • 6
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models Paper • 2312.13964 • Published Dec 21, 2023 • 18
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models Paper • 2312.14091 • Published Dec 21, 2023 • 15
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning Paper • 2312.13980 • Published Dec 21, 2023 • 13
Meta-training with Demonstration Retrieval for Efficient Few-shot Learning Paper • 2307.00119 • Published Jun 30, 2023 • 6
MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion Paper • 2307.01097 • Published Jul 3, 2023 • 9
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects Paper • 2312.08344 • Published Dec 13, 2023 • 8
Distributed Inference and Fine-tuning of Large Language Models Over The Internet Paper • 2312.08361 • Published Dec 13, 2023 • 25
LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance Paper • 2307.00522 • Published Jul 2, 2023 • 30
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams Paper • 2310.08678 • Published Oct 12, 2023 • 12
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models Paper • 2310.08659 • Published Oct 12, 2023 • 22
Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping Paper • 2310.12474 • Published Oct 19, 2023 • 5
Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing Paper • 2310.12404 • Published Oct 19, 2023 • 15
AgentTuning: Enabling Generalized Agent Abilities for LLMs Paper • 2310.12823 • Published Oct 19, 2023 • 35
Safe RLHF: Safe Reinforcement Learning from Human Feedback Paper • 2310.12773 • Published Oct 19, 2023 • 28
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models Paper • 2310.14566 • Published Oct 23, 2023 • 25
FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling Paper • 2310.15169 • Published Oct 23, 2023 • 9
DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design Paper • 2310.15144 • Published Oct 23, 2023 • 13
Wonder3D: Single Image to 3D using Cross-Domain Diffusion Paper • 2310.15008 • Published Oct 23, 2023 • 21
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior Paper • 2310.16818 • Published Oct 25, 2023 • 30
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving Paper • 2310.19102 • Published Oct 29, 2023 • 9
Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation Paper • 2310.18628 • Published Oct 28, 2023 • 7
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation Paper • 2310.19512 • Published Oct 30, 2023 • 15
FlashDecoding++: Faster Large Language Model Inference on GPUs Paper • 2311.01282 • Published Nov 2, 2023 • 35
E3 TTS: Easy End-to-End Diffusion-based Text to Speech Paper • 2311.00945 • Published Nov 2, 2023 • 14
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation Paper • 2311.01455 • Published Nov 2, 2023 • 28
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models Paper • 2311.04145 • Published Nov 7, 2023 • 32
Unveiling Safety Vulnerabilities of Large Language Models Paper • 2311.04124 • Published Nov 7, 2023 • 6
SoundCam: A Dataset for Finding Humans Using Room Acoustics Paper • 2311.03517 • Published Nov 6, 2023 • 10
ADaPT: As-Needed Decomposition and Planning with Language Models Paper • 2311.05772 • Published Nov 8, 2023 • 10
Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities Paper • 2311.05698 • Published Nov 9, 2023 • 9