Darrin Mccann

darreen

AI & ML interests

Autoencoder Architectures, Transformers, LLMs, Generative AI

darreen's activity

upvoted 60 papers about 2 months ago

SHIC: Shape-Image Correspondences with no Keypoint Supervision

Paper • 2407.18907 • Published Jul 26 • 38

SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain

Paper • 2407.19584 • Published Jul 28 • 60

SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages

Paper • 2407.19672 • Published Jul 29 • 53

JaColBERTv2.5: Optimising Multi-Vector Retrievers to Create State-of-the-Art Japanese Retrievers with Constrained Resources

Paper • 2407.20750 • Published Jul 30 • 21

Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation

Paper • 2407.20445 • Published Jul 29 • 20

FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention

Paper • 2407.19918 • Published Jul 29 • 47

Theia: Distilling Diverse Vision Foundation Models for Robot Learning

Paper • 2407.20179 • Published Jul 29 • 45

Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling

Paper • 2402.10211 • Published Feb 15 • 10

Rolling Diffusion Models

Paper • 2402.09470 • Published Feb 12 • 9

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

Paper • 2402.10210 • Published Feb 15 • 29

Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15 • 21

Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification

Paper • 2407.19340 • Published Jul 27 • 55

How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15 • 38

DreamTuner: Single Image is Enough for Subject-Driven Generation

Paper • 2312.13691 • Published Dec 21, 2023 • 26

DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 25

Time is Encoded in the Weights of Finetuned Language Models

Paper • 2312.13401 • Published Dec 20, 2023 • 19

HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs

Paper • 2312.14140 • Published Dec 21, 2023 • 6

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Paper • 2312.13964 • Published Dec 21, 2023 • 18

HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models

Paper • 2312.14091 • Published Dec 21, 2023 • 15

Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning

Paper • 2312.13980 • Published Dec 21, 2023 • 13

AppAgent: Multimodal Agents as Smartphone Users

Paper • 2312.13771 • Published Dec 21, 2023 • 51

Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15 • 51

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 94

Meta-training with Demonstration Retrieval for Efficient Few-shot Learning

Paper • 2307.00119 • Published Jun 30, 2023 • 6

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

Paper • 2307.01097 • Published Jul 3, 2023 • 9

FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

Paper • 2312.08344 • Published Dec 13, 2023 • 8

Invariant Graph Transformer

Paper • 2312.07859 • Published Dec 13, 2023 • 6

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Paper • 2312.08361 • Published Dec 13, 2023 • 25

LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance

Paper • 2307.00522 • Published Jul 2, 2023 • 30

Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams

Paper • 2310.08678 • Published Oct 12, 2023 • 12

LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

Paper • 2310.08659 • Published Oct 12, 2023 • 22

Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping

Paper • 2310.12474 • Published Oct 19, 2023 • 5

Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing

Paper • 2310.12404 • Published Oct 19, 2023 • 15

AgentTuning: Enabling Generalized Agent Abilities for LLMs

Paper • 2310.12823 • Published Oct 19, 2023 • 35

Safe RLHF: Safe Reinforcement Learning from Human Feedback

Paper • 2310.12773 • Published Oct 19, 2023 • 28

HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Paper • 2310.14566 • Published Oct 23, 2023 • 25

FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling

Paper • 2310.15169 • Published Oct 23, 2023 • 9

DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design

Paper • 2310.15144 • Published Oct 23, 2023 • 13

TiC-CLIP: Continual Training of CLIP Models

Paper • 2310.16226 • Published Oct 24, 2023 • 8

ConvNets Match Vision Transformers at Scale

Paper • 2310.16764 • Published Oct 25, 2023 • 20

Wonder3D: Single Image to 3D using Cross-Domain Diffusion

Paper • 2310.15008 • Published Oct 23, 2023 • 21

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

Paper • 2310.16818 • Published Oct 25, 2023 • 30

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Paper • 2310.19102 • Published Oct 29, 2023 • 9

Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation

Paper • 2310.18628 • Published Oct 28, 2023 • 7

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

Paper • 2310.19512 • Published Oct 30, 2023 • 15

FlashDecoding++: Faster Large Language Model Inference on GPUs

Paper • 2311.01282 • Published Nov 2, 2023 • 35

E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14

RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation

Paper • 2311.01455 • Published Nov 2, 2023 • 28

Idempotent Generative Network

Paper • 2311.01462 • Published Nov 2, 2023 • 24

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

Paper • 2311.04145 • Published Nov 7, 2023 • 32

Video Instance Matting

Paper • 2311.04212 • Published Nov 7, 2023 • 7

Unveiling Safety Vulnerabilities of Large Language Models

Paper • 2311.04124 • Published Nov 7, 2023 • 6

SoundCam: A Dataset for Finding Humans Using Room Acoustics

Paper • 2311.03517 • Published Nov 6, 2023 • 10

Prompt Engineering a Prompt Engineer

Paper • 2311.05661 • Published Nov 9, 2023 • 20

ADaPT: As-Needed Decomposition and Planning with Language Models

Paper • 2311.05772 • Published Nov 8, 2023 • 10

Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities

Paper • 2311.05698 • Published Nov 9, 2023 • 9