Chenhui Zhang PRO

danielz01

https://www.danielz.ch/

AI & ML interests

MIT IDSS '28 | Illinois CS '23 | ML for Remote Sensing & Climate Change | Trustworthy ML

Articles

An Introduction to AI Secure LLM Safety Leaderboard

danielz01's activity

upvoted a collection 6 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 680

upvoted a paper 8 months ago

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Paper • 2403.04132 • Published Mar 7 • 38

upvoted a paper 9 months ago

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Paper • 2402.10379 • Published Feb 16 • 29

upvoted 2 collections 9 months ago

🔍 Daily Picks in Interpretability & Analysis of LMs

Outstanding research in interpretability and evaluation of language models, summarized • 80 items • Updated 3 days ago • 90

Model Merging

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12 • 217

upvoted a paper 9 months ago

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Paper • 2401.12168 • Published Jan 22 • 25

upvoted 6 papers 11 months ago

Interfacing Foundation Models' Embeddings

Paper • 2312.07532 • Published Dec 12, 2023 • 10

LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models

Paper • 2308.16137 • Published Aug 30, 2023 • 39

SparQ Attention: Bandwidth-Efficient LLM Inference

Paper • 2312.04985 • Published Dec 8, 2023 • 38

GIVT: Generative Infinite-Vocabulary Transformers

Paper • 2312.02116 • Published Dec 4, 2023 • 10

The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

Paper • 2305.14045 • Published May 23, 2023 • 5

Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 77

upvoted a collection 12 months ago

Switch-Transformers release

This release included various MoE (Mixture of Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Jul 31 • 15

upvoted 7 papers 12 months ago

I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization

Paper • 2311.10126 • Published Nov 16, 2023 • 7

FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores

Paper • 2311.05908 • Published Nov 10, 2023 • 12

Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities

Paper • 2311.05698 • Published Nov 9, 2023 • 9

PolyMaX: General Dense Prediction with Mask Transformer

Paper • 2311.05770 • Published Nov 9, 2023 • 6

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 81

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Paper • 2311.06243 • Published Nov 10, 2023 • 17

Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs

Paper • 2311.05657 • Published Nov 9, 2023 • 27