DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 6 days ago • 50
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations Paper • 2408.08459 • Published Aug 15 • 44
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 81
Article BM25 for Python: Achieving high performance while simplifying dependencies with *BM25S*⚡ By xhluca • Jul 9 • 34
Llama 3.1 GPTQ, AWQ, and BNB Quants Collection Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & VLLM 🤗 • 9 items • Updated Jul 24 • 45
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19 • 38
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming Paper • 2404.08676 • Published Apr 6 • 3
Measuring Implicit Bias in Explicitly Unbiased Large Language Models Paper • 2402.04105 • Published Feb 6 • 1
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning Paper • 2401.16013 • Published Jan 29 • 20
Safety / Alignment / Policies / SMI Collection 🔖Cheatsheet: http://tinyurl.com/35vvs6d9 🔖Foundation Model Cheatsheet: https://fmcheatsheet.org/ • 13 items • Updated Jun 4 • 1
Design2Code: How Far Are We From Automating Front-End Engineering? Paper • 2403.03163 • Published Mar 5 • 93
LLM Related Collection 💫 Glossary https://osanseviero.github.io/hackerllama/blog/posts/hitchhiker_guide/ • 27 items • Updated Jun 3 • 2
Chainpoll: A high efficacy method for LLM hallucination detection Paper • 2310.18344 • Published Oct 22, 2023 • 1
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling Paper • 2401.16380 • Published Jan 29 • 46
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325
How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts Paper • 2402.13220 • Published Feb 20 • 12
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains Paper • 2402.10373 • Published Feb 15 • 9
⛔️🔦 Provenance, Watermarking & Deepfake Detection Collection Technical tools for more control over non-consensual synthetic content • 14 items • Updated Apr 1 • 38
LLM Hallucination Detection Papers Collection Collection of LLM hallucination and evaluation papers that I've been exploring and implementing. Some of them have my comments and annotated doodles. • 12 items • Updated Feb 20 • 12
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning Paper • 2307.08691 • Published Jul 17, 2023 • 7
TravelPlanner: A Benchmark for Real-World Planning with Language Agents Paper • 2402.01622 • Published Feb 2 • 33
Evaluating Large Language Models: A Comprehensive Survey Paper • 2310.19736 • Published Oct 30, 2023 • 2
Responsible AI resources Collection These are the resources I use and mention in my talks & workshops, for more check hf.co/ethics • 15 items • Updated Jun 18 • 3
Sparks of Artificial General Intelligence: Early experiments with GPT-4 Paper • 2303.12712 • Published Mar 22, 2023 • 2
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Paper • 2201.11903 • Published Jan 28, 2022 • 9
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding Paper • 2401.12954 • Published Jan 23 • 28
OWL-series 🦉 Collection Models and applications of OWL-ViT and OWLv2. • 13 items • Updated Mar 11 • 5
Foundation Models for Generalist Geospatial Artificial Intelligence Paper • 2310.18660 • Published Oct 28, 2023 • 6
Secrets of RLHF in Large Language Models Part II: Reward Modeling Paper • 2401.06080 • Published Jan 11 • 24
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training Paper • 2401.05566 • Published Jan 10 • 25
PALP: Prompt Aligned Personalization of Text-to-Image Models Paper • 2401.06105 • Published Jan 11 • 46
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation Paper • 2401.04092 • Published Jan 8 • 20
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation Paper • 2208.12242 • Published Aug 25, 2022 • 10
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 256
Biomedical Vision-Language Models (VLMs) Collection Some of my favorite biomedical vision-language models • 15 items • Updated May 7 • 7
Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery Paper • 2304.13714 • Published Apr 26, 2023 • 1
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 111
Scalable Extraction of Training Data from (Production) Language Models Paper • 2311.17035 • Published Nov 28, 2023 • 4
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 138