sbarman25 (Snehasish Barman)

upvoted a collection 5 days ago

DataGemma Release

A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 6 days ago • 50

upvoted a paper 13 days ago

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

Paper • 2408.08459 • Published Aug 15 • 44

upvoted an article 27 days ago

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Article • Jul 5 • 81

upvoted an article 28 days ago

BM25 for Python: Achieving high performance while simplifying dependencies with BM25S⚡

Article • Jul 9 • 34

upvoted a collection about 1 month ago

Gemma 2 2B Release

Collection

The 2.6B parameter version of Gemma 2. • 6 items • Updated Jul 31 • 76

upvoted a collection about 2 months ago

Llama 3.1 GPTQ, AWQ, and BNB Quants

Collection

Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & VLLM 🤗 • 9 items • Updated Jul 24 • 45

upvoted 2 articles 3 months ago

Mergoo: Efficiently Build Your Own MoE LLM

Article • Jun 3 • 40

Let's talk about LLM evaluation

Article • May 23 • 102

upvoted 6 papers 4 months ago

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

Paper • 2404.13208 • Published Apr 19 • 38

Advancing Multimodal Medical Capabilities of Gemini

Paper • 2405.03162 • Published May 6 • 1

ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming

Paper • 2404.08676 • Published Apr 6 • 3

upvoted an article 4 months ago

Unlocking Longer Generation with Key-Value Cache Quantization

Article • May 16 • 28

upvoted a paper 6 months ago

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

Paper • 2401.16013 • Published Jan 29 • 20

upvoted a collection 6 months ago

Safety / Alignment / Policies / SMI

Collection

🔖Cheatsheet: http://tinyurl.com/35vvs6d9 🔖Foundation Model Cheatsheet: https://fmcheatsheet.org/ • 13 items • Updated Jun 4 • 1

upvoted a paper 6 months ago

Design2Code: How Far Are We From Automating Front-End Engineering?

Paper • 2403.03163 • Published Mar 5 • 93

upvoted 5 collections 7 months ago

Agentic

Collection

10 items • Updated Feb 5 • 1

Vulnerabilities

Collection

https://llm-attacks.org/ • 11 items • Updated Jun 4 • 1

LLM Related

Collection

💫 Glossary https://osanseviero.github.io/hackerllama/blog/posts/hitchhiker_guide/ • 27 items • Updated Jun 3 • 2

Training & Architectures

Collection

35 items • Updated 13 days ago • 1

Evals & Monitoring

Collection

27 items • Updated Jul 25 • 1

upvoted 4 papers 7 months ago

Chainpoll: A high efficacy method for LLM hallucination detection

Paper • 2310.18344 • Published Oct 22, 2023 • 1

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Paper • 2401.16380 • Published Jan 29 • 46

Benchmarking Retrieval-Augmented Generation for Medicine

Paper • 2402.13178 • Published Feb 20 • 5

Do We Still Need Clinical Language Models?

Paper • 2302.08091 • Published Feb 16, 2023 • 3

upvoted a collection 7 months ago

Gemma release

Collection

Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325

upvoted 2 papers 7 months ago

How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts

Paper • 2402.13220 • Published Feb 20 • 12

BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains

Paper • 2402.10373 • Published Feb 15 • 9

upvoted 2 collections 7 months ago

⛔️🔦 Provenance, Watermarking & Deepfake Detection

Collection

Technical tools for more control over non-consensual synthetic content • 14 items • Updated Apr 1 • 38

LLM Hallucination Detection Papers

Collection

Collection of LLM hallucination and evaluation papers that I've been exploring and implementing. Some of them have my comments and annotated doodles. • 12 items • Updated Feb 20 • 12

upvoted a paper 7 months ago

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

Paper • 2307.08691 • Published Jul 17, 2023 • 7

upvoted 2 papers 8 months ago

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

Paper • 2402.01622 • Published Feb 2 • 33

Evaluating Large Language Models: A Comprehensive Survey

Paper • 2310.19736 • Published Oct 30, 2023 • 2

upvoted a collection 8 months ago

Responsible AI resources

Collection

These are the resources I use and mention in my talks & workshops, for more check hf.co/ethics • 15 items • Updated Jun 18 • 3

upvoted 4 papers 8 months ago

Sparks of Artificial General Intelligence: Early experiments with GPT-4

Paper • 2303.12712 • Published Mar 22, 2023 • 2

Weak-to-Strong Jailbreaking on Large Language Models

Paper • 2401.17256 • Published Jan 30 • 14

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Paper • 2201.11903 • Published Jan 28, 2022 • 9

Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

Paper • 2401.12954 • Published Jan 23 • 28

upvoted a collection 8 months ago

OWL-series 🦉

Collection

Models and applications of OWL-ViT and OWLv2. • 13 items • Updated Mar 11 • 5

upvoted 6 papers 8 months ago

Foundation Models for Generalist Geospatial Artificial Intelligence

Paper • 2310.18660 • Published Oct 28, 2023 • 6

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Paper • 2401.06080 • Published Jan 11 • 24

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Paper • 2401.05566 • Published Jan 10 • 25

PALP: Prompt Aligned Personalization of Text-to-Image Models

Paper • 2401.06105 • Published Jan 11 • 46

Language Model Inversion

Paper • 2311.13647 • Published Nov 22, 2023 • 2

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

Paper • 2401.04092 • Published Jan 8 • 20

upvoted 4 papers 9 months ago

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Paper • 2208.12242 • Published Aug 25, 2022 • 10

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41

Exploiting Novel GPT-4 APIs

Paper • 2312.14302 • Published Dec 21, 2023 • 12

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 256

upvoted a collection 9 months ago

Biomedical Vision-Language Models (VLMs)

Collection

Some of my favorite biomedical vision-language models • 15 items • Updated May 7 • 7

upvoted 2 papers 9 months ago

Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery

Paper • 2304.13714 • Published Apr 26, 2023 • 1

Training Transformers Together

Paper • 2207.03481 • Published Jul 7, 2022 • 4

upvoted a collection 9 months ago

Awesome SFT datasets

Collection

A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 111

upvoted 2 papers 9 months ago

Masked Autoencoders Are Scalable Vision Learners

Paper • 2111.06377 • Published Nov 11, 2021 • 2

Context Tuning for Retrieval Augmented Generation

Paper • 2312.05708 • Published Dec 9, 2023 • 16

upvoted 3 papers 10 months ago

Scaling Data-Constrained Language Models

Paper • 2305.16264 • Published May 25, 2023 • 17

Scalable Extraction of Training Data from (Production) Language Models

Paper • 2311.17035 • Published Nov 28, 2023 • 4

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138
