Henning Bartsch

HenningBlue · Curlykonda

AI & ML interests

AI Safety, NLP, vision-language models, safety evals

Organizations

None yet

HenningBlue's activity

upvoted 4 papers 5 months ago

On scalable oversight with weak LLMs judging strong LLMs

Paper • 2407.04622 • Published Jul 5 • 11

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Paper • 2406.08418 • Published Jun 12 • 28

Needle In A Multimodal Haystack

Paper • 2406.07230 • Published Jun 11 • 52

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning

Paper • 2406.04520 • Published Jun 6 • 11

upvoted 2 papers 6 months ago

Self-Improving Robust Preference Optimization

Paper • 2406.01660 • Published Jun 3 • 18

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Paper • 2406.01574 • Published Jun 3 • 43

upvoted an article 6 months ago

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

Article • May 24 • 21

upvoted 3 papers 6 months ago

Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19 • 150

Imp: Highly Capable Large Multimodal Models for Mobile Devices

Paper • 2405.12107 • Published May 20 • 25

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3 • 99