Bill Yuchen Lin

yuchenlin

https://yuchenlin.xyz

AI & ML interests

Research @allenai: LLMs, Multimodality, and Agents

Recent Activity

liked a Space 9 days ago

akhaliq/anychat

updated a Space 11 days ago

allenai/ZeroEval

liked a dataset 13 days ago

Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1

Articles

ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models

yuchenlin's activity

upvoted a paper about 1 month ago

On Memorization of Large Language Models in Logical Reasoning

Paper • 2410.23123 • Published Oct 30 • 17

upvoted a collection 3 months ago

MagpieLM

Aligning LMs with Fully Open Recipe (data+training configs+logs) • 9 items • Updated Sep 22 • 15

upvoted an article 5 months ago

ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models

Article • Jul 27 • 24

upvoted 2 collections 5 months ago

Magpie-Qwen2 Datasets

Dataset built with Qwen2 72B and Qwen2 7B. • 6 items • Updated Sep 14 • 10

Zebra Logic Bench

ZebraLogic Bench: Testing the Limits of LLMs in Logical Reasoning • 4 items • Updated 6 days ago • 4

upvoted 2 papers 5 months ago

The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism

Paper • 2407.10457 • Published Jul 15 • 22

WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Paper • 2406.18495 • Published Jun 26 • 12

upvoted 2 papers 6 months ago

WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

Paper • 2406.11069 • Published Jun 16 • 13

WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Paper • 2406.04770 • Published Jun 7 • 27

upvoted a collection 6 months ago

WildBench

4 items • Updated 6 days ago • 5

upvoted a paper 6 months ago

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12 • 65

upvoted 2 papers 9 months ago

Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents

Paper • 2403.02502 • Published Mar 4 • 3

Multi-LoRA Composition for Image Generation

Paper • 2402.16843 • Published Feb 26 • 28

upvoted a paper 10 months ago

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

Paper • 2402.08983 • Published Feb 14 • 2

upvoted a paper 12 months ago

The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning

Paper • 2312.01552 • Published Dec 4, 2023 • 30

upvoted 5 papers about 1 year ago

Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs

Paper • 2311.05657 • Published Nov 9, 2023 • 27

Mind2Web: Towards a Generalist Agent for the Web

Paper • 2306.06070 • Published Jun 9, 2023 • 19

In-context Autoencoder for Context Compression in a Large Language Model

Paper • 2307.06945 • Published Jul 13, 2023 • 27

How FaR Are Large Language Models From Agents with Theory-of-Mind?

Paper • 2310.03051 • Published Oct 4, 2023 • 34

PIPPA: A Partially Synthetic Conversational Dataset

Paper • 2308.05884 • Published Aug 11, 2023 • 30