One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration Paper • 2305.18258 • Published May 29, 2023
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment Paper • 2405.19332 • Published May 29, 2024
SELM-Zephyr Collection • See our paper at https://huggingface.co/papers/2405.19332 • 5 items • Updated May 30
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency Paper • 2309.17382 • Published Sep 29, 2023