- INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models
  Paper • 2306.04757 • Published • 6
- Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation
  Paper • 2308.01240 • Published • 2
- Can Large Language Models Understand Real-World Complex Instructions?
  Paper • 2309.09150 • Published • 2
- Evaluating the Instruction-Following Robustness of Large Language Models to Prompt Injection
  Paper • 2308.10819 • Published
Collections
Collections including paper arxiv:2409.12917
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 134
- Recursive Introspection: Teaching Language Model Agents How to Self-Improve
  Paper • 2407.18219 • Published • 3
- Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
  Paper • 2408.16293 • Published • 25
- Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models
  Paper • 2409.04787 • Published

- Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
  Paper • 2408.06195 • Published • 61
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 134
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
  Paper • 2408.03314 • Published • 33
- Self-Reflection in LLM Agents: Effects on Problem-Solving Performance
  Paper • 2405.06682 • Published • 3

- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 134
- FactAlign: Long-form Factuality Alignment of Large Language Models
  Paper • 2410.01691 • Published • 8
- LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
  Paper • 2410.02707 • Published • 48
- ECon: On the Detection and Resolution of Evidence Conflicts
  Paper • 2410.04068 • Published