Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19
Recursive Introspection: Teaching Language Model Agents How to Self-Improve Paper • 2407.18219 • Published Jul 25
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems Paper • 2408.16293 • Published Aug 29
Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models Paper • 2409.04787 • Published Sep 7
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives Paper • 2401.02009 • Published Jan 4