Efficient Exact Optimization Collection: SFT & Reward Models used in the experiments of the ICML 2024 paper "Towards Efficient Exact Optimization of Language Model Alignment" • 2 items • Updated Jun 24