Direct Preference Optimization: Your Language Model is Secretly a Reward Model • Paper • arXiv:2305.18290 • Published May 29, 2023 (a minimal sketch of the DPO objective follows this list)
Zephyr 7B • Collection • Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook
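For orientation, the DPO paper above trains directly on preference pairs with no separate reward model: the loss is -log sigma(beta * [(log pi_theta(y_w|x) - log pi_ref(y_w|x)) - (log pi_theta(y_l|x) - log pi_ref(y_l|x))]), where y_w and y_l are the chosen and rejected completions and pi_ref is a frozen reference model. Below is a minimal PyTorch sketch of that objective, assuming per-sequence log probabilities have already been computed; the function name `dpo_loss` and the default `beta=0.1` are illustrative, not taken from the paper's or the alignment-handbook's code.

```python
import torch
import torch.nn.functional as F

def dpo_loss(
    policy_chosen_logps: torch.Tensor,
    policy_rejected_logps: torch.Tensor,
    ref_chosen_logps: torch.Tensor,
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.1,
) -> torch.Tensor:
    """DPO loss over a batch of preference pairs.

    Each tensor holds per-sequence log probabilities (summed over
    tokens) of the chosen/rejected completions under the policy
    being trained and under a frozen reference model.
    """
    # Implicit rewards: scaled log-ratios of policy to reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # -log sigmoid of the reward margin; minimized when the policy
    # prefers the chosen completion more strongly than the reference does.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

Because the reward is expressed analytically through the policy's own log probabilities, this objective replaces the reward-modeling plus RL (e.g. PPO) stages of RLHF with a single supervised-style loss, which is the pipeline the Zephyr 7B models in the collection above were aligned with.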