Poisoned Reward Model

This reward model was used to align this generation model for the Trojan Detection Competition co-located with SaTML 2024. For more information, visit the official competition website.