# Poisoned Reward Model
This reward model was used to _align_ this [generation model](https://huggingface.co/ethz-spylab/poisoned_generation_trojan2) for the trojan detection competition co-located with SaTML 2024. For more information, visit the [official competition website](https://github.com/ethz-spylab/rlhf_trojan_competition).