RLHF Trojan Competition
Collection
Datasets and models used for the trojan detection competition co-located with SaTML 2024: https://github.com/ethz-spylab/rlhf_trojan_competition
20 items