license: apache-2.0
Model trained to accept and resist persuasion as appropriate, introduced by Stengel-Eskin et al. (2024): arxiv.org/abs/2410.14596