Fine-tuned the pretrained "koelectra-v3" model on the KoLD dataset.
dataset : https://github.com/boychaboy/KOLD
pretrained_model : https://huggingface.co/monologg/koelectra-base-v3-discriminator
Because the model is based on that checkpoint, use the tokenizer from "koelectra-base-v3-discriminator" with it.
The label map is:
{0: "not_hate_speech", 1: "hate_speech"}
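
As a rough illustration of how the tokenizer and label map fit together, here is a minimal sketch. It assumes the `transformers` and `torch` packages are installed and that the fine-tuned weights can be loaded with `AutoModelForSequenceClassification`. The `model_id` argument is a hypothetical placeholder for wherever this model's weights live, and the helper names `logits_to_label` and `classify` are made up for the example.

```python
# Label map and tokenizer checkpoint taken from the notes above.
ID2LABEL = {0: "not_hate_speech", 1: "hate_speech"}
TOKENIZER_ID = "monologg/koelectra-base-v3-discriminator"


def logits_to_label(logits):
    """Return the label name of the highest-scoring class."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[best]


def classify(text, model_id):
    """Classify one sentence.

    model_id is a hypothetical placeholder: pass the repo id or local
    path of the fine-tuned model (it is not stated in this README).
    """
    # Imported lazily so the label helper above works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # The tokenizer comes from the base KoELECTRA checkpoint.
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return logits_to_label(logits)
```

Calling `classify("some Korean sentence", model_id)` with a real model id should return one of the two label names; `logits_to_label` can be checked on its own without downloading anything.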