XLM-RoBERTa for multilingual spam detection

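I trained this model to detect spam in German. There is no labeled German spam mail dataset, and I could not find a pretrained multilingual model fine-tuned on the Enron spam dataset, so I fine-tuned XLM-RoBERTa on the Enron data and rely on its multilingual pretraining to carry over to German.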

Intended use

Identifying spam mail in any language supported by XLM-RoBERTa. Note that the model has not been thoroughly tested for this intended use; it was only validated on the English-language Enron mail dataset.
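A minimal sketch of how the model could be queried with the `transformers` text-classification pipeline. The model ID is taken from this page; the helper names (`load_classifier`, `flag_spam`, `demo`), the default `spam_label` of `"LABEL_1"`, and the 0.5 threshold are assumptions for illustration. The actual label names are not documented here, so check the raw pipeline output or `model.config.id2label` before relying on them.

```python
from typing import Callable, Dict, List

MODEL_ID = "kauffinger/xlm-roberta-base-finetuned-enron"


def load_classifier():
    """Build a text-classification pipeline for the fine-tuned checkpoint."""
    # Imported lazily so flag_spam() can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def flag_spam(
    texts: List[str],
    classifier: Callable[[List[str]], List[Dict]],
    spam_label: str = "LABEL_1",  # assumption: verify against the model's id2label
    threshold: float = 0.5,       # assumption: arbitrary illustrative cutoff
) -> List[bool]:
    """Return True where the top prediction is the spam label with score >= threshold."""
    predictions = classifier(texts)
    return [
        p["label"] == spam_label and p["score"] >= threshold
        for p in predictions
    ]


def demo() -> None:
    """Classify two German sample mails (downloads the weights on first use)."""
    clf = load_classifier()
    mails = [
        "Herzlichen Glückwunsch! Sie haben 1.000.000 Euro gewonnen. Jetzt hier klicken!",
        "Hallo Anna, anbei das Protokoll vom Meeting am Dienstag.",
    ]
    print(clf(mails))             # inspect the raw labels first
    print(flag_spam(mails, clf))
```

Calling `demo()` prints the raw predictions next to the boolean flags, which makes it easy to confirm which label corresponds to spam. Because German performance was not evaluated, results on such inputs should be spot-checked by hand.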

Evaluation

Evaluation on the test set of the Enron spam dataset:

loss: 0.0315
accuracy: 0.996
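For readers unfamiliar with these two numbers, the sketch below shows one common way such metrics are computed from class logits. It assumes the reported loss is the mean cross-entropy over the test set, which this card does not state explicitly; the function name `evaluate` and the sample logits are hypothetical.

```python
import math
from typing import Sequence, Tuple


def evaluate(
    logits: Sequence[Sequence[float]], labels: Sequence[int]
) -> Tuple[float, float]:
    """Return (mean cross-entropy loss, accuracy) for per-example class logits."""
    total_loss = 0.0
    correct = 0
    for row, label in zip(logits, labels):
        # Numerically stable log-sum-exp: subtract the max before exponentiating.
        m = max(row)
        log_norm = m + math.log(sum(math.exp(x - m) for x in row))
        # Cross-entropy for one example is -log softmax(row)[label].
        total_loss += log_norm - row[label]
        # The predicted class is the index of the largest logit.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    n = len(labels)
    return total_loss / n, correct / n


# Hypothetical logits for three mails (class 0 = ham, class 1 = spam).
loss, accuracy = evaluate([[2.0, 0.0], [0.0, 3.0], [1.0, 0.0]], [0, 1, 1])
print(f"loss: {loss:.4f}  accuracy: {accuracy:.3f}")
```

A low loss alongside high accuracy indicates that the model is not only correct on most test mails but also confident in those predictions.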


Model size: 278M params (Safetensors; tensor types I64 and F32)


Model ID: kauffinger/xlm-roberta-base-finetuned-enron

