
distilroberta-mbfc-bias

This model is a fine-tuned version of distilroberta-base on the Proppy dataset, using the political bias ratings from mediabiasfactcheck.com as labels.

It achieves the following results on the evaluation set:

  • Loss: 1.4130
  • Acc: 0.6348
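
A quick way to try the model is through the Transformers text-classification pipeline. The snippet below is a minimal sketch, assuming the checkpoint is available on the Hub as valurank/distilroberta-mbfc-bias and that the predicted labels correspond to the bias classes listed further down; the example sentence is illustrative only.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint (model id assumed from this card).
classifier = pipeline("text-classification", model="valurank/distilroberta-mbfc-bias")

text = "The senator's reckless plan will bankrupt hardworking families."
print(classifier(text))
# -> [{'label': '<one of the bias classes>', 'score': ...}]
```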

Training and evaluation data

The training data is the Proppy corpus. Articles are labeled for political bias according to the bias of the source publication, as rated by mediabiasfactcheck.com. See "Proppy: Organizing the News Based on Their Propagandistic Content" for details.

To create a more balanced training set, the most common labels are downsampled to at most 2000 articles each (a rough sketch of this step follows the distribution below). The resulting label distribution in the training data is as follows:

extremeright     689
leastbiased     2000
left             783
leftcenter      2000
right           1260
rightcenter     1418
unknown         2000
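
As a rough illustration of the downsampling step, the per-label cap could be applied as below. This is a sketch only: it assumes the corpus has been loaded into a pandas DataFrame with a label column, and the file name and layout are hypothetical.

```python
import pandas as pd

MAX_PER_LABEL = 2000
SEED = 12345

# Hypothetical path and column layout for the training split.
df = pd.read_csv("proppy_train.tsv", sep="\t")

# Keep at most MAX_PER_LABEL articles per label, sampling without replacement.
balanced = (
    df.groupby("label", group_keys=False)
      .apply(lambda g: g.sample(n=min(len(g), MAX_PER_LABEL), random_state=SEED))
      .reset_index(drop=True)
)

print(balanced["label"].value_counts())
```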

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 12345
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 16
  • num_epochs: 20
  • mixed_precision_training: Native AMP
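
In Transformers terms, these settings correspond roughly to the TrainingArguments sketched below. This is a reconstruction for reference rather than the original training script; the output directory name is arbitrary, and the tokenized Proppy splits still need to be prepared and passed to a Trainer.

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilroberta-base",
    num_labels=7,  # the seven bias classes listed above
)

args = TrainingArguments(
    output_dir="distilroberta-mbfc-bias",  # arbitrary output path
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=12345,
    lr_scheduler_type="linear",
    warmup_steps=16,
    num_train_epochs=20,
    fp16=True,                    # Native AMP mixed precision
    evaluation_strategy="epoch",  # evaluate once per epoch, as in the table below
)
# `args`, `model`, and the tokenized train/eval datasets would then be passed to a Trainer.
```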

Training results

| Training Loss | Epoch | Step | Validation Loss | Acc    |
|---------------|-------|------|-----------------|--------|
| 0.9493        | 1.0   | 514  | 1.2765          | 0.4730 |
| 0.7376        | 2.0   | 1028 | 1.0003          | 0.5812 |
| 0.6702        | 3.0   | 1542 | 1.1294          | 0.5631 |
| 0.6161        | 4.0   | 2056 | 1.0439          | 0.6058 |
| 0.4934        | 5.0   | 2570 | 1.1196          | 0.6028 |
| 0.4558        | 6.0   | 3084 | 1.0993          | 0.5977 |
| 0.4717        | 7.0   | 3598 | 1.0308          | 0.6373 |
| 0.3961        | 8.0   | 4112 | 1.1291          | 0.6234 |
| 0.3829        | 9.0   | 4626 | 1.1554          | 0.6316 |
| 0.3442        | 10.0  | 5140 | 1.1548          | 0.6465 |
| 0.2505        | 11.0  | 5654 | 1.3605          | 0.6169 |
| 0.2105        | 12.0  | 6168 | 1.3310          | 0.6297 |
| 0.262         | 13.0  | 6682 | 1.2706          | 0.6383 |
| 0.2031        | 14.0  | 7196 | 1.3658          | 0.6378 |
| 0.2021        | 15.0  | 7710 | 1.4130          | 0.6348 |
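
Acc here is, presumably, plain classification accuracy on the evaluation set. A metric function along these lines (an assumption, not the original script) would reproduce that column when passed to the Trainer as compute_metrics:

```python
import numpy as np

def compute_metrics(eval_pred):
    # The Trainer passes a (logits, labels) pair for the evaluation set.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"acc": (predictions == labels).mean()}
```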

Framework versions

  • Transformers 4.11.2
  • Pytorch 1.7.1
  • Datasets 1.11.0
  • Tokenizers 0.10.3
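
To sanity-check that a local environment matches the versions above, something like the following can be run (a convenience sketch, not part of the original card):

```python
import transformers, torch, datasets, tokenizers

# Expected versions from this card: 4.11.2, 1.7.1, 1.11.0, 0.10.3
for name, module in [
    ("Transformers", transformers),
    ("PyTorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {module.__version__}")
```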