
judge_answer___33_deberta_base_enwiki-answerability-2411

This model is a fine-tuned version of microsoft/deberta-v3-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2045
  • Accuracy: 0.9442
  • Precision: 0.9592
  • Recall: 0.9541
  • F1: 0.9566
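
As a quick consistency check, the reported F1 is the harmonic mean of the reported precision and recall:

```python
# Harmonic mean of the reported precision and recall.
precision, recall = 0.9592, 0.9541
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9566, matching the reported F1
```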

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 1
  • mixed_precision_training: Native AMP
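
A minimal sketch of how these settings map onto Hugging Face `TrainingArguments`. Only the hyperparameter values come from the list above; `output_dir` and the evaluation cadence are illustrative assumptions (the card logs eval metrics every 500 steps):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="judge_answer___33_deberta_base_enwiki-answerability-2411",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,  # effective train batch size: 2 * 8 = 16
    optim="adamw_torch",            # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=1,
    fp16=True,                      # "Native AMP" mixed precision
    eval_strategy="steps",          # assumed from the 500-step eval log below
    eval_steps=500,
)
```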

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Accuracy | Precision | Recall | F1     |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:---------:|:------:|:------:|
| 4.2448        | 0.0267 | 500   | 0.3176          | 0.8796   | 0.9528    | 0.8556 | 0.9016 |
| 2.4191        | 0.0533 | 1000  | 0.2767          | 0.9138   | 0.9271    | 0.9403 | 0.9336 |
| 1.9636        | 0.08   | 1500  | 0.2391          | 0.9142   | 0.9607    | 0.9039 | 0.9314 |
| 1.7834        | 0.1067 | 2000  | 0.2148          | 0.9254   | 0.9643    | 0.9183 | 0.9407 |
| 1.675         | 0.1333 | 2500  | 0.2138          | 0.9292   | 0.9560    | 0.9332 | 0.9444 |
| 1.7663        | 0.16   | 3000  | 0.2685          | 0.9177   | 0.9747    | 0.8956 | 0.9335 |
| 1.7593        | 0.1867 | 3500  | 0.2410          | 0.9254   | 0.9678    | 0.9147 | 0.9405 |
| 1.7727        | 0.2133 | 4000  | 0.1939          | 0.9354   | 0.9592    | 0.9397 | 0.9494 |
| 1.6498        | 0.24   | 4500  | 0.2019          | 0.9358   | 0.9564    | 0.9433 | 0.9498 |
| 1.5754        | 0.2667 | 5000  | 0.2334          | 0.9377   | 0.9577    | 0.9451 | 0.9514 |
| 1.6312        | 0.2933 | 5500  | 0.2120          | 0.935    | 0.9671    | 0.9308 | 0.9486 |
| 1.5515        | 0.32   | 6000  | 0.2342          | 0.9354   | 0.9637    | 0.9350 | 0.9491 |
| 1.6374        | 0.3467 | 6500  | 0.2143          | 0.9385   | 0.9555    | 0.9487 | 0.9521 |
| 1.6782        | 0.3733 | 7000  | 0.1865          | 0.9373   | 0.9599    | 0.9421 | 0.9509 |
| 1.614         | 0.4    | 7500  | 0.2039          | 0.9404   | 0.9562    | 0.9511 | 0.9536 |
| 1.5568        | 0.4267 | 8000  | 0.1862          | 0.9423   | 0.9641    | 0.9457 | 0.9548 |
| 1.5774        | 0.4533 | 8500  | 0.1818          | 0.94     | 0.9634    | 0.9427 | 0.9530 |
| 1.5722        | 0.48   | 9000  | 0.2388          | 0.9396   | 0.9628    | 0.9427 | 0.9527 |
| 1.5544        | 0.5067 | 9500  | 0.2009          | 0.9408   | 0.9635    | 0.9439 | 0.9536 |
| 1.5426        | 0.5333 | 10000 | 0.2398          | 0.9385   | 0.9662    | 0.9374 | 0.9515 |
| 1.5144        | 0.56   | 10500 | 0.2223          | 0.9381   | 0.9662    | 0.9368 | 0.9512 |
| 1.508         | 0.5867 | 11000 | 0.2135          | 0.9446   | 0.9517    | 0.9630 | 0.9573 |
| 1.5881        | 0.6133 | 11500 | 0.1886          | 0.9404   | 0.9546    | 0.9529 | 0.9537 |
| 1.4951        | 0.64   | 12000 | 0.2053          | 0.9442   | 0.9671    | 0.9457 | 0.9563 |
| 1.583         | 0.6667 | 12500 | 0.2088          | 0.9412   | 0.9663    | 0.9415 | 0.9538 |
| 1.5312        | 0.6933 | 13000 | 0.2041          | 0.9373   | 0.9719    | 0.9296 | 0.9503 |
| 1.5474        | 0.72   | 13500 | 0.1907          | 0.9412   | 0.9663    | 0.9415 | 0.9538 |
| 1.4928        | 0.7467 | 14000 | 0.1998          | 0.9438   | 0.9631    | 0.9493 | 0.9561 |
| 1.5224        | 0.7733 | 14500 | 0.1940          | 0.9423   | 0.9591    | 0.9511 | 0.9551 |
| 1.5267        | 0.8    | 15000 | 0.2095          | 0.9442   | 0.9665    | 0.9463 | 0.9563 |
| 1.6073        | 0.8267 | 15500 | 0.1905          | 0.945    | 0.9620    | 0.9523 | 0.9571 |
| 1.4924        | 0.8533 | 16000 | 0.2118          | 0.9462   | 0.9666    | 0.9493 | 0.9579 |
| 1.543         | 0.88   | 16500 | 0.2074          | 0.9442   | 0.9603    | 0.9529 | 0.9566 |
| 1.6774        | 0.9067 | 17000 | 0.2044          | 0.9446   | 0.9631    | 0.9505 | 0.9568 |
| 1.5077        | 0.9333 | 17500 | 0.2007          | 0.945    | 0.9626    | 0.9517 | 0.9571 |
| 1.4738        | 0.96   | 18000 | 0.2018          | 0.9442   | 0.9603    | 0.9529 | 0.9566 |
| 1.4543        | 0.9867 | 18500 | 0.2045          | 0.9442   | 0.9592    | 0.9541 | 0.9566 |
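
A minimal inference sketch, assuming the model is a binary sequence classifier that judges whether a question is answerable from a given passage. The (question, passage) pairing and the label semantics are assumptions inferred from the model name; the card itself does not document the input format:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "tom-010/judge_answer___33_deberta_base_enwiki-answerability-2411"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Assumed input format: (question, passage) encoded as a text pair.
question = "Who wrote the novel?"
passage = "The novel was written by Jane Austen in 1813."
inputs = tokenizer(question, passage, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
probs = logits.softmax(dim=-1).squeeze()

# Label meanings are an assumption; inspect model.config.id2label to confirm.
print({model.config.id2label[i]: round(p.item(), 4) for i, p in enumerate(probs)})
```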

Framework versions

  • Transformers 4.46.0
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.2
  • Tokenizers 0.20.1
Model size

  • 184M parameters (Safetensors, F32)
