
test_v7

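This model is a fine-tuned version of ./models/distill-bge-retromae-step on the adalbertojunior/segmentacao dataset. It achieves the following results on the evaluation set (the values match the step 1400 row, epoch 0.8919, of the training results, which has the highest F1 recorded during training):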

  • Loss: 0.0045
  • Precision: 0.6658
  • Recall: 0.6860
  • F1: 0.6757
  • Accuracy: 0.9991
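
As a quick consistency check on the metrics above, F1 can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only: the function name `f1_score` is hypothetical, and the inputs are the rounded values from this card, so the result agrees with the reported F1 only to within rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the evaluation results in this card.
reported_precision = 0.6658
reported_recall = 0.6860
reported_f1 = 0.6757

f1 = f1_score(reported_precision, reported_recall)

# The recomputed value should sit within rounding error of the reported one.
print(f"recomputed F1: {f1:.4f}, reported F1: {reported_f1:.4f}")
```

Note that accuracy (0.9991) is far higher than F1. This pattern is common when one label dominates the evaluation set, so F1 is likely the more informative number here.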

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
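
The relationship between these values can be sketched in a few lines. The total batch size of 32 is the per-device batch size (4) times the gradient accumulation steps (8), assuming a single training device. The linear scheduler decays the learning rate from 5e-05 towards zero over training. The card lists no warmup, so zero warmup steps are assumed, and `TOTAL_STEPS = 4710` is a hypothetical estimate (about 1570 optimizer steps per epoch over 3 epochs), not a value stated in the card.

```python
# Values copied from the hyperparameter list in this card.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 8

# Assumptions (not stated in the card): one device, no warmup,
# and an estimated total number of optimizer steps.
NUM_DEVICES = 1
WARMUP_STEPS = 0
TOTAL_STEPS = 4710


def effective_batch_size(per_device: int, accumulation: int, devices: int = 1) -> int:
    """Number of examples contributing to each optimizer step."""
    return per_device * accumulation * devices


def linear_lr(step: int, total_steps: int, base_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_train_batch_size = effective_batch_size(
    TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
print("total_train_batch_size:", total_train_batch_size)

# Learning rate at the start, midpoint, and end of training.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```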

Training results

Training Loss  Epoch   Step  Validation Loss  Precision  Recall  F1      Accuracy
No log         0.0637  100   0.0048           0.5339     0.5647  0.5489  0.9984
No log         0.1274  200   0.0048           0.5567     0.6226  0.5878  0.9987
No log         0.1911  300   0.0048           0.5745     0.5950  0.5846  0.9988
No log         0.2548  400   0.0048           0.5622     0.5978  0.5794  0.9988
0.0061         0.3185  500   0.0069           0.48       0.5950  0.5314  0.9983
0.0061         0.3822  600   0.0061           0.5692     0.6116  0.5896  0.9987
0.0061         0.4459  700   0.0052           0.5736     0.6226  0.5971  0.9988
0.0061         0.5096  800   0.0055           0.5921     0.6198  0.6057  0.9988
0.0061         0.5733  900   0.0057           0.6126     0.6446  0.6282  0.9989
0.0008         0.6370  1000  0.0065           0.5635     0.6116  0.5865  0.9987
0.0008         0.7007  1100  0.0060           0.5725     0.6529  0.6100  0.9987
0.0008         0.7645  1200  0.0061           0.5704     0.6474  0.6065  0.9988
0.0008         0.8282  1300  0.0053           0.5813     0.6501  0.6138  0.9988
0.0008         0.8919  1400  0.0045           0.6658     0.6860  0.6757  0.9991
0.0004         0.9556  1500  0.0049           0.6497     0.6694  0.6594  0.9990
0.0004         1.0193  1600  0.0054           0.5707     0.6446  0.6054  0.9988
0.0004         1.0830  1700  0.0047           0.6376     0.6639  0.6505  0.9990
0.0004         1.1467  1800  0.0048           0.5922     0.6722  0.6297  0.9989
0.0004         1.2104  1900  0.0041           0.6455     0.6722  0.6586  0.9990
0.0002         1.2741  2000  0.0053           0.5686     0.6391  0.6018  0.9987
0.0002         1.3378  2100  0.0046           0.6495     0.6942  0.6711  0.9990
0.0002         1.4015  2200  0.0049           0.5947     0.6749  0.6323  0.9988
0.0002         1.4652  2300  0.0045           0.6125     0.6749  0.6422  0.9989
0.0002         1.5289  2400  0.0045           0.5701     0.6722  0.6169  0.9988
0.0002         1.5926  2500  0.0058           0.5321     0.6391  0.5807  0.9986
0.0002         1.6563  2600  0.0056           0.5110     0.6419  0.5690  0.9985
0.0002         1.7200  2700  0.0052           0.5792     0.6446  0.6102  0.9988
0.0002         1.7837  2800  0.0047           0.5941     0.6612  0.6258  0.9989
0.0002         1.8474  2900  0.0051           0.5655     0.6419  0.6013  0.9988
0.0001         1.9111  3000  0.0044           0.5866     0.6529  0.6180  0.9989
0.0001         1.9748  3100  0.0042           0.5792     0.6446  0.6102  0.9988
0.0001         2.0385  3200  0.0045           0.6015     0.6694  0.6336  0.9989
0.0001         2.1022  3300  0.0063           0.5409     0.6556  0.5928  0.9987
0.0001         2.1659  3400  0.0047           0.5887     0.6584  0.6216  0.9989
0.0001         2.2297  3500  0.0045           0.6131     0.6722  0.6413  0.9989
0.0001         2.2934  3600  0.0047           0.6193     0.6722  0.6446  0.9989
0.0001         2.3571  3700  0.0047           0.6091     0.6612  0.6341  0.9989
0.0001         2.4208  3800  0.0047           0.6205     0.6667  0.6428  0.9989
0.0001         2.4845  3900  0.0044           0.6070     0.6722  0.6379  0.9989
0.0001         2.5482  4000  0.0052           0.5355     0.6226  0.5758  0.9987
0.0001         2.6119  4100  0.0047           0.5871     0.6501  0.6170  0.9989
0.0001         2.6756  4200  0.0049           0.5739     0.6419  0.6060  0.9988
0.0001         2.7393  4300  0.0049           0.5634     0.6364  0.5977  0.9988
0.0001         2.8030  4400  0.0052           0.5634     0.6364  0.5977  0.9988
0.0            2.8667  4500  0.0049           0.5739     0.6419  0.6060  0.9988
0.0            2.9304  4600  0.0044           0.5796     0.6419  0.6092  0.9988
0.0            2.9941  4700  0.0047           0.5796     0.6419  0.6092  0.9988
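
The results above can be used to pick a checkpoint and to estimate the size of the training run. The sketch below copies a handful of rows from the table (not all of them) and finds the best row by F1 and by validation loss; the two criteria select different steps. It also estimates steps per epoch from the last row. The derived example count assumes 32 examples per optimizer step on a single device, which the card does not confirm.

```python
# (step, epoch, validation_loss, f1): a subset of rows copied from the table,
# including the rows with the highest F1 and the lowest validation loss.
rows = [
    (100, 0.0637, 0.0048, 0.5489),
    (1400, 0.8919, 0.0045, 0.6757),
    (1900, 1.2104, 0.0041, 0.6586),
    (2100, 1.3378, 0.0046, 0.6711),
    (3100, 1.9748, 0.0042, 0.6102),
    (4700, 2.9941, 0.0047, 0.6092),
]

# Highest F1 and lowest validation loss point to different checkpoints.
best_by_f1 = max(rows, key=lambda r: r[3])
best_by_loss = min(rows, key=lambda r: r[2])
print("best F1 at step", best_by_f1[0])
print("lowest validation loss at step", best_by_loss[0])

# Estimate optimizer steps per epoch from the final logged row.
last_step, last_epoch = rows[-1][0], rows[-1][1]
steps_per_epoch = last_step / last_epoch

# Assumption: each optimizer step consumes 32 examples (total_train_batch_size).
approx_train_examples = round(steps_per_epoch) * 32
print("steps per epoch (approx.):", round(steps_per_epoch))
print("training examples (approx.):", approx_train_examples)
```

The headline metrics in this card correspond to the best-F1 row (step 1400), not to the final step or to the lowest validation loss.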

Framework versions

  • Transformers 4.45.2
  • PyTorch 2.4.0+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0

Safetensors

  • Model size: 416M params
  • Tensor type: F32