Hyperparameters:
- learning rate: 2e-5
- weight decay: 0.01
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 1
- eval steps: 5000
- max_length: 128
- num_epochs: 3
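
For reproduction, the settings above could be collected roughly as follows. This is a minimal sketch, not the original training script: the key names assume the Hugging Face `TrainingArguments` convention (for example, `num_epochs` is written as `num_train_epochs`), and `truncation=True` is an assumption about how `max_length` was applied.

```python
# Hyperparameters copied from the list above. The key names follow the
# Hugging Face `TrainingArguments` convention, which is an assumption.
training_kwargs = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "gradient_accumulation_steps": 1,
    "eval_steps": 5000,
    "num_train_epochs": 3,
}

# max_length is a tokenizer setting rather than a trainer setting.
# truncation=True is assumed; the list above does not state it.
tokenizer_kwargs = {"max_length": 128, "truncation": True}

# Examples consumed per optimizer update on a single device.
samples_per_update_per_device = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(samples_per_update_per_device)
```

The number of devices is not stated here, so the global batch size cannot be derived from these values alone.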
Dataset version:
- “craffel/tasky_or_not”, “10xp3_10xc4”, “15f88c8”
Checkpoint:
- 10000 steps
Results on Validation set:
Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|---|
5000 | 0.036400 | 0.266518 | 0.926913 | 0.999662 | 0.916934 | 0.956513 |
10000 | 0.022500 | 0.222881 | 0.952443 | 0.999494 | 0.946227 | 0.972132 |
15000 | 0.016600 | 0.634102 | 0.882638 | 0.999789 | 0.866301 | 0.928270 |
20000 | 0.011300 | 1.138026 | 0.849013 | 0.999796 | 0.827928 | 0.905781 |
25000 | 0.010300 | 0.623522 | 0.895619 | 0.999728 | 0.881166 | 0.936710 |
30000 | 0.006300 | 0.776632 | 0.879492 | 0.999804 | 0.862697 | 0.926204 |
35000 | 0.000500 | 0.704599 | 0.899149 | 0.999698 | 0.885220 | 0.938982 |
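
The F1 column can be sanity-checked against the precision and recall columns, and the table can be scanned for the strongest evaluation step. The sketch below copies the values from the table; selecting by F1 is an assumption made for illustration, since the selection criterion for the released checkpoint is not stated.

```python
# (step, validation_loss, precision, recall, reported_f1), copied from the table.
rows = [
    (5000, 0.266518, 0.999662, 0.916934, 0.956513),
    (10000, 0.222881, 0.999494, 0.946227, 0.972132),
    (15000, 0.634102, 0.999789, 0.866301, 0.928270),
    (20000, 1.138026, 0.999796, 0.827928, 0.905781),
    (25000, 0.623522, 0.999728, 0.881166, 0.936710),
    (30000, 0.776632, 0.999804, 0.862697, 0.926204),
    (35000, 0.704599, 0.999698, 0.885220, 0.938982),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Largest gap between the recomputed and the reported F1 values.
max_f1_gap = max(abs(f1_score(p, r) - f1) for _, _, p, r, f1 in rows)

# Step with the highest reported F1, and step with the lowest validation loss.
best_step_by_f1 = max(rows, key=lambda row: row[4])[0]
best_step_by_loss = min(rows, key=lambda row: row[1])[0]

print(max_f1_gap, best_step_by_f1, best_step_by_loss)
```

Both criteria point to step 10000, which matches the checkpoint listed above. Validation loss rises after that step while training loss keeps falling, and recall drops while precision stays near 1.0.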