Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
Training Output
global_step=4248,
training_loss=2.4160910424988598,
metrics={'train_runtime': 14565.4519,
'train_samples_per_second': 4.082,
'train_steps_per_second': 0.292,
'total_flos': 1.7179021728232243e+17,
'train_loss': 2.4160910424988598,
'epoch': 3.0}
Training Results
Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Gen Len |
---|---|---|---|---|---|---|---|---|
1 | 2.467100 | 2.303269 | 0.410900 | 0.136200 | 0.235900 | 0.235900 | 0.465700 | 182.332800 |
2 | 2.386700 | 2.281062 | 0.426300 | 0.142300 | 0.246800 | 0.246700 | 0.525200 | 143.990900 |
3 | 2.362000 | 2.274931 | 0.428400 | 0.143800 | 0.248300 | 0.248200 | 0.532000 | 139.585900 |
- Downloads last month
- 10
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.