Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
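The names above match keyword arguments of `Seq2SeqTrainingArguments` in the `transformers` library (`predict_with_generate` is specific to that class), so the configuration was presumably passed along these lines. This is a minimal sketch, not the author's actual script: `output_dir` and the helper name `build_training_args` are hypothetical, and any argument not listed above is left at the library default.

```python
# Hyperparameters copied verbatim from the list above.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,
    "num_train_epochs": 3,
    "predict_with_generate": True,
    "fp16": True,
}


def build_training_args(output_dir="./results"):
    """Build Seq2SeqTrainingArguments from TRAINING_KWARGS.

    output_dir is a hypothetical path; the model card does not state one.
    fp16=True generally requires a CUDA-capable GPU.
    """
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Calling `build_training_args()` on a GPU machine should return an object that can be handed to `Seq2SeqTrainer` together with a model, tokenizer, and datasets, none of which are specified in this card.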
Training Output
global_step=3003,
training_loss=2.0113779983241042,
metrics={'train_runtime': 12268.4376,
'train_samples_per_second': 3.427,
'train_steps_per_second': 0.245,
'total_flos': 1.2147019450889011e+17,
'train_loss': 2.0113779983241042,
'epoch': 3.0}
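The reported numbers can be cross-checked with a little arithmetic. The sketch below derives steps per epoch, throughput, and wall-clock time from the output above. The estimate of the training-set size assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# Values copied from the training output and hyperparameters above.
global_step = 3003
num_train_epochs = 3
train_runtime_s = 12268.4376
train_samples_per_second = 3.427
per_device_train_batch_size = 14

# Optimizer steps in each epoch.
steps_per_epoch = global_step // num_train_epochs

# Should agree with the reported train_steps_per_second after rounding.
steps_per_second = global_step / train_runtime_s

# Wall-clock training time in hours.
runtime_hours = train_runtime_s / 3600

# Upper bound on training examples per epoch.
# ASSUMPTION: one device, no gradient accumulation (not stated in the card).
max_samples_per_epoch = steps_per_epoch * per_device_train_batch_size

# Independent estimate from the reported throughput.
est_samples_per_epoch = train_samples_per_second * train_runtime_s / num_train_epochs

print(f"steps per epoch:       {steps_per_epoch}")
print(f"steps per second:      {steps_per_second:.3f}")
print(f"runtime (hours):       {runtime_hours:.2f}")
print(f"max samples per epoch: {max_samples_per_epoch}")
print(f"estimated samples:     {est_samples_per_epoch:.0f}")
```

Under those assumptions the two size estimates agree closely, which suggests a training set of roughly 14,000 examples and a run of about 3.4 hours.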
Training Results
| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1 | 2.035800 | 1.906599 | 0.365400 | 0.150500 | 0.243200 | 0.243500 | 0.366300 | 227.230300 |
| 2 | 1.976100 | 1.878923 | 0.393700 | 0.167800 | 0.263500 | 0.263800 | 0.423600 | 193.114200 |
| 3 | 1.956800 | 1.871454 | 0.409300 | 0.175100 | 0.273400 | 0.273600 | 0.457000 | 172.294500 |
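To make the trend across epochs easier to read, the following sketch computes the change in each column between the first and last epoch and picks the epoch with the lowest validation loss. The values are copied from the table. The variable names are illustrative, and selecting by validation loss is only one possible criterion, not necessarily the one used for this model.

```python
# Column names and rows copied from the results table above.
columns = ["Training Loss", "Validation Loss", "Rouge1", "Rouge2",
           "Rougel", "Rougelsum", "Bleu", "Gen Len"]
rows = {
    1: [2.0358, 1.906599, 0.3654, 0.1505, 0.2432, 0.2435, 0.3663, 227.2303],
    2: [1.9761, 1.878923, 0.3937, 0.1678, 0.2635, 0.2638, 0.4236, 193.1142],
    3: [1.9568, 1.871454, 0.4093, 0.1751, 0.2734, 0.2736, 0.4570, 172.2945],
}

first_epoch, last_epoch = min(rows), max(rows)

# Absolute change per column from the first to the last epoch.
changes = {
    name: rows[last_epoch][i] - rows[first_epoch][i]
    for i, name in enumerate(columns)
}

# Epoch with the lowest validation loss (one possible selection criterion).
val_idx = columns.index("Validation Loss")
best_epoch = min(rows, key=lambda epoch: rows[epoch][val_idx])

for name, delta in changes.items():
    print(f"{name:16s} {delta:+.4f}")
print(f"lowest validation loss at epoch {best_epoch}")
```

Both losses fall and every overlap metric rises in each epoch, while the average generated length shrinks by about 55 tokens between epoch 1 and epoch 3.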