---
license: apache-2.0
tags:
- generated_from_trainer
base_model: nlpconnect/vit-gpt2-image-captioning
metrics:
- rouge
model-index:
- name: Vit-GPT2-COCO2017Flickr-40k-05
  results: []
---


# Vit-GPT2-COCO2017Flickr-40k-05

This model is a fine-tuned version of [nlpconnect/vit-gpt2-image-captioning](https://huggingface.co/nlpconnect/vit-gpt2-image-captioning) on an unknown dataset.
At the end of training (epoch 3.0, step 15000), it achieves the following results on the evaluation set:
- Loss: 0.5528
- Rouge1: 44.1624
- Rouge2: 19.6736
- RougeL: 40.3898
- RougeLsum: 40.4029
- Gen Len: 12.263
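
The ROUGE figures above measure n-gram overlap between generated and reference captions and appear to be reported on a 0-100 scale. As a rough illustration of what Rouge1 captures, the sketch below computes a simplified ROUGE-1 F-measure from clipped unigram overlap. It assumes whitespace tokenization and lowercasing with no stemming, so it is not the exact scorer used for this card; the function name `rouge1_f` and both captions are hypothetical.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure: clipped unigram overlap between two strings."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter & Counter keeps the minimum count per token (clipped overlap).
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical captions, for illustration only (not taken from the evaluation set).
reference = "a dog runs across the grass"
candidate = "a dog is running on the grass"
score = 100 * rouge1_f(reference, candidate)
print(f"ROUGE-1 F: {score:.2f}")
```

Because the sketch skips stemming, "runs" and "running" do not match here; a scorer with stemming enabled would likely rate the same pair higher.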

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
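
With `lr_scheduler_type: linear`, the learning rate decays linearly from its initial value to zero over the run. The sketch below reproduces that schedule under two assumptions that are not stated in this card: no warmup steps (none are listed above) and 15000 total optimizer steps (the last step shown in the training results). The function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int, base_lr: float = 5e-5, total_steps: int = 15000,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at each epoch boundary (assumed 5000 steps per epoch).
for step in (0, 5000, 10000, 15000):
    print(step, linear_lr(step))
```

Under these assumptions the rate falls by roughly a third of its initial value per epoch and reaches zero at the final step.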

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.1497        | 0.1   | 500   | 0.5462          | 40.1774 | 14.6199 | 36.3335 | 36.3518   | 12.5965 |
| 0.1604        | 0.2   | 1000  | 0.5302          | 41.4714 | 16.0237 | 37.5992 | 37.5915   | 11.914  |
| 0.1631        | 0.3   | 1500  | 0.5436          | 40.3816 | 14.6958 | 36.6109 | 36.6027   | 12.3295 |
| 0.1634        | 0.4   | 2000  | 0.5266          | 40.9484 | 15.9068 | 37.5194 | 37.5088   | 12.033  |
| 0.1576        | 0.5   | 2500  | 0.5544          | 40.373  | 15.012  | 36.5218 | 36.5141   | 12.3345 |
| 0.1599        | 0.6   | 3000  | 0.5425          | 40.7552 | 15.2754 | 37.1059 | 37.1299   | 12.191  |
| 0.291         | 0.7   | 3500  | 0.4545          | 41.5934 | 16.251  | 37.7291 | 37.7113   | 12.0295 |
| 0.2825        | 0.8   | 4000  | 0.4558          | 42.6728 | 17.1703 | 38.8692 | 38.8841   | 12.246  |
| 0.2737        | 0.9   | 4500  | 0.4565          | 43.0036 | 16.8421 | 39.1761 | 39.1693   | 11.7975 |
| 0.2683        | 1.0   | 5000  | 0.4576          | 42.1341 | 16.7973 | 38.2881 | 38.3083   | 11.8655 |
| 0.1687        | 1.1   | 5500  | 0.4996          | 41.7152 | 16.4042 | 37.7724 | 37.7629   | 12.384  |
| 0.168         | 1.2   | 6000  | 0.5046          | 41.6521 | 16.6159 | 37.7915 | 37.7778   | 12.661  |
| 0.1688        | 1.3   | 6500  | 0.5020          | 42.3292 | 17.1408 | 38.5407 | 38.5282   | 11.846  |
| 0.1682        | 1.4   | 7000  | 0.5045          | 42.848  | 17.6905 | 38.9854 | 38.9896   | 12.025  |
| 0.1703        | 1.5   | 7500  | 0.5103          | 42.1175 | 16.7765 | 38.3023 | 38.3199   | 12.4315 |
| 0.1618        | 1.6   | 8000  | 0.5019          | 43.207  | 17.8145 | 39.3822 | 39.3884   | 12.3485 |
| 0.1657        | 1.7   | 8500  | 0.4945          | 42.8399 | 17.8975 | 39.1618 | 39.1951   | 11.8575 |
| 0.1643        | 1.8   | 9000  | 0.5064          | 43.0186 | 17.8969 | 39.2518 | 39.2735   | 12.0095 |
| 0.1654        | 1.9   | 9500  | 0.5011          | 43.2785 | 18.2603 | 39.4479 | 39.4437   | 12.2305 |
| 0.158         | 2.0   | 10000 | 0.4945          | 43.3824 | 18.3183 | 39.3471 | 39.3334   | 12.1495 |
| 0.1096        | 2.1   | 10500 | 0.5520          | 43.5068 | 18.4313 | 39.7084 | 39.7205   | 12.112  |
| 0.1037        | 2.2   | 11000 | 0.5510          | 43.1909 | 18.1204 | 39.1945 | 39.2052   | 12.349  |
| 0.1045        | 2.3   | 11500 | 0.5453          | 42.9965 | 18.4064 | 39.0931 | 39.0868   | 12.1825 |
| 0.1027        | 2.4   | 12000 | 0.5473          | 43.4973 | 18.8697 | 39.944  | 39.9407   | 12.447  |
| 0.1034        | 2.5   | 12500 | 0.5512          | 43.9534 | 19.327  | 40.0946 | 40.0724   | 12.2395 |
| 0.1018        | 2.6   | 13000 | 0.5527          | 43.7136 | 19.1214 | 39.9218 | 39.9274   | 12.3245 |
| 0.0986        | 2.7   | 13500 | 0.5557          | 44.0502 | 19.3213 | 40.0291 | 40.0286   | 12.3345 |
| 0.0953        | 2.8   | 14000 | 0.5510          | 44.0001 | 19.4482 | 40.1204 | 40.1175   | 12.1255 |
| 0.098         | 2.9   | 14500 | 0.5534          | 43.9554 | 19.4673 | 40.1401 | 40.1521   | 12.2395 |
| 0.0947        | 3.0   | 15000 | 0.5528          | 44.1624 | 19.6736 | 40.3898 | 40.4029   | 12.263  |
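
Validation loss and ROUGE do not peak at the same checkpoint in the table above, so the choice of "best" checkpoint depends on the metric. The sketch below copies a subset of rows from the table and selects the best one by each criterion; the variable names are hypothetical, and only the listed rows are considered.

```python
# (step, validation_loss, rouge1) copied from a subset of the table rows above.
rows = [
    (1000, 0.5302, 41.4714),
    (3500, 0.4545, 41.5934),
    (5000, 0.4576, 42.1341),
    (10000, 0.4945, 43.3824),
    (13500, 0.5557, 44.0502),
    (15000, 0.5528, 44.1624),
]

# Lower is better for loss, higher is better for ROUGE.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

Among these rows, validation loss is lowest at step 3500 while Rouge1 is highest at step 15000, which is the checkpoint reported at the top of this card.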


### Framework versions

- Transformers 4.39.3
- PyTorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2