	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: exper7_mesum5
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# exper7_mesum5

	This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) on a dataset that was not recorded when this card was generated.
	It achieves the following results on the evaluation set at the final training step (step 4300, epoch 10):
	- Loss: 0.5954
	- Accuracy: 0.8538

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 16
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
	- mixed_precision_training: Native AMP
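
The hyperparameters above, together with the 4300 total steps shown in the training results table, imply a simple learning-rate curve and a rough training-set size. The sketch below assumes zero warmup steps (the card lists none), a single device, and no gradient accumulation; the card confirms none of these. The function names are illustrative and do not come from any library.

```python
BASE_LR = 1e-4          # learning_rate from the card
TOTAL_STEPS = 4300      # last step in the training results table
NUM_EPOCHS = 10         # num_epochs from the card
TRAIN_BATCH_SIZE = 16   # train_batch_size from the card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (assumed to be 0 steps) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def steps_per_epoch(total_steps=TOTAL_STEPS, num_epochs=NUM_EPOCHS):
    """Optimizer steps in one epoch, derived from the logged totals."""
    return total_steps // num_epochs


def max_train_examples(batch_size=TRAIN_BATCH_SIZE):
    """Upper bound on the training-set size (assumes one device, no accumulation)."""
    return steps_per_epoch() * batch_size


if __name__ == "__main__":
    for step in (0, 1000, 2150, 4300):
        print(step, linear_lr(step))
    print("steps per epoch:", steps_per_epoch())
    print("at most", max_train_examples(), "training examples")
```

Under these assumptions one epoch is 430 steps, which agrees with the logged epoch values (step 100 falls at epoch 100/430 ≈ 0.23), and the training set holds at most 430 × 16 = 6880 examples.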

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|
	| 4.2072 | 0.23 | 100 | 4.1532 | 0.1923 |
	| 3.5433 | 0.47 | 200 | 3.5680 | 0.2888 |
	| 3.1388 | 0.7 | 300 | 3.1202 | 0.3911 |
	| 2.7924 | 0.93 | 400 | 2.7434 | 0.4787 |
	| 2.1269 | 1.16 | 500 | 2.3262 | 0.5781 |
	| 1.8589 | 1.4 | 600 | 1.9754 | 0.6272 |
	| 1.7155 | 1.63 | 700 | 1.7627 | 0.6840 |
	| 1.4689 | 1.86 | 800 | 1.5937 | 0.6994 |
	| 1.0149 | 2.09 | 900 | 1.3168 | 0.7497 |
	| 0.8148 | 2.33 | 1000 | 1.1630 | 0.7615 |
	| 0.7159 | 2.56 | 1100 | 1.0869 | 0.7675 |
	| 0.7257 | 2.79 | 1200 | 0.9607 | 0.7893 |
	| 0.4171 | 3.02 | 1300 | 0.8835 | 0.7935 |
	| 0.2969 | 3.26 | 1400 | 0.8259 | 0.8130 |
	| 0.2405 | 3.49 | 1500 | 0.7711 | 0.8142 |
	| 0.2948 | 3.72 | 1600 | 0.7629 | 0.8112 |
	| 0.1765 | 3.95 | 1700 | 0.7117 | 0.8124 |
	| 0.1603 | 4.19 | 1800 | 0.6946 | 0.8237 |
	| 0.0955 | 4.42 | 1900 | 0.6597 | 0.8349 |
	| 0.0769 | 4.65 | 2000 | 0.6531 | 0.8266 |
	| 0.0816 | 4.88 | 2100 | 0.6335 | 0.8337 |
	| 0.0315 | 5.12 | 2200 | 0.6087 | 0.8402 |
	| 0.0368 | 5.35 | 2300 | 0.6026 | 0.8444 |
	| 0.0377 | 5.58 | 2400 | 0.6450 | 0.8278 |
	| 0.0603 | 5.81 | 2500 | 0.6564 | 0.8343 |
	| 0.0205 | 6.05 | 2600 | 0.6119 | 0.8467 |
	| 0.019 | 6.28 | 2700 | 0.6070 | 0.8479 |
	| 0.0249 | 6.51 | 2800 | 0.6002 | 0.8538 |
	| 0.0145 | 6.74 | 2900 | 0.6012 | 0.8497 |
	| 0.0134 | 6.98 | 3000 | 0.5991 | 0.8521 |
	| 0.0271 | 7.21 | 3100 | 0.5972 | 0.8503 |
	| 0.0128 | 7.44 | 3200 | 0.5911 | 0.8521 |
	| 0.0123 | 7.67 | 3300 | 0.5889 | 0.8538 |
	| 0.0278 | 7.91 | 3400 | 0.6135 | 0.8491 |
	| 0.0106 | 8.14 | 3500 | 0.5934 | 0.8533 |
	| 0.0109 | 8.37 | 3600 | 0.5929 | 0.8533 |
	| 0.0095 | 8.6 | 3700 | 0.5953 | 0.8550 |
	| 0.009 | 8.84 | 3800 | 0.5933 | 0.8574 |
	| 0.009 | 9.07 | 3900 | 0.5948 | 0.8550 |
	| 0.0089 | 9.3 | 4000 | 0.5953 | 0.8556 |
	| 0.0086 | 9.53 | 4100 | 0.5956 | 0.8544 |
	| 0.0085 | 9.77 | 4200 | 0.5955 | 0.8556 |
	| 0.0087 | 10.0 | 4300 | 0.5954 | 0.8538 |
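
The summary at the top of the card reports the last row (step 4300), which is not the best row on either metric. As a minimal sketch of how a checkpoint could be picked from a log like this, the snippet below uses a hand-copied subset of rows from the table. The tuple layout and helper names are illustrative, and the card does not say which checkpoints were actually saved.

```python
# (step, training_loss, validation_loss, accuracy), copied from the table above
ROWS = [
    (2800, 0.0249, 0.6002, 0.8538),
    (3200, 0.0128, 0.5911, 0.8521),
    (3300, 0.0123, 0.5889, 0.8538),
    (3400, 0.0278, 0.6135, 0.8491),
    (3700, 0.0095, 0.5953, 0.8550),
    (3800, 0.0090, 0.5933, 0.8574),
    (4300, 0.0087, 0.5954, 0.8538),
]


def best_by_val_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


def best_by_accuracy(rows):
    """Row with the highest evaluation accuracy."""
    return max(rows, key=lambda r: r[3])


def generalization_gap(row):
    """Validation loss minus training loss for one logged row."""
    return row[2] - row[1]


if __name__ == "__main__":
    print("lowest validation loss:", best_by_val_loss(ROWS))
    print("highest accuracy:", best_by_accuracy(ROWS))
    print("gap at final step:", round(generalization_gap(ROWS[-1]), 4))
```

Over the full table the same two rows win: step 3300 has the lowest validation loss (0.5889) and step 3800 the highest accuracy (0.8574). At the final step the training loss (0.0087) is far below the validation loss (0.5954), which is consistent with the model fitting the training set much more closely than the evaluation set.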


	### Framework versions

	- Transformers 4.20.1
	- PyTorch 1.12.0+cu113
	- Datasets 2.3.2
	- Tokenizers 0.12.1
