	---
	tags:
	- generated_from_trainer
	datasets:
	- imagefolder
	metrics:
	- accuracy
	- f1
	- recall
	- precision
	model-index:
	- name: dit-base-Business_Documents_Classified_v2
	  results:
	  - task:
	      name: Image Classification
	      type: image-classification
	    dataset:
	      name: imagefolder
	      type: imagefolder
	      config: data
	      split: train
	      args: data
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.826
	language:
	- en
	pipeline_tag: image-classification
	---

	# dit-base-Business_Documents_Classified_v2

	This model is a fine-tuned version of [microsoft/dit-base](https://huggingface.co/microsoft/dit-base) on the imagefolder dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.6715
	- Accuracy: 0.826
	- Weighted f1: 0.8272
	- Micro f1: 0.826
	- Macro f1: 0.8242
	- Weighted recall: 0.826
	- Micro recall: 0.826
	- Macro recall: 0.8237
	- Weighted precision: 0.8327
	- Micro precision: 0.826
	- Macro precision: 0.8293
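Several of the figures above are identical (accuracy, micro F1, micro recall, micro precision, and weighted recall are all 0.826). That is expected for single-label multiclass classification, where every misclassified sample counts as exactly one false positive and one false negative. The sketch below illustrates this with a small made-up confusion matrix; the function name `averaged_metrics` and the matrix values are illustrative and are not taken from this model's evaluation.

```python
def averaged_metrics(cm):
    """Compute accuracy plus micro/macro/weighted precision and recall.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (single-label multiclass setting).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = [cm[k][k] for k in range(n)]
    support = [sum(cm[k]) for k in range(n)]  # true samples per class
    predicted = [sum(cm[i][k] for i in range(n)) for k in range(n)]

    precision = [tp[k] / predicted[k] if predicted[k] else 0.0 for k in range(n)]
    recall = [tp[k] / support[k] if support[k] else 0.0 for k in range(n)]

    return {
        "accuracy": sum(tp) / total,
        # Micro averaging pools counts over classes before dividing.
        "micro_precision": sum(tp) / sum(predicted),
        "micro_recall": sum(tp) / sum(support),
        # Macro averaging gives every class equal weight.
        "macro_precision": sum(precision) / n,
        "macro_recall": sum(recall) / n,
        # Weighted averaging weights each class by its support.
        "weighted_precision": sum(p * s for p, s in zip(precision, support)) / total,
        "weighted_recall": sum(r * s for r, s in zip(recall, support)) / total,
    }


# Toy 3-class confusion matrix (made-up numbers, for illustration only).
toy = [[8, 1, 1],
       [2, 6, 2],
       [0, 1, 9]]
metrics = averaged_metrics(toy)
```

In `metrics`, the micro-averaged values and the weighted recall match the accuracy, while the macro and weighted-precision values can differ, which mirrors the pattern in the list above.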

	## Model description

	This model classifies document images into 16 document types.

	For more information on how it was created, check out the following link: https://github.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/blob/main/Document%20AI/Multiclass%20Classification/Real%20World%20Documents%20Collections/Real%20World%20Documents%20Collections_v2.ipynb
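For readers who want to try the model, here is a minimal inference sketch; it is not the author's code. It assumes the checkpoint is published on the Hugging Face Hub and loads with the standard `image-classification` pipeline (the `pipeline_tag` declared in this card's metadata). The Hub namespace is not stated in this card, so `model_id` is left as a parameter, and `classify_document` and `top_prediction` are hypothetical helper names.

```python
def top_prediction(predictions):
    """Return the highest-scoring entry from pipeline output.

    The image-classification pipeline returns a list of
    {"label": str, "score": float} dictionaries.
    """
    return max(predictions, key=lambda p: p["score"])


def classify_document(image_path, model_id):
    """Classify one document image (hypothetical helper).

    model_id is the Hub repository id of this checkpoint; this card
    does not state it, so the caller has to supply it.
    """
    # Imported lazily so top_prediction works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_prediction(classifier(image_path))
```

Calling `classify_document("scan.png", model_id=...)` with a local image path and the correct repository id would return the single most likely label and its score.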

	## Intended uses & limitations

	This model is intended to demonstrate my ability to solve a complex problem using technology.

	## Training and evaluation data

	Dataset Source: https://www.kaggle.com/datasets/shaz13/real-world-documents-collections

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 18
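To make the relationship between these settings concrete, the sketch below recomputes the effective batch size and the learning rate at a given optimizer step under a linear schedule with warmup. It assumes a single device, a total of 558 optimizer steps (the final step in the training results table), and that the warmup step count is rounded up; `lr_at` is an illustrative helper, not part of any library.

```python
import math

LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.1
TOTAL_STEPS = 558  # assumption: final step listed in the training results

# 32 samples per forward pass x 4 accumulated passes = 128 per optimizer step
# (assumes a single device).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: the warmup length is the ratio times the total steps, rounded up.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - warmup_steps)
```

Under these assumptions the learning rate climbs from 0 to its peak of 5e-05 over the first tenth of training and then falls linearly back to 0 at the last step.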

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | Weighted f1 | Micro f1 | Macro f1 | Weighted recall | Micro recall | Macro recall | Weighted precision | Micro precision | Macro precision |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:-----------:|:--------:|:--------:|:---------------:|:------------:|:------------:|:------------------:|:---------------:|:---------------:|
	| 2.7266 | 0.99 | 31 | 2.4738 | 0.208 | 0.1811 | 0.208 | 0.1827 | 0.208 | 0.208 | 0.2101 | 0.2143 | 0.208 | 0.2246 |
	| 2.171 | 1.98 | 62 | 1.8510 | 0.423 | 0.3936 | 0.4230 | 0.3925 | 0.423 | 0.423 | 0.4243 | 0.4503 | 0.423 | 0.4446 |
	| 1.6525 | 2.98 | 93 | 1.2633 | 0.61 | 0.5884 | 0.61 | 0.5855 | 0.61 | 0.61 | 0.6124 | 0.6377 | 0.61 | 0.6283 |
	| 1.346 | 4.0 | 125 | 1.0259 | 0.706 | 0.7023 | 0.706 | 0.6992 | 0.706 | 0.706 | 0.7058 | 0.7095 | 0.706 | 0.7034 |
	| 1.253 | 4.99 | 156 | 0.9180 | 0.729 | 0.7277 | 0.729 | 0.7239 | 0.729 | 0.729 | 0.7291 | 0.7340 | 0.729 | 0.7261 |
	| 1.0975 | 5.98 | 187 | 0.8859 | 0.747 | 0.7480 | 0.747 | 0.7437 | 0.747 | 0.747 | 0.7472 | 0.7609 | 0.747 | 0.7526 |
	| 1.1122 | 6.98 | 218 | 0.8270 | 0.76 | 0.7606 | 0.76 | 0.7578 | 0.76 | 0.76 | 0.7594 | 0.7772 | 0.76 | 0.7727 |
	| 1.0365 | 8.0 | 250 | 0.7806 | 0.775 | 0.7759 | 0.775 | 0.7730 | 0.775 | 0.775 | 0.7735 | 0.7957 | 0.775 | 0.7920 |
	| 1.004 | 8.99 | 281 | 0.7472 | 0.796 | 0.7977 | 0.796 | 0.7957 | 0.796 | 0.796 | 0.7956 | 0.8193 | 0.796 | 0.8151 |
	| 0.9278 | 9.98 | 312 | 0.7296 | 0.795 | 0.7974 | 0.795 | 0.7957 | 0.795 | 0.795 | 0.7953 | 0.8157 | 0.795 | 0.8115 |
	| 0.8767 | 10.98 | 343 | 0.7257 | 0.809 | 0.8101 | 0.809 | 0.8078 | 0.809 | 0.809 | 0.8091 | 0.8182 | 0.809 | 0.8136 |
	| 0.8656 | 12.0 | 375 | 0.6875 | 0.814 | 0.8137 | 0.8140 | 0.8106 | 0.814 | 0.814 | 0.8122 | 0.8207 | 0.814 | 0.8164 |
	| 0.7905 | 12.99 | 406 | 0.7060 | 0.808 | 0.8093 | 0.808 | 0.8071 | 0.808 | 0.808 | 0.8068 | 0.8182 | 0.808 | 0.8145 |
	| 0.8804 | 13.98 | 437 | 0.6849 | 0.82 | 0.8214 | 0.82 | 0.8183 | 0.82 | 0.82 | 0.8183 | 0.8260 | 0.82 | 0.8215 |
	| 0.8265 | 14.98 | 468 | 0.6821 | 0.816 | 0.8171 | 0.816 | 0.8143 | 0.816 | 0.816 | 0.8142 | 0.8242 | 0.816 | 0.8206 |
	| 0.7929 | 16.0 | 500 | 0.6877 | 0.818 | 0.8184 | 0.818 | 0.8152 | 0.818 | 0.818 | 0.8167 | 0.8240 | 0.818 | 0.8186 |
	| 0.7993 | 16.99 | 531 | 0.6718 | 0.825 | 0.8259 | 0.825 | 0.8234 | 0.825 | 0.825 | 0.8227 | 0.8306 | 0.825 | 0.8282 |
	| 0.7954 | 17.86 | 558 | 0.6715 | 0.826 | 0.8272 | 0.826 | 0.8242 | 0.826 | 0.826 | 0.8237 | 0.8327 | 0.826 | 0.8293 |
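The fractional values in the Epoch column (0.99, 1.98, 17.86) follow from gradient accumulation. The sketch below reproduces them from the Step column. It assumes 125 mini-batches of 32 per epoch; that figure is inferred from the table (step 125 lands exactly on epoch 4.0 with 4 accumulation steps) and is not stated anywhere in this card. `epoch_at` is an illustrative helper.

```python
GRAD_ACCUM_STEPS = 4
NUM_EPOCHS = 18
# Assumption inferred from the table, not stated in the card:
# 125 optimizer steps x 4 mini-batches = 500 mini-batches over 4.0 epochs.
MINI_BATCHES_PER_EPOCH = 125


def epoch_at(step):
    """Fraction of epochs completed after a given number of optimizer steps."""
    mini_batches_seen = step * GRAD_ACCUM_STEPS
    return round(mini_batches_seen / MINI_BATCHES_PER_EPOCH, 2)


# Whole optimizer steps that fit in one epoch, times the epoch count.
# This matches the final step in the table (31 * 18 = 558).
total_optimizer_steps = (MINI_BATCHES_PER_EPOCH // GRAD_ACCUM_STEPS) * NUM_EPOCHS

# A few (step, epoch) pairs recomputed from the Step column.
checkpoints = [(step, epoch_at(step)) for step in (31, 62, 93, 125, 558)]
```

Because 125 is not divisible by 4, only 31 full optimizer steps fit in each epoch, which is consistent with training stopping at step 558 (epoch 17.86) rather than at a whole epoch.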

	### Framework versions

	- Transformers 4.28.1
	- PyTorch 2.0.0+cu118
	- Datasets 2.11.0
	- Tokenizers 0.13.3