---
library_name: transformers
tags:
- vit
- cifar10
- image classification
license: apache-2.0
datasets:
- uoft-cs/cifar10
language:
- en
metrics:
- accuracy
- perplexity
pipeline_tag: image-classification
widget:
- src: ./deer_224x224.png
---
## Model Details
### Model Description
A LoRA adapter for the [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224) ViT, trained on the CIFAR-10 image classification task.
## Loading guide
```py
from transformers import AutoModelForImageClassification

# CIFAR-10 class names, indexed by label id
labels2title = ['plane', 'car', 'bird', 'cat',
                'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

# Base ViT with a 10-class classification head
model = AutoModelForImageClassification.from_pretrained(
    'google/vit-base-patch16-224-in21k',
    num_labels=len(labels2title),
    id2label={i: c for i, c in enumerate(labels2title)},
    label2id={c: i for i, c in enumerate(labels2title)}
)
# Attach the adapter weights (requires the `peft` package)
model.load_adapter("yturkunov/cifar10_vit16_lora")
```
## Learning curves
![image/png](https://cdn-uploads.huggingface.co/production/uploads/655221be7bd4634260e032ca/Ji1ewA_8T1rJuQkdNCIXQ.png)
### Input recommendations
The model expects an image that has gone through the following preprocessing stages:
* Scaling range:
<img src="https://latex.codecogs.com/gif.latex?[0, 255]\rightarrow[0, 1]" />
* Normalization parameters:
<img src="https://latex.codecogs.com/gif.latex?\mu=(.5,.5,.5),\sigma=(.5,.5,.5)" />
* Dimensions: 224x224
* Number of channels: 3
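
The preprocessing stages listed above could be sketched roughly as follows. This is a minimal NumPy illustration, not the exact pipeline used during training: the `preprocess` helper is hypothetical, the nearest-neighbour resize is an assumption (the interpolation method is not stated here), and the channels-first output layout is chosen to match what the PyTorch ViT takes as `pixel_values`.

```python
import numpy as np

# Normalization parameters from the list above
MEAN = np.array([0.5, 0.5, 0.5], dtype=np.float32)
STD = np.array([0.5, 0.5, 0.5], dtype=np.float32)
SIZE = 224  # target height and width


def preprocess(image: np.ndarray) -> np.ndarray:
    """Hypothetical helper: uint8 RGB array (H, W, 3) -> float32 (1, 3, 224, 224)."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an RGB image of shape (H, W, 3)")
    h, w, _ = image.shape

    # Resize to 224x224 with nearest-neighbour sampling
    # (assumption: the interpolation method is not specified).
    rows = np.arange(SIZE) * h // SIZE
    cols = np.arange(SIZE) * w // SIZE
    resized = image[rows][:, cols]

    # Scale [0, 255] -> [0, 1]
    scaled = resized.astype(np.float32) / 255.0

    # Normalize each channel with mean 0.5 and std 0.5, giving values in [-1, 1]
    normalized = (scaled - MEAN) / STD

    # Channels first, plus a leading batch dimension
    return normalized.transpose(2, 0, 1)[None]


# Example with a random CIFAR-10-sized image (32x32 RGB)
dummy = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
pixel_values = preprocess(dummy)
```

To run the model on the result, the array can be converted with `torch.from_numpy(pixel_values)` and passed as `model(pixel_values=...)`. In practice a library resize with smoother interpolation would likely replace the nearest-neighbour step.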
### Inference on 3x4 random sample
![image/png](https://cdn-uploads.huggingface.co/production/uploads/655221be7bd4634260e032ca/zxj9ID37gJJnkmc8Sl97A.png)