Snarci committed
Commit f0d65e5
1 Parent(s): 5e22626

Update README.md

Files changed (1)
  1. README.md +24 -0
README.md CHANGED
@@ -18,6 +18,30 @@ Vision Transformer (ViT) model pre-trained on ImageNet-21k (14 million images, 2
 
  Finally, the ViT was fine-tuned on the Chaoyang dataset at a resolution of 384x384, using a fixed 10% of the training set as the validation set, and evaluated on the official test set using the model checkpoint with the lowest validation loss.
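
A minimal sketch of how a fixed 10% validation split could be drawn so that it stays identical across runs. The function name `fixed_split`, the seed value, and the use of a plain (non-stratified) shuffle are assumptions made for illustration; the README does not say how the split was actually produced.

```python
import random


def fixed_split(n_samples, val_fraction=0.1, seed=42):
    """Return (train_indices, val_indices) for a reproducible split.

    The seed is an illustrative assumption: any fixed value keeps the
    validation set identical from one run to the next.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_val = int(round(n_samples * val_fraction))
    return sorted(indices[n_val:]), sorted(indices[:n_val])


# Toy usage with 100 hypothetical training images.
train_idx, val_idx = fixed_split(100)
print(len(train_idx), len(val_idx))
```

Because the shuffle uses its own seeded generator, calling `fixed_split` again with the same arguments returns the same indices.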
 
+ # Augmentation pipeline
+ To address class imbalance in the training set, we oversampled with repetition.
+ Specifically, we duplicated minority-class images until all classes were evenly represented.
+ This produced a larger training set but ensured that the model saw an equal number of samples from each class during training.
+ We checked that this approach did not lead to overfitting or other training issues by evaluating on a validation set that kept the original class distribution.
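
The oversampling step described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for the model: the function name `oversample_with_repetition`, the seed, and the choice to draw duplicates at random (rather than cycling through them in order) are all hypothetical. It should be applied to the training portion only, so that the validation set keeps its original class distribution.

```python
import random
from collections import Counter, defaultdict


def oversample_with_repetition(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is grown to the size of the most frequent class.
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Assumption: duplicates are picked uniformly at random, with repetition.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Toy usage: 10 hypothetical sample ids spread over three imbalanced classes.
samples = list(range(10))
labels = [0] * 6 + [1] * 3 + [2] * 1
balanced_samples, balanced_labels = oversample_with_repetition(samples, labels)
print(Counter(balanced_labels))
```

Every original sample is kept exactly once, and only the added copies are repeats.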
+ We used the following augmentation pipeline for our experiments:
+
+ A.Resize(img_size, img_size),
+ A.HorizontalFlip(p=0.5),
+ A.VerticalFlip(p=0.5),
+ A.RandomRotate90(p=0.5),
+ A.RandomResizedCrop(img_size, img_size, scale=(0.5, 1.0), p=0.5),
+ ToTensorV2(p=1.0)
+
+ This pipeline consists of the following transformations:
+
+ - Resize: resizes the image to a fixed size of (img_size, img_size).
+ - HorizontalFlip: flips the image horizontally with a probability of 0.5.
+ - VerticalFlip: flips the image vertically with a probability of 0.5.
+ - RandomRotate90: rotates the image by a random multiple of 90 degrees with a probability of 0.5.
+ - RandomResizedCrop: with a probability of 0.5, crops a random region covering 50% to 100% of the image area and resizes it to (img_size, img_size).
+ - ToTensorV2: converts the image to a PyTorch tensor.
+
+ These transformations were chosen to augment the dataset with a variety of geometric transformations while preserving important visual features.
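
To make the "applied with a probability of 0.5" semantics concrete, here is a pure-Python sketch of the flip and rotation steps on a toy image stored as nested lists. It is not Albumentations' implementation: the helper names (`hflip`, `vflip`, `rot90`, `augment`) are hypothetical, the assumption that the rotation draws 0 to 3 quarter-turns is labeled in the code, and Resize, RandomResizedCrop, and ToTensorV2 are left out because they need real image data.

```python
import random


def hflip(img):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in img]


def vflip(img):
    """Reverse the row order (top-bottom flip)."""
    return [row[:] for row in img[::-1]]


def rot90(img, k=1):
    """Rotate counter-clockwise by k quarter-turns."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order.
        img = [list(col) for col in zip(*img)][::-1]
    return [list(row) for row in img]


def augment(img, rng, p=0.5):
    """Apply each geometric transform independently with probability p."""
    if rng.random() < p:
        img = hflip(img)
    if rng.random() < p:
        img = vflip(img)
    if rng.random() < p:
        # Assumption: the number of quarter-turns is drawn uniformly from 0-3.
        img = rot90(img, rng.randint(0, 3))
    return img


image = [[1, 2], [3, 4]]  # toy 2x2 "image"
augmented = augment(image, random.Random(0))
print(augmented)
```

Each transform only rearranges pixels, so the set of pixel values is unchanged, which is one sense in which these geometric augmentations preserve visual content.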
  # Results
 
  Our model represents the current state-of-the-art in the field, as it outperforms previous state-of-the-art models proposed in papers with code,