Supervised ViT-S/16 (small-sized Vision Transformer with patch size 16) model

Official ViT-S model trained on ImageNet-1k for 100 epochs, reproduced for the ICCV 2023 SimPool paper.

SimPool is a simple attention-based pooling method applied at the end of the network, released in this repository. Disclaimer: This model card was written by the author of SimPool, i.e. Bill Psomas.
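To make "attention-based pooling at the end of the network" concrete, here is a minimal, generic sketch in plain Python. It is not the SimPool implementation from the repository: the function names (`attention_pool`, `matvec`, `dot`, `identity`) and the weight matrices `w_q` and `w_k` are hypothetical, and the details (mean-pooled query, scaled dot-product scores, unprojected values) are illustrative assumptions. The 196 patch tokens in the demo assume a 224x224 input with patch size 16, which this card does not state, and the token dimension is shrunk to 8 for readability (ViT-S uses 384).

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [dot(row, v) for row in w]


def identity(d):
    """d x d identity matrix, used here as a stand-in for learned weights."""
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]


def attention_pool(tokens, w_q, w_k):
    """Pool patch tokens into one vector with a single attention step.

    Generic sketch of attention-based pooling, NOT the official SimPool code.
    tokens: list of n patch vectors, each of length d.
    w_q, w_k: hypothetical d x d projection matrices for query and keys.
    """
    n, d = len(tokens), len(tokens[0])
    # Query: the average of all patch tokens (global average pooling).
    mean = [sum(t[j] for t in tokens) / n for j in range(d)]
    q = matvec(w_q, mean)
    keys = [matvec(w_k, t) for t in tokens]
    # Scaled dot-product scores, then a numerically stable softmax over patches.
    scores = [dot(q, k) / math.sqrt(d) for k in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    attn = [e / z for e in exps]
    # Pooled vector: attention-weighted average of the (unprojected) tokens.
    pooled = [sum(a * t[j] for a, t in zip(attn, tokens)) for j in range(d)]
    return pooled, attn


# Demo with random stand-in features: 196 patches (assumed 14 x 14 grid), d = 8.
random.seed(0)
n_patches, dim = 196, 8
demo_tokens = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_patches)]
demo_pooled, demo_attn = attention_pool(demo_tokens, identity(dim), identity(dim))
```

The returned `attn` weights sum to one over the patches, so they can be reshaped to the patch grid and viewed as an attention map, while `pooled` plays the role of the image-level representation passed to the classifier.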

BibTeX entry and citation info

@misc{psomas2023simpool,
      title={Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?}, 
      author={Bill Psomas and Ioannis Kakogeorgiou and Konstantinos Karantzalos and Yannis Avrithis},
      year={2023},
      eprint={2309.06891},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
