metadata
license: bsd-3-clause
datasets:
- ILSVRC/imagenet-1k
pipeline_tag: image-classification
Model Card
Swin Transformer model pre-trained on ImageNet-1k with Rotary Position Embedding (RoPE)
Rotary Position Embedding for Vision Transformer [ECCV 2024]
- Repository: https://github.com/naver-ai/rope-vit
- Paper: https://arxiv.org/abs/2403.13298
Citation
@inproceedings{heo2024ropevit,
  title={Rotary Position Embedding for Vision Transformer},
  author={Heo, Byeongho and Park, Song and Han, Dongyoon and Yun, Sangdoo},
  year={2024},
  booktitle={European Conference on Computer Vision (ECCV)},
}