Strong meta vision-transformer architecture pretrain support
#13, opened by JiangYi
MetaFormer
We are very happy to meet you here and to tell you about our work, MetaFormer. We use a unified framework that incorporates meta-information for fine-grained recognition tasks. We have achieved SOTA performance on the iNat18/19/21 and CUB datasets.
### Fine-grained Datasets
Results on fine-grained datasets with different pre-trained models.
Name | Pretrain | CUB | NABirds | iNat2017 | iNat2018 | Cars | Aircraft |
---|---|---|---|---|---|---|---|
MetaFormer-0 | ImageNet-1k | 89.6 | 89.1 | 75.7 | 79.5 | 95.0 | 91.2 |
MetaFormer-0 | ImageNet-21k | 89.7 | 89.5 | 75.8 | 79.9 | 94.6 | 91.2 |
MetaFormer-0 | iNaturalist 2021 | 91.8 | 91.5 | 78.3 | 82.9 | 95.1 | 87.4 |
MetaFormer-1 | ImageNet-1k | 89.7 | 89.4 | 78.2 | 81.9 | 94.9 | 90.8 |
MetaFormer-1 | ImageNet-21k | 91.3 | 91.6 | 79.4 | 83.2 | 95.0 | 92.6 |
MetaFormer-1 | iNaturalist 2021 | 92.3 | 92.7 | 82.0 | 87.5 | 95.0 | 92.5 |
MetaFormer-2 | ImageNet-1k | 89.7 | 89.7 | 79.0 | 82.6 | 95.0 | 92.4 |
MetaFormer-2 | ImageNet-21k | 91.8 | 92.2 | 80.4 | 84.3 | 95.1 | 92.9 |
MetaFormer-2 | iNaturalist 2021 | 92.9 | 93.0 | 82.8 | 87.7 | 95.4 | 92.8 |
Results on iNaturalist 2017, iNaturalist 2018, and iNaturalist 2021 with meta-information. The number in parentheses is the gain over the same model without meta-information.
Name | Pretrain | Meta added | iNat2017 | iNat2018 | iNat2021 |
---|---|---|---|---|---|
MetaFormer-0 | ImageNet-1k | N | 75.7 | 79.5 | 88.4 |
MetaFormer-0 | ImageNet-1k | Y | 79.8(+4.1) | 85.4(+5.9) | 92.6(+4.2) |
MetaFormer-1 | ImageNet-1k | N | 78.2 | 81.9 | 90.2 |
MetaFormer-1 | ImageNet-1k | Y | 81.3(+3.1) | 86.5(+4.6) | 93.4(+3.2) |
MetaFormer-2 | ImageNet-1k | N | 79.0 | 82.6 | 89.8 |
MetaFormer-2 | ImageNet-1k | Y | 82.0(+3.0) | 86.8(+4.2) | 93.2(+3.4) |
MetaFormer-2 | ImageNet-21k | N | 80.4 | 84.3 | 90.3 |
MetaFormer-2 | ImageNet-21k | Y | 83.4(+3.0) | 88.7(+4.4) | 93.6(+3.3) |
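
Each gain in parentheses is the difference between a "Y" row and the matching "N" row. A minimal Python sketch that recomputes the gains from the numbers in the table above; the `RESULTS` layout and the `meta_gain` and `all_gains` helpers are illustrative names, not part of the MetaFormer codebase:

```python
# Scores copied from the table above, stored as (without meta, with meta) pairs.
RESULTS = {
    ("MetaFormer-0", "ImageNet-1k"): {
        "iNat2017": (75.7, 79.8),
        "iNat2018": (79.5, 85.4),
        "iNat2021": (88.4, 92.6),
    },
    ("MetaFormer-1", "ImageNet-1k"): {
        "iNat2017": (78.2, 81.3),
        "iNat2018": (81.9, 86.5),
        "iNat2021": (90.2, 93.4),
    },
    ("MetaFormer-2", "ImageNet-1k"): {
        "iNat2017": (79.0, 82.0),
        "iNat2018": (82.6, 86.8),
        "iNat2021": (89.8, 93.2),
    },
    ("MetaFormer-2", "ImageNet-21k"): {
        "iNat2017": (80.4, 83.4),
        "iNat2018": (84.3, 88.7),
        "iNat2021": (90.3, 93.6),
    },
}


def meta_gain(without_meta, with_meta):
    """Gain in points from adding meta-information, rounded to one decimal."""
    return round(with_meta - without_meta, 1)


def all_gains(results):
    """Map each (model, pretrain) key to its per-dataset gains."""
    return {
        key: {dataset: meta_gain(*pair) for dataset, pair in scores.items()}
        for key, scores in results.items()
    }


if __name__ == "__main__":
    for (name, pretrain), gains in all_gains(RESULTS).items():
        print(name, pretrain, gains)
```

In this table the largest gain is on iNat2018 for every model, and MetaFormer-0 gains the most on each dataset.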
We also provide ImageNet-1k/21k and iNaturalist pretrained models at several resolutions. You are welcome to use our pretrained models and codebase.
### Model zoo
Name | Resolution | 1k model | 21k model | iNat21 model |
---|---|---|---|---|
MetaFormer-0 | 224x224 | metafg_0_1k_224 | metafg_0_21k_224 | - |
MetaFormer-1 | 224x224 | metafg_1_1k_224 | metafg_1_21k_224 | - |
MetaFormer-2 | 224x224 | metafg_2_1k_224 | metafg_2_21k_224 | - |
MetaFormer-0 | 384x384 | metafg_0_1k_384 | metafg_0_21k_384 | metafg_0_inat21_384 |
MetaFormer-1 | 384x384 | metafg_1_1k_384 | metafg_1_21k_384 | metafg_1_inat21_384 |
MetaFormer-2 | 384x384 | metafg_2_1k_384 | metafg_2_21k_384 | metafg_2_inat21_384 |
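
The checkpoint names in the table follow a regular pattern, `metafg_<variant>_<pretrain>_<resolution>`. A small sketch that builds a name from those parts and rejects the combinations the table marks with `-`; `checkpoint_name` is a hypothetical helper, not an API of the repository, and it only builds the string without downloading or loading anything:

```python
# Values taken from the model zoo table above.
VARIANTS = (0, 1, 2)
PRETRAIN_TAGS = ("1k", "21k", "inat21")
RESOLUTIONS = (224, 384)


def checkpoint_name(variant, pretrain, resolution):
    """Return the model zoo name for a variant, pretraining tag, and resolution.

    Hypothetical helper: it mirrors the naming pattern seen in the table
    and raises ValueError for combinations the table does not list.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown MetaFormer variant: {variant}")
    if pretrain not in PRETRAIN_TAGS:
        raise ValueError(f"unknown pretraining tag: {pretrain}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution}")
    if pretrain == "inat21" and resolution != 384:
        # The table lists iNat21 checkpoints only at 384x384.
        raise ValueError("no iNat21 checkpoint is listed at this resolution")
    return f"metafg_{variant}_{pretrain}_{resolution}"


if __name__ == "__main__":
    print(checkpoint_name(2, "inat21", 384))
```

For example, `checkpoint_name(0, "21k", 224)` gives `metafg_0_21k_224`, matching the first row of the table.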
Repo: https://github.com/dqshuai/MetaFormer
Paper: https://arxiv.org/abs/2203.02751