---
license: apache-2.0
language:
- en
---

# SatVision-Base

A vision transformer trained with MODIS surface reflectance data

- **Developed by:** NASA GSFC CISTO Data Science Group
- **Model type:** Pre-trained vision transformer model
- **License:** Apache license 2.0

SatelliteVision-Base (SatVis-B) is a pre-trained vision transformer based on the SwinV2 model architecture. The model is pre-trained on 1.99 million image chips drawn from global MODIS surface reflectance data. SatVis-B is pre-trained using the masked-image-modeling (MIM) pre-training strategy. The MIM approach randomly masks parts of the input geospatial image chip and uses a linear layer to regress the raw pixel values of the masked area, with an L1 loss as the loss function.

The pre-training MODIS chips have a resolution of `128x128`, and the model uses a window size of `16x16`. SatVis-B was pre-trained for `800` epochs on 8x A100 GPUs and 12x V100 GPUs.

### SatVision Transformer

**Models pre-trained on the MODIS-Small dataset**

| name | pre-train epochs | pre-train resolution | #params | #tiles | pre-trained model |
| :---: | :---: | :---: | :---: | :---: | :---: |
| SatVision-Base | 800 | 128x128 | 84.5 M | 2 M | [checkpoint](https://huggingface.co/nasa-cisto-data-science-group/satvision-base/blob/main/satvision_84M_2M_800.pth)/[config](https://github.com/nasa-nccs-hpda/pytorch-caney/blob/develop/examples/satvision/mim_pretrain_swinv2_satvision_base_192_window12_800ep.yaml) |
| SatVision-Base | 100 | 128x128 | 84.5 M | 26 M | [checkpoint](https://huggingface.co/nasa-cisto-data-science-group/satvision-base/blob/main/satvision_84M_26M_100.pth)/[config](https://github.com/nasa-nccs-hpda/pytorch-caney/blob/develop/examples/satvision/mim_pretrain_swinv2_satvision_base_192_window12_800ep.yaml) |

## Getting Started with SatVision-Base

- **Training repository:** https://github.com/nasa-nccs-hpda/pytorch-caney
- **Pre-training dataset repository:** https://huggingface.co/datasets/nasa-cisto-data-science-group/satvision-pretrain-small

### Installation

If you have Singularity installed:

```bash
$ git clone git@github.com:nasa-nccs-hpda/pytorch-caney.git
$ singularity build --sandbox pytorch-caney.sif docker://nasanccs/pytorch-caney:latest
# To shell into the container (-B takes the host paths to bind into the container)
$ singularity shell --nv -B <bind-paths> pytorch-caney.sif
```

Anaconda installation:

```bash
$ git clone git@github.com:nasa-nccs-hpda/pytorch-caney.git
$ conda create -n satvision-env python==3.9
```

### Fine-tuning SatVision-Base

- Create a config file ([example config](https://github.com/nasa-nccs-hpda/pytorch-caney/blob/finetuning/examples/satvision/finetune_satvision_base_landcover5class_192_window12_100ep.yaml)).
- Download a checkpoint from this HF model repo.
- Clone the training repository: `$ git clone git@github.com:nasa-nccs-hpda/pytorch-caney.git`
- Add a new PyTorch dataset in pytorch-caney/pytorch_caney/data/datasets/
- Add the new PyTorch dataset to the dict in pytorch-caney/pytorch_caney/data/datamodules/finetune_datamodule.py

Then launch fine-tuning, replacing each `<...>` placeholder with your own value:

```bash
torchrun --nproc_per_node <num-gpus> pytorch-caney/pytorch_caney/pipelines/finetuning/finetune.py --cfg <config-file> --pretrained <checkpoint-file> --dataset <dataset-name> --data-paths <data-paths> --batch-size <batch-size> --output <output-dir> --enable-amp
```

### Pre-training SatVision-Base with Masked Image Modeling and pytorch-caney

To pre-train the SwinV2 base model with masked image modeling, run the following, replacing each `<...>` placeholder with your own value:

```bash
torchrun --nproc_per_node <num-gpus> pytorch-caney/pytorch_caney/pipelines/pretraining/mim.py --cfg <config-file> --dataset <dataset-name> --data-paths <data-paths> --batch-size <batch-size> --output <output-dir> --enable-amp
```

For example, to run on a compute node with 4 GPUs and a batch size of 128 on the MODIS SatVision pre-training dataset with a base SwinV2 model, run:

```bash
singularity shell --nv -B <bind-paths> /path/to/container/pytorch-caney-container
Singularity> export PYTHONPATH=$PWD:$PWD/pytorch-caney
Singularity> torchrun --nproc_per_node 4 pytorch-caney/pytorch_caney/pipelines/pretraining/mim.py --cfg pytorch-caney/examples/satvision/mim_pretrain_swinv2_satvision_base_192_window12_800ep.yaml --dataset MODIS --data-paths /explore/nobackup/projects/ilab/data/satvision/pretraining/training_* --batch-size 128 --output . --enable-amp
```

## SatVision-Base Pre-Training Datasets

| name | bands | resolution | #chips | meters-per-pixel |
| :---: | :---: | :---: | :---: | :---: |
| MODIS-Small | 7 | 128x128 | 1,994,131 | 500 m |

## Citing SatVision-Base

If this model helped your research, please cite `satvision-base` in your publications.

```
@misc{satvision-base,
  author = {Carroll, Mark and Li, Jian and Spradlin, Caleb and Caraballo-Vega, Jordan},
  doi = {10.57967/hf/1017},
  month = aug,
  title = {{satvision-base}},
  url = {https://huggingface.co/nasa-cisto-data-science-group/satvision-base},
  repository-code = {https://github.com/nasa-nccs-hpda/pytorch-caney},
  year = {2023}
}
```
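
## Illustrative sketch of the MIM objective

The model description at the top of this card says that MIM pre-training randomly masks the input chip and regresses the raw pixel values of the masked area with an L1 loss. The pure-Python sketch below illustrates only that idea and is not the pytorch-caney implementation. The function names `random_patch_mask` and `masked_l1_loss`, the patch size, the mask ratio, and the single-band toy chip are all assumptions made for illustration (the real chips have 7 bands and are `128x128`).

```python
import random


def random_patch_mask(grid_size, mask_ratio, rng):
    """Return a grid_size x grid_size grid of 0/1 flags; 1 marks a masked patch.

    Hypothetical helper: the masking scheme used by pytorch-caney may differ.
    """
    num_patches = grid_size * grid_size
    num_masked = int(round(num_patches * mask_ratio))
    flags = [1] * num_masked + [0] * (num_patches - num_masked)
    rng.shuffle(flags)
    return [flags[r * grid_size:(r + 1) * grid_size] for r in range(grid_size)]


def masked_l1_loss(target, prediction, patch_mask, patch_size):
    """Mean absolute error computed only over pixels inside masked patches."""
    total = 0.0
    count = 0
    for row in range(len(target)):
        for col in range(len(target[0])):
            # Map each pixel to the patch that contains it.
            if patch_mask[row // patch_size][col // patch_size]:
                total += abs(prediction[row][col] - target[row][col])
                count += 1
    return total / count if count else 0.0


# Toy single-band 8x8 chip split into 4x4 patches (assumed sizes).
rng = random.Random(0)
mask = random_patch_mask(grid_size=2, mask_ratio=0.5, rng=rng)
target = [[1.0] * 8 for _ in range(8)]
prediction = [[0.5] * 8 for _ in range(8)]
print(masked_l1_loss(target, prediction, mask, patch_size=4))  # prints 0.5
```

Pixels in unmasked patches do not contribute to the loss, so the model is only penalized for how well it reconstructs what it could not see.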