license: apache-2.0
language:
- en
SatVision-Base: A visual transformer trained with MODIS surface reflectance data
- Developed by: NASA GSFC CISTO Data Science Group
- Model type: Pre-trained visual transformer model
- License: Apache license 2.0
SatelliteVision-Base (SatVis-B) is a pre-trained vision transformer based on the SwinV2 model architecture. The model is pre-trained on 1.99 million image chips drawn from global MODIS surface reflectance data. SatVis-B is pre-trained using the masked-image-modeling (MIM) self-supervised pre-training strategy. MIM randomly masks parts of the input geospatial image chip and uses a linear layer to regress the raw pixel values of the masked area, with an L1 loss as the training objective.
The pre-training MODIS chips have a resolution of 128x128, and the model uses a window size of 16x16. SatVis-B was pre-trained for 800 epochs on 8x A100 GPUs and 12x V100 GPUs.
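The MIM objective described above can be illustrated with a small, dependency-free sketch. It is not the pytorch-caney implementation: the function names, the toy 8x8 single-band chip, the 4x4 patch size, and the 0.5 mask ratio are all assumptions chosen for illustration, and the model with its linear regression head is replaced by a constant stand-in prediction. The only point it makes is that the L1 loss is averaged over masked pixels only.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio, rng):
    """Return a grid_h x grid_w grid of flags; 1 marks a masked patch."""
    n_patches = grid_h * grid_w
    n_masked = round(n_patches * mask_ratio)
    flags = [1] * n_masked + [0] * (n_patches - n_masked)
    rng.shuffle(flags)  # random choice of which patches are hidden
    return [flags[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]


def masked_l1_loss(target, prediction, patch_mask, patch_size):
    """Mean absolute error over pixels that lie inside masked patches only."""
    total, count = 0.0, 0
    for y, row in enumerate(target):
        for x, value in enumerate(row):
            if patch_mask[y // patch_size][x // patch_size]:
                total += abs(prediction[y][x] - value)
                count += 1
    return total / count if count else 0.0


# Toy single-band 8x8 chip split into 4x4 patches (a 2x2 patch grid).
# All sizes here are hypothetical, not the SatVision-Base settings.
rng = random.Random(0)
mask = random_patch_mask(2, 2, mask_ratio=0.5, rng=rng)
target = [[1.0] * 8 for _ in range(8)]
prediction = [[0.75] * 8 for _ in range(8)]  # stand-in for the model output
loss = masked_l1_loss(target, prediction, mask, patch_size=4)
print(mask, loss)
```

In an actual training loop the same masking and loss would operate on batched 7-band PyTorch tensors rather than nested lists.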
SatVision Transformer
Models pre-trained on the MODIS-Small dataset
name | pre-train epochs | pre-train resolution | #params | #tiles | pre-trained model |
---|---|---|---|---|---|
SatVision-Base | 800 | 128x128 | 84.5 M | 2 M | checkpoint/config |
SatVision-Base | 100 | 128x128 | 84.5 M | 26 M | checkpoint/config |
Getting Started with SatVision-Base
- Training repository: https://github.com/nasa-nccs-hpda/pytorch-caney
- Pre-training dataset repository: https://huggingface.co/datasets/nasa-cisto-data-science-group/satvision-pretrain-small
Installation
If you have singularity installed
$ git clone git@github.com:nasa-nccs-hpda/pytorch-caney.git
$ singularity build --sandbox pytorch-caney.sif docker://nasanccs/pytorch-caney:latest
# To shell into the container
$ singularity shell --nv -B <mounts> pytorch-caney.sif
Anaconda installation
$ git clone git@github.com:nasa-nccs-hpda/pytorch-caney.git
$ conda create -n satvision-env python==3.9
Fine-tuning SatVision-Base
- Create a config file (see the example config)
- Download checkpoint from this HF model repo
$ git clone git@github.com:nasa-nccs-hpda/pytorch-caney.git
- Add a new pytorch dataset in pytorch-caney/pytorch_caney/data/datasets/
- Add new pytorch dataset to dict in pytorch-caney/pytorch_caney/data/datamodules/finetune_datamodule.py
torchrun --nproc_per_node <NGPUS> pytorch-caney/pytorch_caney/pipelines/finetuning/finetune.py --cfg <config-file> --pretrained <path-to-pretrained> --dataset <dataset-name (key for new dataset)> --data-paths <path-to-data-dir> --batch-size <batch-size> --output <output-dir> --enable-amp
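When fine-tuning runs are launched from scripts, it can help to assemble the command above programmatically. The following is a minimal sketch that only rearranges the flags already shown; the helper name `build_finetune_command` and every example value passed to it (config file, checkpoint path, dataset key, directories) are hypothetical placeholders, not files shipped with the repository.

```python
import shlex


def build_finetune_command(ngpus, cfg, pretrained, dataset, data_paths,
                           batch_size, output, enable_amp=True):
    """Assemble the torchrun fine-tuning command as an argument list."""
    cmd = [
        "torchrun", "--nproc_per_node", str(ngpus),
        "pytorch-caney/pytorch_caney/pipelines/finetuning/finetune.py",
        "--cfg", cfg,
        "--pretrained", pretrained,
        "--dataset", dataset,
        "--data-paths", data_paths,
        "--batch-size", str(batch_size),
        "--output", output,
    ]
    if enable_amp:
        cmd.append("--enable-amp")
    return cmd


# Hypothetical values; substitute your own config, checkpoint, and data.
cmd = build_finetune_command(
    ngpus=4,
    cfg="my_finetune_config.yaml",
    pretrained="checkpoints/satvision-base.pth",
    dataset="MYDATASET",  # key registered in finetune_datamodule.py
    data_paths="data/my_dataset",
    batch_size=64,
    output="finetune_output",
)
print(shlex.join(cmd))  # shell-safe string for logging or copy-paste
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains the cloned repository.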
Pre-training with pytorch-caney
Pre-training SatVision-Base with Masked Image Modeling and pytorch-caney
To pre-train the swinv2 base model with masked image modeling pre-training, run:
torchrun --nproc_per_node <NGPUS> pytorch-caney/pytorch_caney/pipelines/pretraining/mim.py --cfg <config-file> --dataset <dataset-name> --data-paths <path-to-data-subfolder-1> --batch-size <batch-size> --output <output-dir> --enable-amp
For example, to run on a compute node with 4 GPUs and a batch size of 128 on the MODIS SatVision pre-training dataset with a base swinv2 model, run:
singularity shell --nv -B <mounts> /path/to/container/pytorch-caney-container
Singularity> export PYTHONPATH=$PWD:$PWD/pytorch-caney
Singularity> torchrun --nproc_per_node 4 pytorch-caney/pytorch_caney/pipelines/pretraining/mim.py --cfg pytorch-caney/examples/satvision/mim_pretrain_swinv2_satvision_base_192_window12_800ep.yaml --dataset MODIS --data-paths /explore/nobackup/projects/ilab/data/satvision/pretraining/training_* --batch-size 128 --output . --enable-amp
SatVision-Base Pre-Training Datasets
name | bands | resolution | #chips | meters-per-pixel |
---|---|---|---|---|
MODIS-Small | 7 | 128x128 | 1,994,131 | 500m |
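The figures in the table translate directly into a per-chip ground footprint and data volume. The short calculation below is a sketch that assumes square pixels, so that 500 m applies along both axes; the function name `chip_footprint` is hypothetical, and the value counts say nothing about on-disk size because the storage data type is not stated here.

```python
def chip_footprint(side_px, meters_per_pixel, bands):
    """Return (side length in km, area in km^2, values per chip)."""
    side_km = side_px * meters_per_pixel / 1000
    area_km2 = side_km ** 2
    values_per_chip = bands * side_px * side_px
    return side_km, area_km2, values_per_chip


# MODIS-Small row from the table above.
side_km, area_km2, values_per_chip = chip_footprint(
    side_px=128, meters_per_pixel=500, bands=7
)
total_values = 1_994_131 * values_per_chip  # across all chips

print(f"chip side: {side_km} km, area: {area_km2} km^2")
print(f"values per chip: {values_per_chip}, total values: {total_values}")
```

Each chip therefore covers 64 km on a side and holds 114,688 reflectance values across its 7 bands.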
Citing SatVision-Base
If this model helped your research, please cite satvision-base in your publications.
@misc{satvision-base,
author = {Carroll, Mark and Li, Jian and Spradlin, Caleb and Caraballo-Vega, Jordan},
doi = {10.57967/hf/1017},
month = aug,
title = {{satvision-base}},
url = {https://huggingface.co/nasa-cisto-data-science-group/satvision-base},
repository-code = {https://github.com/nasa-nccs-hpda/pytorch-caney},
year = {2023}
}