katielink committed
Commit 62a0104
1 Parent(s): cc8d3f8

update benchmark on A100
README.md CHANGED
@@ -66,31 +66,35 @@ Output: a dictionary with the following keys:
 The achieved metrics on the validation data are:
 
 Fast mode:
-- Binary Dice: 0.8293
-- PQ: 0.4936
-- F1d: 0.7480
+- Binary Dice: 0.8291
+- PQ: 0.4973
+- F1d: 0.7417
+
+Note: Binary Dice is calculated on the whole input. PQ and F1d are calculated following https://github.com/vqdang/hover_net#inference.
+
+This bundle is non-deterministic; for more details, please refer to https://pytorch.org/docs/stable/generated/torch.use_deterministic_algorithms.html#torch.use_deterministic_algorithms
 
 #### Training Loss and Dice
 
 stage1:
-![A graph showing the training loss and the mean dice over 50 epochs in stage1](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_nuclei_seg_cls_train_stage1_fast.png)
+![A graph showing the training loss and the mean dice over 50 epochs in stage1](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_segmentation_classification_train_stage0_v2.png)
 
 stage2:
-![A graph showing the training loss and the mean dice over 50 epochs in stage2](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_nuclei_seg_cls_train_stage2_fast.png)
+![A graph showing the training loss and the mean dice over 50 epochs in stage2](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_segmentation_classification_train_stage1_v2.png)
 
 #### Validation Dice
 
 stage1:
 
-![A graph showing the validation mean dice over 50 epochs in stage1](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_nuclei_seg_cls_val_stage1_fast.png)
+![A graph showing the validation mean dice over 50 epochs in stage1](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_segmentation_classification_val_stage0_v2.png)
 
 stage2:
 
-![A graph showing the validation mean dice over 50 epochs in stage2](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_nuclei_seg_cls_val_stage2_fast.png)
+![A graph showing the validation mean dice over 50 epochs in stage2](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_segmentation_classification_val_stage1_v2.png)
 
 ## commands example
 
-Execute training:
+Execute training (the in-training evaluation is performed on patches):
 
 - Run first stage
 
@@ -101,7 +105,7 @@ python -m monai.bundle run --config_file configs/train.json --network_def#pretra
 - Run second stage
 
 ```
-python -m monai.bundle run --config_file configs/train.json --network_def#freeze_encoder false --network_def#pretrained_url None --stage 1
+python -m monai.bundle run --config_file configs/train.json --network_def#freeze_encoder False --network_def#pretrained_url None --stage 1
 ```
 
 Override the `train` config to execute multi-GPU training:
@@ -109,16 +113,16 @@ Override the `train` config to execute multi-GPU training:
 - Run first stage
 
 ```
-torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --train#dataloader#batch_size 8 --network_def#freeze_encoder true --network_def#pretrained_url `PRETRAIN_MODEL_URL` --stage 0
+torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --batch_size 8 --network_def#freeze_encoder True --network_def#pretrained_url `PRETRAIN_MODEL_URL` --stage 0
 ```
 
 - Run second stage
 
 ```
-torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --train#dataloader#batch_size 4 --network_def#freeze_encoder false --network_def#pretrained_url None --stage 1
+torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --batch_size 4 --network_def#freeze_encoder False --network_def#pretrained_url None --stage 1
 ```
 
-Override the `train` config to execute evaluation with the trained model:
+Override the `train` config to execute evaluation with the trained model; here Dice is evaluated on the whole input instead of on patches:
 
 ```
 python -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json']"
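For readers who prefer driving the bundle from Python rather than the CLI, below is a minimal sketch of the two-stage training, assuming MONAI >= 1.2; `monai.bundle.run` accepts `#`-separated config ids as override keyword arguments, mirroring the `--network_def#...` flags in the commands above. `PRETRAIN_MODEL_URL` stays a placeholder to be filled in by the user.

```
# A minimal sketch, not the bundle's own entry point; assumes MONAI >= 1.2,
# where monai.bundle.run applies extra keyword ids as config overrides.
from monai.bundle import run

PRETRAIN_MODEL_URL = "..."  # placeholder, same as in the CLI commands above

# Stage 0: frozen encoder initialized from pretrained weights.
run(
    config_file="configs/train.json",
    **{
        "network_def#freeze_encoder": True,
        "network_def#pretrained_url": PRETRAIN_MODEL_URL,
        "stage": 0,
    },
)

# Stage 1: unfreeze the encoder and continue without the pretrained URL.
run(
    config_file="configs/train.json",
    **{
        "network_def#freeze_encoder": False,
        "network_def#pretrained_url": None,
        "stage": 1,
    },
)
```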
configs/evaluate.json CHANGED
@@ -1,4 +1,7 @@
 {
+    "val_images": "$list(sorted(glob.glob(@dataset_dir + '/Test/image*.npy')))",
+    "val_labels": "$list(sorted(glob.glob(@dataset_dir + '/Test/label*.npy')))",
+    "data_list": "$[{'image': i, 'label': j} for i, j in zip(@val_images, @val_labels)]",
     "network_def": {
         "_target_": "HoVerNet",
         "mode": "@hovernet_mode",
@@ -6,6 +9,77 @@
         "in_channels": 3,
         "out_classes": 5
     },
+    "sw_batch_size": 16,
+    "validate#dataset": {
+        "_target_": "CacheDataset",
+        "data": "@data_list",
+        "transform": "@validate#preprocessing",
+        "cache_rate": 1.0,
+        "num_workers": 4
+    },
+    "validate#preprocessing_transforms": [
+        {
+            "_target_": "LoadImaged",
+            "keys": [
+                "image",
+                "label"
+            ]
+        },
+        {
+            "_target_": "SplitDimd",
+            "keys": "label",
+            "output_postfixes": [
+                "inst",
+                "type"
+            ],
+            "dim": -1
+        },
+        {
+            "_target_": "EnsureChannelFirstd",
+            "keys": [
+                "image",
+                "label_inst",
+                "label_type"
+            ],
+            "channel_dim": -1
+        },
+        {
+            "_target_": "CastToTyped",
+            "keys": [
+                "image",
+                "label_inst"
+            ],
+            "dtype": "$torch.int"
+        },
+        {
+            "_target_": "ScaleIntensityRanged",
+            "keys": "image",
+            "a_min": 0.0,
+            "a_max": 255.0,
+            "b_min": 0.0,
+            "b_max": 1.0,
+            "clip": true
+        },
+        {
+            "_target_": "ComputeHoVerMapsd",
+            "keys": "label_inst"
+        },
+        {
+            "_target_": "Lambdad",
+            "keys": "label_inst",
+            "func": "$lambda x: x > 0",
+            "overwrite": "label"
+        },
+        {
+            "_target_": "CastToTyped",
+            "keys": [
+                "image",
+                "label_inst",
+                "label_type"
+            ],
+            "dtype": "$torch.float32"
+        }
+    ],
     "validate#handlers": [
         {
             "_target_": "CheckpointLoader",
@@ -16,21 +90,66 @@
         },
         {
             "_target_": "StatsHandler",
-            "iteration_log": false
-        },
-        {
-            "_target_": "MetricsSaver",
-            "save_dir": "@output_dir",
-            "metrics": [
-                "val_mean_dice"
-            ],
-            "metric_details": [
-                "val_mean_dice"
-            ],
-            "batch_transform": "$monai.handlers.from_engine(['image_meta_dict'])",
-            "summary_ops": "*"
+            "output_transform": "$lambda x: None"
         }
     ],
+    "validate#inferer": {
+        "_target_": "SlidingWindowHoVerNetInferer",
+        "roi_size": "@patch_size",
+        "sw_batch_size": "@sw_batch_size",
+        "overlap": "$1.0 - float(@out_size) / float(@patch_size)",
+        "padding_mode": "constant",
+        "cval": 0,
+        "progress": true,
+        "extra_input_padding": "$((@patch_size - @out_size) // 2,) * 4"
+    },
+    "postprocessing_pred": {
+        "_target_": "Compose",
+        "transforms": [
+            {
+                "_target_": "HoVerNetInstanceMapPostProcessingd",
+                "sobel_kernel_size": 21,
+                "marker_threshold": 0.5,
+                "marker_radius": 2,
+                "device": "@device"
+            },
+            {
+                "_target_": "HoVerNetNuclearTypePostProcessingd",
+                "device": "@device"
+            },
+            {
+                "_target_": "SaveImaged",
+                "keys": "instance_map",
+                "meta_keys": "image_meta_dict",
+                "output_ext": ".nii.gz",
+                "output_dir": "@output_dir",
+                "output_postfix": "instance_map",
+                "output_dtype": "uint32",
+                "separate_folder": false
+            },
+            {
+                "_target_": "SaveImaged",
+                "keys": "type_map",
+                "meta_keys": "image_meta_dict",
+                "output_ext": ".nii.gz",
+                "output_dir": "@output_dir",
+                "output_postfix": "type_map",
+                "output_dtype": "uint8",
+                "separate_folder": false
+            },
+            {
+                "_target_": "Lambdad",
+                "keys": "instance_map",
+                "func": "$lambda x: x > 0",
+                "overwrite": "nucleus_prediction"
+            }
+        ]
+    },
+    "validate#postprocessing": {
+        "_target_": "Lambdad",
+        "keys": "pred",
+        "func": "@postprocessing_pred"
+    },
     "initialize": [
         "$setattr(torch.backends.cudnn, 'benchmark', True)"
     ],
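The two `$...` expressions in the new `validate#inferer` are worth unpacking. Below is a small sketch of the arithmetic, assuming the bundle's fast-mode values of `patch_size = 256` and `out_size = 164` (HoVerNet fast mode predicts only the central 164×164 region of each 256×256 input patch):

```
# Plain-Python evaluation of the "$..." expressions above, under the assumed
# fast-mode values patch_size=256 and out_size=164.
patch_size = 256
out_size = 164

# "overlap": "$1.0 - float(@out_size) / float(@patch_size)"
# Neighboring sliding windows must overlap by exactly the border region the
# network cannot predict, so every pixel falls in some window's center crop.
overlap = 1.0 - float(out_size) / float(patch_size)
print(overlap)  # 0.359375

# "extra_input_padding": "$((@patch_size - @out_size) // 2,) * 4"
# Pad each spatial border by half the crop margin so that pixels at the image
# edge also receive a central prediction.
extra_input_padding = ((patch_size - out_size) // 2,) * 4
print(extra_input_padding)  # (46, 46, 46, 46)
```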
configs/metadata.json CHANGED
@@ -1,14 +1,15 @@
 {
     "schema": "https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/meta_schema_hovernet_20221124.json",
-    "version": "0.1.4",
+    "version": "0.1.5",
     "changelog": {
+        "0.1.5": "update benchmark on A100",
         "0.1.4": "adapt to BundleWorkflow interface",
         "0.1.3": "add name tag",
         "0.1.2": "update the workflow figure",
         "0.1.1": "update to use monai 1.1.0",
         "0.1.0": "complete the model package"
     },
-    "monai_version": "1.2.0rc3",
+    "monai_version": "1.2.0rc4",
     "pytorch_version": "1.13.1",
     "numpy_version": "1.22.2",
     "optional_packages_version": {
@@ -28,9 +29,7 @@
     "label_classes": "a dictionary contains binary nuclear segmentation, hover map and pixel-level classification",
     "pred_classes": "a dictionary contains scalar probability for binary nuclear segmentation, hover map and pixel-level classification",
     "eval_metrics": {
-        "Binary Dice": 0.8293,
-        "PQ": 0.4936,
-        "F1d": 0.748
+        "Binary Dice": 0.8291
     },
     "intended_use": "This is an example, not to be used for diagnostic purposes",
     "references": [
configs/multi_gpu_train.json CHANGED
@@ -15,7 +15,7 @@
     },
     "train#dataloader#sampler": "@train#sampler",
     "train#dataloader#shuffle": false,
-    "train#trainer#train_handlers": "$@train#train_handlers[: -2 if dist.get_rank() > 0 else None]",
+    "train#trainer#train_handlers": "$@train#train_handlers[: -3 if dist.get_rank() > 0 else None]",
     "validate#sampler": {
         "_target_": "DistributedSampler",
         "dataset": "@validate#dataset",
@@ -35,6 +35,6 @@
         "$@train#trainer.run()"
     ],
     "finalize": [
-        "$dist.destroy_process_group()"
+        "$dist.is_initialized() and dist.destroy_process_group()"
     ]
 }
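Two behavioral changes are packed into these one-line diffs. Here is an illustrative sketch with hypothetical handler names: the slice now strips the last three train handlers on non-zero ranks (so only rank 0 logs, validates, and saves checkpoints), and the guarded finalize no longer raises when no process group was ever initialized:

```
# Illustrative only; the handler names below are hypothetical stand-ins for
# whatever @train#train_handlers resolves to in the bundle.
import torch.distributed as dist

train_handlers = ["lr_scheduler", "validator", "stats_logger", "checkpoint_saver"]
rank = dist.get_rank() if dist.is_initialized() else 0

# "[: -3 if dist.get_rank() > 0 else None]" -- rank 0 keeps everything
# (lst[:None] is the whole list); other ranks drop the last three handlers.
handlers_for_this_rank = train_handlers[: -3 if rank > 0 else None]

# "$dist.is_initialized() and dist.destroy_process_group()" -- short-circuit
# guard makes finalize a no-op when no process group exists (e.g. single-GPU).
if dist.is_initialized():
    dist.destroy_process_group()
```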
configs/train.json CHANGED
@@ -19,6 +19,7 @@
     "device": "$torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')",
     "stage": 0,
     "epochs": 50,
+    "batch_size": 16,
     "val_interval": 1,
     "learning_rate": 0.0001,
     "amp": true,
@@ -32,7 +33,7 @@
         "in_channels": 3,
         "out_classes": 5,
         "adapt_standard_resnet": true,
-        "pretrained_url": "$None",
+        "pretrained_url": null,
         "freeze_encoder": true
     },
     "network": "$@network_def.to(@device)",
@@ -195,7 +196,7 @@
             "name": "ColorJitter",
             "brightness": [
                 0.9,
-                1.0
+                1.1
             ],
             "contrast": [
                 0.95,
@@ -272,14 +273,16 @@
             "transforms": "$@train#preprocessing_transforms"
         },
         "dataset": {
-            "_target_": "Dataset",
+            "_target_": "CacheDataset",
             "data": "$[{'image': i, 'label_inst': j, 'label_type': k} for i, j, k in zip(@train_images, @train_inst_map, @train_type_map)]",
-            "transform": "@train#preprocessing"
+            "transform": "@train#preprocessing",
+            "cache_rate": 1.0,
+            "num_workers": 4
         },
         "dataloader": {
             "_target_": "DataLoader",
             "dataset": "@train#dataset",
-            "batch_size": 16,
+            "batch_size": "@batch_size",
             "shuffle": true,
             "num_workers": 4
         },
@@ -463,14 +466,16 @@
             "transforms": "$@validate#preprocessing_transforms"
         },
         "dataset": {
-            "_target_": "Dataset",
+            "_target_": "CacheDataset",
             "data": "$[{'image': i, 'label_inst': j, 'label_type': k} for i, j, k in zip(@val_images, @val_inst_map, @val_type_map)]",
-            "transform": "@validate#preprocessing"
+            "transform": "@validate#preprocessing",
+            "cache_rate": 1.0,
+            "num_workers": 4
         },
         "dataloader": {
             "_target_": "DataLoader",
             "dataset": "@validate#dataset",
-            "batch_size": 16,
+            "batch_size": "@batch_size",
             "shuffle": false,
             "num_workers": 4
         },
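The `Dataset` → `CacheDataset` switch trades memory for epoch time: deterministic preprocessing runs once up front and is cached, so each epoch only pays for the random transforms. The new top-level `batch_size` key is also what makes the README's `--batch_size` override possible. A minimal sketch with toy data (the data list and transform here are stand-ins, not the bundle's own):

```
# A minimal sketch of the Dataset -> CacheDataset switch; toy data, not the
# bundle's image/label file lists.
import numpy as np
from monai.data import CacheDataset, DataLoader
from monai.transforms import Compose, ScaleIntensityd

data = [{"image": np.random.rand(3, 64, 64).astype(np.float32)} for _ in range(8)]
preprocessing = Compose([ScaleIntensityd(keys="image")])  # deterministic -> cached

# cache_rate=1.0 caches every item; num_workers=4 parallelizes the first pass.
dataset = CacheDataset(data=data, transform=preprocessing, cache_rate=1.0, num_workers=4)
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=4)

for batch in loader:
    print(batch["image"].shape)  # torch.Size([4, 3, 64, 64])
```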
docs/README.md CHANGED
@@ -59,31 +59,35 @@ Output: a dictionary with the following keys:
 The achieved metrics on the validation data are:
 
 Fast mode:
-- Binary Dice: 0.8293
-- PQ: 0.4936
-- F1d: 0.7480
+- Binary Dice: 0.8291
+- PQ: 0.4973
+- F1d: 0.7417
+
+Note: Binary Dice is calculated on the whole input. PQ and F1d are calculated following https://github.com/vqdang/hover_net#inference.
+
+This bundle is non-deterministic; for more details, please refer to https://pytorch.org/docs/stable/generated/torch.use_deterministic_algorithms.html#torch.use_deterministic_algorithms
 
 #### Training Loss and Dice
 
 stage1:
-![A graph showing the training loss and the mean dice over 50 epochs in stage1](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_nuclei_seg_cls_train_stage1_fast.png)
+![A graph showing the training loss and the mean dice over 50 epochs in stage1](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_segmentation_classification_train_stage0_v2.png)
 
 stage2:
-![A graph showing the training loss and the mean dice over 50 epochs in stage2](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_nuclei_seg_cls_train_stage2_fast.png)
+![A graph showing the training loss and the mean dice over 50 epochs in stage2](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_segmentation_classification_train_stage1_v2.png)
 
 #### Validation Dice
 
 stage1:
 
-![A graph showing the validation mean dice over 50 epochs in stage1](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_nuclei_seg_cls_val_stage1_fast.png)
+![A graph showing the validation mean dice over 50 epochs in stage1](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_segmentation_classification_val_stage0_v2.png)
 
 stage2:
 
-![A graph showing the validation mean dice over 50 epochs in stage2](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_nuclei_seg_cls_val_stage2_fast.png)
+![A graph showing the validation mean dice over 50 epochs in stage2](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_segmentation_classification_val_stage1_v2.png)
 
 ## commands example
 
-Execute training:
+Execute training (the in-training evaluation is performed on patches):
 
 - Run first stage
 
@@ -94,7 +98,7 @@ python -m monai.bundle run --config_file configs/train.json --network_def#pretra
 - Run second stage
 
 ```
-python -m monai.bundle run --config_file configs/train.json --network_def#freeze_encoder false --network_def#pretrained_url None --stage 1
+python -m monai.bundle run --config_file configs/train.json --network_def#freeze_encoder False --network_def#pretrained_url None --stage 1
 ```
 
 Override the `train` config to execute multi-GPU training:
@@ -102,16 +106,16 @@ Override the `train` config to execute multi-GPU training:
 - Run first stage
 
 ```
-torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --train#dataloader#batch_size 8 --network_def#freeze_encoder true --network_def#pretrained_url `PRETRAIN_MODEL_URL` --stage 0
+torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --batch_size 8 --network_def#freeze_encoder True --network_def#pretrained_url `PRETRAIN_MODEL_URL` --stage 0
 ```
 
 - Run second stage
 
 ```
-torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --train#dataloader#batch_size 4 --network_def#freeze_encoder false --network_def#pretrained_url None --stage 1
+torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --batch_size 4 --network_def#freeze_encoder False --network_def#pretrained_url None --stage 1
 ```
 
-Override the `train` config to execute evaluation with the trained model:
+Override the `train` config to execute evaluation with the trained model; here Dice is evaluated on the whole input instead of on patches:
 
 ```
 python -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json']"
models/model.pt CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3aa9651e11eca4c17d89fe59b46b8a51c7899decb03df538f5092ee9e55967ef
-size 151214500
+oid sha256:f3c427cd3e97f40b77ff612205b706475edc1039d1b8de39afcaf7add204e39c
+size 151228832
models/stage0/model.pt CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6a394593bc2ed0ef53188ffc96174ae492b59448a37b4b4a161214a30b173785
-size 151220093
+oid sha256:cf6eb5a0467422c2c1ffbff72e2b4aca17dcdd8d2087bd1a27ce86fea98a1ab6
+size 151228832
scripts/prepare_patches.py CHANGED
@@ -176,6 +176,9 @@ def main(cfg):
         img = load_img(f"{img_dir}/{base_name}.{cfg['image_suffix']}")
         ann = load_ann(f"{ann_dir}/{base_name}.{cfg['label_suffix']}")
 
+        np.save("{0}/label_{1}.npy".format(out_dir, base_name), ann)
+        np.save("{0}/image_{1}.npy".format(out_dir, base_name), img)
+
         # *
         img = np.concatenate([img, ann], axis=-1)
         sub_patches = xtractor.extract(img, cfg["extract_type"])
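The two new `np.save` calls establish the naming contract that `configs/evaluate.json` relies on: whole, un-patched arrays saved as `image_<name>.npy` / `label_<name>.npy` alongside the extracted patches. A small sketch of both sides of that contract, with illustrative paths and shapes:

```
# Illustrative only: the directory, base name, and array shapes below are
# hypothetical stand-ins (CoNSeP-style 1000x1000 RGB image, 2-channel
# [instance, type] label map).
import glob
import os
import numpy as np

out_dir = "CoNSeP_prepared/Test"  # hypothetical; the script derives this from its config
os.makedirs(out_dir, exist_ok=True)
base_name = "test_1"

# What the new lines in prepare_patches.py save.
np.save("{0}/image_{1}.npy".format(out_dir, base_name), np.zeros((1000, 1000, 3), dtype=np.uint8))
np.save("{0}/label_{1}.npy".format(out_dir, base_name), np.zeros((1000, 1000, 2), dtype=np.int32))

# What configs/evaluate.json reconstructs from them via its glob expressions.
val_images = sorted(glob.glob(out_dir + "/image*.npy"))
val_labels = sorted(glob.glob(out_dir + "/label*.npy"))
data_list = [{"image": i, "label": j} for i, j in zip(val_images, val_labels)]
print(data_list)
```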