monai / medical
katielink committed on
Commit 9984ad0
1 Parent(s): 2ee6428

add lesion FROC calculation and wsi_reader

README.md CHANGED
@@ -29,9 +29,9 @@ Annotation information are adopted from [NCRF/jsons](https://github.com/baidu-re
 
 ### Data Preparation
 
-This MMAR expects the training/validation data (whole slide images) reside in `$DATA_ROOT/training/images`. By default `$DATA_ROOT` is pointing to `/workspace/data/medical/pathology/` You can easily modify `$DATA_ROOT` to point to a different directory in `config/environment.json`.
+This bundle expects the training/validation data (whole slide images) to reside in `{data_root}/training/images`. By default, `data_root` points to `/workspace/data/medical/pathology/`. You can modify `data_root` in the bundle config files to point to a different directory.
 
-To reduce the computation burden during the inference, patches are extracted only where there is tissue and ignoring the background according to a tissue mask. You should run `prepare_inference_data.sh` prior to the inference to generate foreground masks, where the input is the whole slide test images and the output is the foreground masks. Please also create a directory for prediction output, aligning with the one specified with `$MMAR_EVAL_OUTPUT_PATH` in `config/environment.json` (e.g. `/eval`)
+To reduce the computational burden during inference, patches are extracted only where there is tissue, ignoring the background according to a tissue mask. Please also create a directory for the prediction output; by default, `output_dir` is set to the `eval` folder under the bundle root.
 
 Please refer to "Annotation" section of [Camelyon challenge](https://camelyon17.grand-challenge.org/Data/) to prepare ground truth images, which are needed for FROC computation. By default, this data set is expected to be at `/workspace/data/medical/pathology/ground_truths`. But it can be modified in `evaluate_froc.sh`.
 
@@ -39,13 +39,14 @@ Please refer to "Annotation" section of [Camelyon challenge](https://camelyon17.
 
 The training was performed with the following:
 
-- Script: train.sh
+- Config file: train.config
 - GPU: at least 16 GB of GPU memory.
 - Actual Model Input: 224 x 224 x 3
 - AMP: True
 - Optimizer: Novograd
 - Learning Rate: 1e-3
 - Loss: BCEWithLogitsLoss
+- Whole slide image reader: cuCIM (if running on Windows or Mac, please install `OpenSlide` on your system and change `wsi_reader` to "OpenSlide")
 
 ## Input
 
@@ -104,21 +105,12 @@ Export checkpoint to TorchScript file:
 
 TorchScript conversion is currently not supported.
 
-# Intended Use
-
-The model needs to be used with NVIDIA hardware and software. For hardware, the model can run on any NVIDIA GPU with memory greater than 16 GB. For software, this model is usable only as part of Transfer Learning & Annotation Tools in Clara Train SDK container. Find out more about Clara Train at the [Clara Train Collections on NGC](https://ngc.nvidia.com/catalog/collections/nvidia:claratrainframework).
-
-**The pre-trained models are for developmental purposes only and cannot be used directly for clinical procedures.**
-
-# License
-
-[End User License Agreement](https://developer.nvidia.com/clara-train-eula) is included with the product. Licenses are also available along with the model application zip file. By pulling and using the Clara Train SDK container and downloading models, you accept the terms and conditions of these licenses.
-
 # References
 
 [1] He, Kaiming, et al, "Deep Residual Learning for Image Recognition." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770-778. 2016. <https://arxiv.org/pdf/1512.03385.pdf>
 
 # License
+
 Copyright (c) MONAI Consortium
 
 Licensed under the Apache License, Version 2.0 (the "License");
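
Note (not part of the commit): the new `wsi_reader` option corresponds to the backends of MONAI's `WSIReader`. A minimal sketch of what the backend switch means in code, with a placeholder slide path:

```python
from monai.data import WSIReader

# cuCIM is the bundle default; OpenSlide also runs on Windows and Mac.
reader = WSIReader(backend="openslide")  # or backend="cucim"
wsi = reader.read("path/to/slide.tif")   # placeholder path
# Read one 224 x 224 patch at the highest-resolution level (level 0).
patch, meta = reader.get_data(wsi, location=(0, 0), size=(224, 224), level=0)
print(patch.shape)  # channel-first RGB patch, e.g. (3, 224, 224)
```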
configs/inference.json CHANGED
@@ -7,6 +7,7 @@
     "output_dir": "$os.path.join(@bundle_root, 'eval')",
     "dataset_dir": "/workspace/data/medical/pathology",
     "testing_file": "$os.path.join(@bundle_root, 'testing.csv')",
+    "wsi_reader": "cuCIM",
     "patch_size": [
         224,
         224
@@ -63,7 +64,8 @@
         "data": "@datalist",
         "mask_level": 6,
         "patch_size": "@patch_size",
-        "transform": "@preprocessing"
+        "transform": "@preprocessing",
+        "reader": "@wsi_reader"
     },
     "dataloader": {
         "_target_": "DataLoader",
configs/metadata.json CHANGED
@@ -1,7 +1,8 @@
 {
     "schema": "https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/meta_schema_20220324.json",
-    "version": "0.3.3",
+    "version": "0.4.0",
     "changelog": {
+        "0.4.0": "add lesion FROC calculation and wsi_reader",
         "0.3.3": "update to use monai 1.0.1",
         "0.3.2": "enhance readme on commands example",
         "0.3.1": "fix license Copyright error",
configs/train.json CHANGED
@@ -12,6 +12,7 @@
     "training_file": "$os.path.join(@bundle_root, 'training.csv')",
     "validation_file": "$os.path.join(@bundle_root, 'validation.csv')",
     "data_root": "/workspace/data/medical/pathology",
+    "wsi_reader": "cuCIM",
     "region_size": [
         768,
         768
@@ -166,7 +167,7 @@
         "data": "@train#datalist",
         "patch_level": 0,
         "patch_size": "@region_size",
-        "reader": "cucim",
+        "reader": "@wsi_reader",
         "transform": "@train#preprocessing"
     },
     "dataloader": {
@@ -317,7 +318,7 @@
         "data": "@validate#datalist",
         "patch_level": 0,
         "patch_size": "@region_size",
-        "reader": "cucim",
+        "reader": "@wsi_reader",
         "transform": "@validate#preprocessing"
     },
     "dataloader": {
docs/README.md CHANGED
@@ -22,9 +22,9 @@ Annotation information are adopted from [NCRF/jsons](https://github.com/baidu-re
 
 ### Data Preparation
 
-This MMAR expects the training/validation data (whole slide images) reside in `$DATA_ROOT/training/images`. By default `$DATA_ROOT` is pointing to `/workspace/data/medical/pathology/` You can easily modify `$DATA_ROOT` to point to a different directory in `config/environment.json`.
+This bundle expects the training/validation data (whole slide images) to reside in `{data_root}/training/images`. By default, `data_root` points to `/workspace/data/medical/pathology/`. You can modify `data_root` in the bundle config files to point to a different directory.
 
-To reduce the computation burden during the inference, patches are extracted only where there is tissue and ignoring the background according to a tissue mask. You should run `prepare_inference_data.sh` prior to the inference to generate foreground masks, where the input is the whole slide test images and the output is the foreground masks. Please also create a directory for prediction output, aligning with the one specified with `$MMAR_EVAL_OUTPUT_PATH` in `config/environment.json` (e.g. `/eval`)
+To reduce the computational burden during inference, patches are extracted only where there is tissue, ignoring the background according to a tissue mask. Please also create a directory for the prediction output; by default, `output_dir` is set to the `eval` folder under the bundle root.
 
 Please refer to "Annotation" section of [Camelyon challenge](https://camelyon17.grand-challenge.org/Data/) to prepare ground truth images, which are needed for FROC computation. By default, this data set is expected to be at `/workspace/data/medical/pathology/ground_truths`. But it can be modified in `evaluate_froc.sh`.
 
@@ -32,13 +32,14 @@ Please refer to "Annotation" section of [Camelyon challenge](https://camelyon17.
 
 The training was performed with the following:
 
-- Script: train.sh
+- Config file: train.config
 - GPU: at least 16 GB of GPU memory.
 - Actual Model Input: 224 x 224 x 3
 - AMP: True
 - Optimizer: Novograd
 - Learning Rate: 1e-3
 - Loss: BCEWithLogitsLoss
+- Whole slide image reader: cuCIM (if running on Windows or Mac, please install `OpenSlide` on your system and change `wsi_reader` to "OpenSlide")
 
 ## Input
 
@@ -97,21 +98,12 @@ Export checkpoint to TorchScript file:
 
 TorchScript conversion is currently not supported.
 
-# Intended Use
-
-The model needs to be used with NVIDIA hardware and software. For hardware, the model can run on any NVIDIA GPU with memory greater than 16 GB. For software, this model is usable only as part of Transfer Learning & Annotation Tools in Clara Train SDK container. Find out more about Clara Train at the [Clara Train Collections on NGC](https://ngc.nvidia.com/catalog/collections/nvidia:claratrainframework).
-
-**The pre-trained models are for developmental purposes only and cannot be used directly for clinical procedures.**
-
-# License
-
-[End User License Agreement](https://developer.nvidia.com/clara-train-eula) is included with the product. Licenses are also available along with the model application zip file. By pulling and using the Clara Train SDK container and downloading models, you accept the terms and conditions of these licenses.
-
 # References
 
 [1] He, Kaiming, et al, "Deep Residual Learning for Image Recognition." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770-778. 2016. <https://arxiv.org/pdf/1512.03385.pdf>
 
 # License
+
 Copyright (c) MONAI Consortium
 
 Licensed under the Apache License, Version 2.0 (the "License");
scripts/evaluate_froc.sh ADDED
@@ -0,0 +1,20 @@
+#!/usr/bin/env bash
+
+LEVEL=6
+SPACING=0.243
+READER=openslide
+EVAL_DIR=../eval
+GROUND_TRUTH_DIR=/workspace/data/medical/pathology/ground_truths
+
+echo "=> Level: ${LEVEL}"
+echo "=> Spacing: ${SPACING}"
+echo "=> WSI Reader: ${READER}"
+echo "=> Evaluation output directory: ${EVAL_DIR}"
+echo "=> Ground truth directory: ${GROUND_TRUTH_DIR}"
+
+python3 ./lesion_froc.py \
+    --level $LEVEL \
+    --spacing $SPACING \
+    --reader $READER \
+    --eval-dir ${EVAL_DIR} \
+    --ground-truth-dir ${GROUND_TRUTH_DIR}
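
For context: with MONAI's defaults, `LesionFROC.evaluate` reports the CAMELYON16-style FROC score, i.e. the mean lesion-level sensitivity over six false-positive rates per whole slide image:

```latex
% CAMELYON16 FROC score: average sensitivity at 1/4, 1/2, 1, 2, 4, and 8
% false positives per whole slide image.
\mathrm{FROC} = \frac{1}{6} \sum_{f \in \{1/4,\, 1/2,\, 1,\, 2,\, 4,\, 8\}} \mathrm{sensitivity}(f)
```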
scripts/lesion_froc.py ADDED
@@ -0,0 +1,56 @@
+import argparse
+import os
+
+from monai.apps.pathology import LesionFROC
+
+
+def full_path(dir: str, file: str):
+    return os.path.normpath(os.path.join(dir, file))
+
+
+def load_data(ground_truth_dir: str, eval_dir: str, level: int, spacing: float):
+    # Get the list of probability map result files
+    prob_files = os.listdir(eval_dir)
+
+    # Read the results and create an eval_dataset based on them.
+    eval_dataset = []
+    for prob_name in prob_files:
+        if prob_name.endswith(".npy"):
+            sample = {
+                "tumor_mask": full_path(ground_truth_dir, prob_name.replace("npy", "tif")),
+                "prob_map": full_path(eval_dir, prob_name),
+                "level": level,
+                "pixel_spacing": spacing,
+            }
+
+            eval_dataset.append(sample)
+
+    return eval_dataset
+
+
+def evaluate_froc(data, reader):
+    lesion_froc = LesionFROC(data, image_reader_name=reader)
+    score = lesion_froc.evaluate()
+    return score
+
+
+if __name__ == "__main__":
+    # Parse command line arguments
+    parser = argparse.ArgumentParser()
+    parser.add_argument("-s", "--spacing", type=float, default=0.243, dest="spacing")
+    parser.add_argument("-l", "--level", type=int, default=6, dest="level")
+    parser.add_argument("-r", "--reader", type=str, default="cucim", dest="reader")
+    parser.add_argument("-e", "--eval-dir", type=str, dest="eval_dir")
+    parser.add_argument("-g", "--ground-truth-dir", type=str, dest="ground_truth_dir")
+    args = parser.parse_args()
+
+    # Prepare FROC input data
+    data = load_data(args.ground_truth_dir, args.eval_dir, args.level, args.spacing)
+    if len(data) < 1:
+        raise RuntimeError(f"No probability map result found in '{args.eval_dir}' with '.npy' extension.")
+
+    # Evaluate FROC
+    score = evaluate_froc(data, args.reader)
+    with open(full_path(args.eval_dir, "froc_score.txt"), "w") as f:
+        f.write(f"FROC Score: {score}\n")
+    print(f"FROC Score: {score}")