{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## EfficientSAM Adapter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### A Quick Overview\n",
"\n",
"<img width=\"880\" height=\"380\" src=\"../figs/EfficientSAM/EfficientSAM.png\">\n",
"\n",
"The huge computation cost of [SAM](https://github.com/facebookresearch/segment-anything) model has limited its applications to wider real-world applications. To address this limitation, [EfficientSAMs](https://github.com/yformer/EfficientSAM) provide lightweight SAM models that exhibits decent performance with largely reduced complexity. This is based on leveraging SAM-leveraged masked image pertraining (SAMI), which learns to reconstruct features from SAM image encoder for effective visual representation learning.\n",
"\n",
"Our goal is to integrate medical specific domain knowledge into the lightweight EfficientSAM model through adaptation technique. Therefore, we only utilize the pre-trained EfficientSAM weights without reperforming the SAMI process. Like our original [Medical SAM Adapter](https://arxiv.org/abs/2304.12620), we achieve efficient migration from the original SAM to medical images by adding simple Adapters in the network."
]
},
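{
"cell_type": "markdown",
"metadata": {},
"source": [
"For intuition, below is a minimal sketch of the kind of bottleneck Adapter commonly inserted into transformer blocks. It is illustrative only: the class name `Adapter` and the `bottleneck_dim` argument are assumptions, not this repository's actual API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch of a bottleneck Adapter (illustrative, not the repo's exact code).\n",
"# It down-projects features, applies a nonlinearity, up-projects, and adds a\n",
"# residual connection, so the frozen backbone is only lightly perturbed.\n",
"import torch\n",
"import torch.nn as nn\n",
"\n",
"class Adapter(nn.Module):\n",
"    def __init__(self, dim: int, bottleneck_dim: int = 64):  # bottleneck_dim is an assumption\n",
"        super().__init__()\n",
"        self.down = nn.Linear(dim, bottleneck_dim)\n",
"        self.act = nn.GELU()\n",
"        self.up = nn.Linear(bottleneck_dim, dim)\n",
"\n",
"    def forward(self, x: torch.Tensor) -> torch.Tensor:\n",
"        return x + self.up(self.act(self.down(x)))"
]
},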
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training\n",
"We have unified the interfaces of EfficientSAM and SAM, and training EfficientSAM Adapter can be done using the same command as the SAM adapter:\n",
"``python train.py -net efficient_sam -data_path data/isic -sam_ckpt checkpoint/efficient_sam/efficient_sam_vits.pt -image_size 1024 -vis 100 -mod sam_adpt``\n",
"\n",
"The pretrained weight of EfficientSAM can be downloaded here:\n",
"| EfficientSAM-S | EfficientSAM-Ti |\n",
"|------------------------------|------------------------------|\n",
"| [Download](https://github.com/yformer/EfficientSAM/blob/main/weights/efficient_sam_vits.pt.zip) |[Download](https://github.com/yformer/EfficientSAM/blob/main/weights/efficient_sam_vitt.pt)|"
]
},
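{
"cell_type": "markdown",
"metadata": {},
"source": [
"With `-mod sam_adpt`, the idea of adapter training is that the pretrained weights stay frozen and only the added Adapter parameters are updated. Below is a hedged sketch of such freezing logic; the `\"adapter\"` name filter and the function name are assumptions, not this repository's implementation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hedged sketch of adapter-only fine-tuning (not the repository's actual code):\n",
"# freeze every pretrained parameter and train only those whose names suggest\n",
"# they belong to an Adapter module (the name filter is an assumption).\n",
"import torch.nn as nn\n",
"\n",
"def freeze_all_but_adapters(model: nn.Module) -> None:\n",
"    for name, param in model.named_parameters():\n",
"        param.requires_grad = \"adapter\" in name.lower()"
]
},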
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Performance VS SAM \n",
"**Setup**: Using a single Nvidia RTX 3090 GPU, the batch_size was set to 2. The resolution of input image is 1024.\n",
"\n",
"#### ISIC\n",
"| Baseline | Backbone | DICE | mIou | Memory |\n",
"| ------------ | --------- | ------ | ---- | ------- |\n",
"| SAM | VIT-Base | 0.9225 | 0.8646 | 22427 M |\n",
"| EfficientSAM | VIT-Small | 0.9091 | 0.8463 | 21275 M | \n",
"| EfficientSAM | VIT-Tiny | 0.9095 | 0.8437 | 15713 M |\n",
"\n",
"#### REFUGE\n",
"| Baseline | Backbone | DICE | mIou | Memory |\n",
"| ------------ | --------- | ------ | ---- | ------- |\n",
"| SAM | VIT-Base | 0.9085 | 0.8423 | 22427 M |\n",
"| EfficientSAM | VIT-Small | 0.8691 | 0.7915 | 21275 M | \n",
"| EfficientSAM | VIT-Tiny | 0.7999 | 0.6949 | 15713 M |"
]
},
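{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the Dice and mIoU values above can be computed from binary masks roughly as follows. This is a generic sketch, not the evaluation code used in this repository."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Generic sketch of Dice / IoU for binary masks (not the repo's evaluation code).\n",
"import torch\n",
"\n",
"def dice_and_iou(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):\n",
"    \"\"\"pred, target: (N, H, W) tensors with values in {0, 1}.\"\"\"\n",
"    pred, target = pred.float(), target.float()\n",
"    inter = (pred * target).sum(dim=(1, 2))\n",
"    p, t = pred.sum(dim=(1, 2)), target.sum(dim=(1, 2))\n",
"    dice = (2 * inter + eps) / (p + t + eps)\n",
"    iou = (inter + eps) / (p + t - inter + eps)\n",
"    return dice.mean().item(), iou.mean().item()"
]
},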
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Performance Under different resolution \n",
"**Setup**: Using a single Nvidia RTX 3090 GPU, the batch_size was set to 2. Using VIT-small as the backbone.\n",
"\n",
"#### ISIC\n",
"| Resolution | DICE | mIou | Memory |\n",
"| --------- | ------ | ---- | ------- |\n",
"| 1024 | 0.9091 | 0.8463 | 22427 M |\n",
"| 512 | 0.9134 | 0.8495 | 21275 M | \n",
"| 256 | 0.9080 | 0.8419 | 15713 M |\n",
"\n",
"#### REFUGE\n",
"| Resolution | DICE | mIou | Memory |\n",
"| --------- | ------ | ---- | ------- |\n",
"| 1024 | 0.8691 | 0.7915 | 22427 M |\n",
"| 512 | 0.8564 | 0.7673 | 21275 M | \n",
"| 256 | 0.7432 | 0.6244 | 15713 M |"
]
},
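{
"cell_type": "markdown",
"metadata": {},
"source": [
"To reproduce the lower-resolution rows, the same training command can be reused with a smaller `-image_size` (all other flags as above):\n",
"``python train.py -net efficient_sam -data_path data/isic -sam_ckpt checkpoint/efficient_sam/efficient_sam_vits.pt -image_size 512 -vis 100 -mod sam_adpt``"
]
},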
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The decreasing curve of loss and the performance curve.\n",
"#### Backbone: VIT-Small\n",
"<p float=\"left\">\n",
" <img src=\"../figs/EfficientSAM/EfficientSAM-S (ISIC)_loss.png\" width=\"200\" />\n",
" \n",
" <img src=\"../figs/EfficientSAM/EfficientSAM-S (ISIC)_performance.png\" width=\"200\" /> \n",
" \n",
"</p>\n",
"<p float=\"left\">\n",
" <img src=\"../figs/EfficientSAM/EfficientSAM-S (REFUGE)_loss.png\" width=\"200\" /> \n",
" \n",
" <img src=\"../figs/EfficientSAM/EfficientSAM-S (REFUGE)_performance.png\" width=\"200\" /> \n",
"</p>\n",
"\n",
"#### Backbone: VIT-Tiny\n",
"<p float=\"left\">\n",
" <img src=\"../figs/EfficientSAM/EfficientSAM-Ti (ISIC)_loss.png\" width=\"200\" />\n",
" \n",
" <img src=\"../figs/EfficientSAM/EfficientSAM-Ti (ISIC)_performance.png\" width=\"200\" /> \n",
"</p>\n",
"\n",
"<p float=\"left\">\n",
" <img src=\"../figs/EfficientSAM/EfficientSAM-Ti (REFUGE)_loss.png\" width=\"200\" /> \n",
" \n",
" <img src=\"../figs/EfficientSAM/EfficientSAM-Ti (REFUGE)_performance.png\" width=\"200\" /> \n",
"</p>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.7.16 ('general')",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.7.16"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "e7f99538a81e8449c1b1a4a7141984025c678b5d9c33981aa2a3c129d8e1c90d"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}