Committed model weights and demo code
- LICENSE.md +9 -0
- README.md +51 -0
- autoencoders_demo.ipynb +0 -0
- config.json +28 -0
- data_preprocessing_recipe.py +202 -0
- data_utils.py +24 -0
- diffusion_pytorch_model.safetensors +3 -0
- example_data/mri_complex_images.npz +3 -0
- inference.py +28 -0
- metrics.py +30 -0
LICENSE.md
ADDED
@@ -0,0 +1,9 @@
+MIT License
+
+Copyright (c) Microsoft Corporation
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
README.md
ADDED
@@ -0,0 +1,51 @@
+---
+license: mit
+language:
+- en
+library_name: diffusers
+tags:
+- MRI
+- medical-imaging
+- VAE
+- autoencoder
+---
+# MRI Autoencoder v0.1
+
+## Model
+The MRI autoencoder is a Variational Autoencoder (VAE) trained on the fastMRI multi-coil brain and knee datasets. The model is trained from scratch and uses the same architecture as the Stable Diffusion SDXL VAE.
+
+Latent Diffusion Models (LDMs) have become extremely popular for synthesizing images and videos, yet they remain relatively under-explored in medical imaging. One possible reason is the lack of domain-specific autoencoders that can encode and decode high-dimensional medical imaging data to and from a lower-dimensional latent representation. MRI images, for example, differ from general-domain images in that they are complex valued, carrying both magnitude and phase information. To this end, we are publishing an autoencoder that can encode and decode complex-valued MRI images to and from their latent representation.
+
+## Use
+
+```
+from diffusers.models import AutoencoderKL
+autoencoder = AutoencoderKL.from_pretrained("microsoft/mri-autoencoder-v0.1")
+```
+
+For more details, please refer to the provided autoencoders_demo notebook. For details on how the fastMRI data was preprocessed, please refer to data_preprocessing_recipe.py.
+
+## Intended Use
+
+The model is intended solely for future research in medical imaging. Stakeholders would benefit by treating this model as a building block for exploring latent-space generative models applied to complex-valued MRI images.
+
+## Out-of-Scope Use
+
+Any deployed use of the model, commercial or otherwise, is out of scope. The model weights and code are not intended for clinical use.
+
+## Evaluation
+
+PSNR and SSIM scores on 8000 randomly chosen slices from the fastMRI multi-coil validation dataset are as follows:
+
+| Autoencoder | Median PSNR | Mean PSNR | PSNR 95% CI | Median SSIM | Mean SSIM | SSIM 95% CI |
+| ----------- | ----------- | --------- | ----------- | ----------- | --------- | ----------- |
+| MRI-AUTOENCODER-v0.1 | 34.31 | 33.98 | (28.55, 37.79) | 0.91 | 0.88 | (0.54, 0.97) |
+| SDXL-VAE | 31.45 | 31.51 | (27.85, 35.63) | 0.89 | 0.86 | (0.58, 0.94) |
+
+## Data
+
+This model was trained, with permission, on the NYU fastMRI Dataset (https://fastmri.med.nyu.edu/), a deidentified imaging dataset provided by NYU Langone comprising raw k-space data in several sub-dataset groups.
+
+## Limitations
+
+A model trained on this dataset may overfit and fail to generalize well to new data. This model has not been evaluated for clinical use or across a range of scanner types.
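As a minimal sketch of the two-channel convention the model expects (real and imaginary parts stacked as channels, mirroring the helpers in data_utils.py below; the random array is a stand-in for a real preprocessed slice):

```
import numpy as np
import torch
from diffusers.models import AutoencoderKL

autoencoder = AutoencoderKL.from_pretrained("microsoft/mri-autoencoder-v0.1")
# Stand-in complex slice; real data would come from the preprocessing recipe.
img = np.random.randn(256, 256) + 1j * np.random.randn(256, 256)
x = torch.from_numpy(np.stack([img.real, img.imag])[None]).float()  # [1, 2, 256, 256]
with torch.no_grad():
    latents = autoencoder.encode(x).latent_dist.mean
```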
autoencoders_demo.ipynb
ADDED
The diff for this file is too large to render.
See raw diff
config.json
ADDED
@@ -0,0 +1,28 @@
+{
+  "_class_name": "AutoencoderKL",
+  "_diffusers_version": "0.24.0",
+  "act_fn": "silu",
+  "block_out_channels": [
+    128,
+    256,
+    512
+  ],
+  "down_block_types": [
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D"
+  ],
+  "force_upcast": true,
+  "in_channels": 2,
+  "latent_channels": 4,
+  "layers_per_block": 2,
+  "norm_num_groups": 32,
+  "out_channels": 2,
+  "sample_size": 256,
+  "scaling_factor": 0.18215,
+  "up_block_types": [
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D"
+  ]
+}
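With three entries in block_out_channels, the diffusers AutoencoderKL encoder downsamples twice (4x overall), so a 256x256 two-channel input should map to a 4-channel 64x64 latent. A quick shape check, assuming the weights load successfully:

```
import torch
from diffusers.models import AutoencoderKL

autoencoder = AutoencoderKL.from_pretrained("microsoft/mri-autoencoder-v0.1")
x = torch.randn(1, 2, 256, 256)  # in_channels = 2 (real and imaginary)
with torch.no_grad():
    latents = autoencoder.encode(x).latent_dist.mean
    recon = autoencoder.decode(latents).sample
print(latents.shape, recon.shape)  # expected: [1, 4, 64, 64] and [1, 2, 256, 256]
```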
data_preprocessing_recipe.py
ADDED
@@ -0,0 +1,202 @@
+''' This file contains the recipe for data preprocessing used to generate the combined coil images from the fastMRI multi-coil brain and knee datasets.
+These combined coil images were then used to train the autoencoder. The combined coil images are generated by combining the coil images using
+the sensitivity maps calculated with BART. To run this recipe, the BART toolbox needs to be installed; then follow the steps outlined in
+the preprocess_data_recipe function.'''
+
+
+# bart toolbox installation instructions - https://mrirecon.github.io/bart/installation.html
+_BART_TOOLBOX_PATH = ''
+
+import numpy as np
+import h5py
+from tqdm import tqdm
+import sys, os
+
+os.environ["TOOLBOX_PATH"] = _BART_TOOLBOX_PATH
+sys.path.append(os.path.join(_BART_TOOLBOX_PATH, 'python'))
+from bart import bart
+os.environ["OMP_NUM_THREADS"] = "1"
+
+def fftc(input, axes=None, norm='ortho'):
+    """
+    Perform a centered Fast Fourier Transform on the input array.
+
+    Parameters:
+    input (numpy.ndarray): The input array to transform.
+    axes (tuple, optional): Axes over which to compute the FFT. If not specified, compute over all axes.
+    norm (str, optional): Normalization mode. Default is 'ortho' for an orthonormal transform.
+
+    Returns:
+    numpy.ndarray: The transformed output array.
+    """
+    tmp = np.fft.ifftshift(input, axes=axes)
+    tmp = np.fft.fftn(tmp, axes=axes, norm=norm)
+    output = np.fft.fftshift(tmp, axes=axes)
+    return output
+
+def ifftc(input, axes=None, norm='ortho'):
+    """
+    Perform a centered Inverse Fast Fourier Transform on the input array.
+
+    Parameters:
+    input (numpy.ndarray): The input array to transform.
+    axes (tuple, optional): Axes over which to compute the inverse FFT. If not specified, compute over all axes.
+    norm (str, optional): Normalization mode. Default is 'ortho' for an orthonormal transform.
+
+    Returns:
+    numpy.ndarray: The transformed output array.
+    """
+    tmp = np.fft.ifftshift(input, axes=axes)
+    tmp = np.fft.ifftn(tmp, axes=axes, norm=norm)
+    output = np.fft.fftshift(tmp, axes=axes)
+    return output
+
+def adjoint(ksp, maps, mask):
+    """
+    Perform the adjoint operation on k-space data with coil sensitivity maps and a mask.
+
+    Parameters:
+    ksp (numpy.ndarray): The input k-space data, shape: [1, C, H, W].
+    maps (numpy.ndarray): The coil sensitivity maps, shape: [1, C, H, W].
+    mask (numpy.ndarray): The mask to apply to the k-space data, shape: [1, 1, H, W].
+
+    Returns:
+    numpy.ndarray: The output image after applying the adjoint operation, shape: [1, 1, H, W].
+    """
+    masked_ksp = ksp * mask
+    coil_imgs = ifftc(masked_ksp, axes=(-2, -1))
+    img_out = np.sum(coil_imgs * np.conj(maps), axis=1)[:, None, ...]
+    return img_out
+
+def _expand_shapes(*shapes):
+    """
+    Expand the dimensions of the given shapes to match the maximum dimension.
+
+    This function prepends 1s to the shapes with fewer dimensions to match the maximum number of dimensions.
+
+    Parameters:
+    *shapes (tuple): A variable-length tuple containing shapes (as lists or tuples of integers).
+
+    Returns:
+    tuple: A tuple of expanded shapes, where each shape is a list of integers.
+    """
+    shapes = [list(shape) for shape in shapes]
+    max_ndim = max(len(shape) for shape in shapes)
+    shapes_exp = [[1] * (max_ndim - len(shape)) + shape
+                  for shape in shapes]
+
+    return tuple(shapes_exp)
+
+def resize(input, oshape, ishift=None, oshift=None):
+    """
+    Resize with zero-padding or cropping.
+
+    Parameters:
+    input (array): Input array.
+    oshape (tuple of ints): Output shape.
+    ishift (None or tuple of ints): Input shift.
+    oshift (None or tuple of ints): Output shift.
+
+    Returns:
+    array: Zero-padded or cropped result.
+    """
+    ishape1, oshape1 = _expand_shapes(input.shape, oshape)
+
+    if ishape1 == oshape1:
+        return input.reshape(oshape)
+
+    if ishift is None:
+        ishift = [max(i // 2 - o // 2, 0) for i, o in zip(ishape1, oshape1)]
+
+    if oshift is None:
+        oshift = [max(o // 2 - i // 2, 0) for i, o in zip(ishape1, oshape1)]
+
+    copy_shape = [min(i - si, o - so)
+                  for i, si, o, so in zip(ishape1, ishift, oshape1, oshift)]
+    islice = tuple([slice(si, si + c) for si, c in zip(ishift, copy_shape)])
+    oslice = tuple([slice(so, so + c) for so, c in zip(oshift, copy_shape)])
+
+    output = np.zeros(oshape1, dtype=input.dtype)
+    input = input.reshape(ishape1)
+    output[oslice] = input[islice]
+
+    return output.reshape(oshape)
+
+def shape_data(ksp, final_res):
+    """
+    Reshape coil k-space data to output coil images with isotropic pixels, a field of view equal to the original image width, and a square image size given by final_res.
+
+    This function assumes that the k-space data has already been padded so that the corresponding images have isotropic pixels.
+
+    Parameters:
+    ksp (numpy.ndarray): The input coil k-space data, shape: [S, C, H, W].
+    final_res (int): The final resolution for the output image.
+
+    Returns:
+    numpy.ndarray: The output image after reshaping, shape: [S, C, final_res, final_res].
+    """
+    H = ksp.shape[-2]
+    W = ksp.shape[-1]
+    S = ksp.shape[0]
+    C = ksp.shape[1]
+    # bring the coil k-space into coil image space
+    img1 = ifftc(ksp, axes=(-2, -1))
+    img1_cropped = resize(img1, oshape=(S, C, W, W))
+    # FOV is now the same in both directions without modifying the resolution
+    ksp1 = fftc(img1_cropped, axes=(-2, -1))
+    # crop or pad the k-space isotropically in Fourier space to the correct image size while maintaining the same field of view (in the width direction) as the original image
+    ksp1_cropped = resize(ksp1, oshape=(S, C, final_res, final_res))
+    img_out = ifftc(ksp1_cropped, axes=(-2, -1))
+
+    return img_out
+
+def read_fastmri_data(file_path):
+    """
+    Read k-space data from a .h5 file.
+
+    Parameters:
+    file_path (str): The path to the .h5 file containing fastMRI data.
+
+    Returns:
+    numpy.ndarray: The k-space data as a numpy array.
+    """
+    hf = h5py.File(file_path, 'r')
+    ksp = np.asarray(hf['kspace'])
+    return ksp
+
+def combine_coils(ksp):
+    """
+    Combine multi-coil k-space data into a single coil image.
+
+    This function reshapes the raw multi-coil k-space data, calculates sensitivity maps for the reshaped data using the BART 'ecalib' command, and then uses these maps to create a single coil image via a fully sampled adjoint operation.
+
+    Parameters:
+    ksp (numpy.ndarray): The input multi-coil k-space data, shape: [B, C, H, W].
+
+    Returns:
+    numpy.ndarray: The output single coil image, shape: [B, 1, H, W].
+    """
+    # reshape raw multi-coil k-space to the desired shape (e.g. [B, C, 256, 256])
+    coil_img_rs = shape_data(ksp, final_res=256)
+    coil_ksp_rs = fftc(coil_img_rs, axes=(-2, -1))
+
+    # calculate sensitivity maps for the reshaped coil k-space
+    ksp_rs = coil_ksp_rs.transpose((2, 3, 0, 1))
+    maps = np.array(ksp_rs)
+    # calculate ESPIRiT maps with BART
+    for j in tqdm(range(ksp_rs.shape[2])):
+        # ecalib requires (and returns) data of the form (Row, Column, None, Coil);
+        # the result is then stored as (rows, columns, slice, coil)
+        sens = bart(1, 'ecalib -m1 -W -c0', ksp_rs[:, :, j, None, :])
+        maps[:, :, j, :] = sens[:, :, 0, :]
+
+    maps_rs = maps.transpose((2, 3, 0, 1))
+    # use the new maps to create a single coil image via a fully sampled adjoint operation
+    single_coil_rs_img = adjoint(ksp=coil_ksp_rs, maps=maps_rs, mask=np.ones_like(coil_ksp_rs))
+    return single_coil_rs_img
+
+def preprocess_data_recipe(data_dir, out_dir):
+    # for each file in the fastMRI dataset:
+    # call read_fastmri_data to get the k-space data,
+    # call combine_coils to create the combined coil image, and save it
+    for fname in sorted(os.listdir(data_dir)):
+        if not fname.endswith('.h5'):
+            continue
+        ksp = read_fastmri_data(os.path.join(data_dir, fname))
+        np.save(os.path.join(out_dir, fname.replace('.h5', '.npy')), combine_coils(ksp))
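A hypothetical invocation of the recipe (the directory paths are placeholders, and BART plus the downloaded fastMRI .h5 files must be available locally):

```
from data_preprocessing_recipe import preprocess_data_recipe

# Placeholder paths; point these at the downloaded fastMRI files and an output directory.
preprocess_data_recipe(data_dir='fastmri_multicoil/', out_dir='combined_images/')
```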
data_utils.py
ADDED
@@ -0,0 +1,24 @@
+import numpy as np
+
+def complex_to_two_channel_image(complex_img: np.ndarray) -> np.ndarray:
+    """Converts a complex-valued image to a 2 channel image (real and imaginary channels)"""
+    real, imag = np.real(complex_img), np.imag(complex_img)
+    return np.concatenate((real, imag), axis=0)
+
+def two_channel_to_complex_image(two_ch_img: np.ndarray) -> np.ndarray:
+    """Converts a 2 channel image (real and imaginary channels) to a complex-valued image"""
+    two_ch_img = two_ch_img[0]
+    real = two_ch_img[0]
+    imag = two_ch_img[1]
+    complex_image = real + 1j * imag
+    return complex_image[None, ...]
+
+def normalize_complex_coil_image(complex_coil_img: np.ndarray) -> np.ndarray:
+    """Scales the complex-valued coil image by its 99.5th-percentile magnitude"""
+    max_val = np.percentile(np.abs(complex_coil_img), 99.5)
+    return complex_coil_img / max_val
+
+def create_three_channel_image(complex_coil_img: np.ndarray) -> np.ndarray:
+    """Converts a complex-valued coil image to a 3 channel image (magnitude channel repeated 3 times)"""
+    mag = np.abs(complex_coil_img)
+    return np.concatenate((mag, mag, mag), axis=0)
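A round-trip sanity check for these helpers (a sketch; the random [1, H, W] array stands in for a real complex slice):

```
import numpy as np
import data_utils as du

img = np.random.randn(1, 256, 256) + 1j * np.random.randn(1, 256, 256)
normalized = du.normalize_complex_coil_image(img)
two_ch = du.complex_to_two_channel_image(normalized)      # [2, H, W]
restored = du.two_channel_to_complex_image(two_ch[None])  # [1, H, W]; note the added batch dim
assert np.allclose(restored, normalized)
```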
diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4b07cfaad692d5e60669b5bfe0432de71eb11ad6925bf9e0bd333b69d15c5e62
+size 221317280
example_data/mri_complex_images.npz
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:69a6b217303b2147957ac2f1dbcee976b5ed509621acd94ca55110a1c8f02e5c
+size 2022432
inference.py
ADDED
@@ -0,0 +1,28 @@
+import torch
+import data_utils as du
+
+def run_inference_two_channels(coil_complex_image, autoencoder, device="cuda"):
+    # normalize the complex image and split it into real and imaginary channels
+    coil_complex_image = du.normalize_complex_coil_image(coil_complex_image)
+    two_channel_image = du.complex_to_two_channel_image(coil_complex_image)
+    two_channel_tensor = torch.from_numpy(two_channel_image)[None, ...].float().to(device)
+    autoencoder = autoencoder.to(device)
+    with torch.no_grad():
+        autoencoder_output = autoencoder.encode(two_channel_tensor)
+        latents = autoencoder_output.latent_dist.mean
+        decoded_image = autoencoder.decode(latents).sample
+    # recombine the decoded real and imaginary channels into a complex image
+    recon = du.two_channel_to_complex_image(decoded_image.detach().cpu().numpy())
+    input = coil_complex_image
+    return input, recon
+
+def run_inference_three_channels(coil_complex_image, autoencoder, device="cuda"):
+    # normalize the complex image and repeat its magnitude across three channels
+    coil_complex_image = du.normalize_complex_coil_image(coil_complex_image)
+    three_channel_image = du.create_three_channel_image(coil_complex_image)
+    three_channel_tensor = torch.from_numpy(three_channel_image)[None, ...].float().to(device)
+    autoencoder = autoencoder.to(device)
+    with torch.no_grad():
+        autoencoder_output = autoencoder.encode(three_channel_tensor)
+        latents = autoencoder_output.latent_dist.mean
+        decoded_image = autoencoder.decode(latents).sample
+    recon = decoded_image[0].detach().cpu().numpy()
+    input = three_channel_image
+    return input, recon
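A sketch of running inference on the bundled example data (the array keys inside mri_complex_images.npz are not documented here, so the first array is used; inspect data.files to pick a specific one):

```
import numpy as np
from diffusers.models import AutoencoderKL
from inference import run_inference_two_channels

autoencoder = AutoencoderKL.from_pretrained("microsoft/mri-autoencoder-v0.1")
data = np.load("example_data/mri_complex_images.npz")
img = data[data.files[0]]  # assumed to be a [1, H, W] complex slice
inp, recon = run_inference_two_channels(img, autoencoder, device="cuda")
```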
metrics.py
ADDED
@@ -0,0 +1,30 @@
+
+import numpy as np
+from skimage.metrics import peak_signal_noise_ratio, structural_similarity
+from typing import Optional
+
+def ssim(
+    gt: np.ndarray, pred: np.ndarray, data_range: Optional[float] = None
+) -> np.ndarray:
+    """Compute the Structural Similarity Index Metric (SSIM), averaged over slices"""
+    if not gt.ndim == 3:
+        raise ValueError("Unexpected number of dimensions in ground truth.")
+    if not gt.ndim == pred.ndim:
+        raise ValueError("Ground truth dimensions do not match prediction.")
+
+    data_range = gt.max() if data_range is None else data_range
+
+    ssim = np.array([0.0])
+    for slice_num in range(gt.shape[0]):
+        ssim = ssim + structural_similarity(
+            gt[slice_num], pred[slice_num], data_range=data_range
+        )
+
+    return ssim / gt.shape[0]
+
+def psnr(
+    gt: np.ndarray, pred: np.ndarray, data_range: Optional[float] = None
+) -> np.ndarray:
+    """Compute the Peak Signal to Noise Ratio metric (PSNR)"""
+    data_range = gt.max() if data_range is None else data_range
+    return peak_signal_noise_ratio(gt, pred, data_range=data_range)
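A sketch of applying these metrics to autoencoder reconstructions (the [S, H, W] magnitude stacks below are synthetic stand-ins for real slices):

```
import numpy as np
from metrics import psnr, ssim

gt_mag = np.abs(np.random.randn(4, 256, 256) + 1j * np.random.randn(4, 256, 256))
recon_mag = gt_mag + 0.01 * np.random.randn(4, 256, 256)  # simulated reconstruction error
print("PSNR:", psnr(gt_mag, recon_mag))
print("SSIM:", ssim(gt_mag, recon_mag))
```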