---
tags:
- pytorch

---

# ViT Leukemia Classifier

## Model Description

This vision transformer model classifies leukemia images into one of four classes. It uses the pre-trained [Swin Transformer model](https://huggingface.co/microsoft/swin-base-patch4-window7-224) as its base and adds fully connected layers for classification. The accompanying code supports training, validation, and evaluation, and can upload the best-performing model to the Hugging Face Hub. The model was developed by [Sebastian Sarasti](https://www.linkedin.com/in/sebastiansarasti/) for the Quito AI Day event.

## Model Architecture

The model consists of the following layers:

- Base Model: Swin Transformer (`microsoft/swin-base-patch4-window7-224`)
- Fully Connected Layer: 49 * 1024 = 50,176 input features (the base model's 49 output tokens of 1024 dimensions each, flattened), 100 output features
- ReLU Activation
- Fully Connected Layer: 100 input features, 4 output features

The base model's parameters are frozen during training, so only the two fully connected layers are updated.
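
The layer stack above can be sketched in PyTorch as follows. This is a minimal illustration, not the author's actual code: the class name `LeukemiaClassifier`, the helper `load_swin_backbone`, and the choice to pass the backbone in as an argument are assumptions made for the example. It assumes the backbone returns an object with a `last_hidden_state` tensor of shape `(batch, 49, 1024)`, which is what `SwinModel` produces for 224 x 224 inputs with this checkpoint.

```python
import torch
from torch import nn


class LeukemiaClassifier(nn.Module):
    """Frozen backbone followed by two fully connected layers (hypothetical name)."""

    def __init__(self, backbone: nn.Module, num_tokens: int = 49,
                 hidden_size: int = 1024, num_classes: int = 4):
        super().__init__()
        self.backbone = backbone
        # Freeze the base model so only the classification head is trained.
        for param in self.backbone.parameters():
            param.requires_grad = False
        self.head = nn.Sequential(
            nn.Flatten(),                             # (B, 49, 1024) -> (B, 50176)
            nn.Linear(num_tokens * hidden_size, 100),
            nn.ReLU(),
            nn.Linear(100, num_classes),              # one logit per class
        )

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        # No gradients are needed through the frozen backbone.
        with torch.no_grad():
            features = self.backbone(pixel_values).last_hidden_state
        return self.head(features)


def load_swin_backbone() -> nn.Module:
    """Download the pre-trained Swin base model (hypothetical helper)."""
    from transformers import SwinModel  # imported lazily; requires `transformers`
    return SwinModel.from_pretrained("microsoft/swin-base-patch4-window7-224")
```

With this sketch, `LeukemiaClassifier(load_swin_backbone())` would map a batch of shape `(B, 3, 224, 224)` to logits of shape `(B, 4)`. Passing the backbone in as an argument keeps the head easy to test without downloading weights.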

## Dataset

The model was trained on the [Leukemia dataset from Kaggle](https://www.kaggle.com/datasets/mehradaria/leukemia), which consists of images labeled into different leukemia types.

## Usage

To use this model, you can load it from the Hugging Face Hub as follows:

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("path/to/your/model")