metadata
library_name: transformers
license: other
base_model: nvidia/mit-b0
tags:
  - generated_from_trainer
datasets:
  - scene_parse_150
model-index:
  - name: segformer-b0-scene-parse-150
    results: []
metrics:
  - mean_iou
pipeline_tag: image-segmentation

Segformer-b0-scene-parse-150

This model is a fine-tuned version of nvidia/mit-b0, trained on the scene_parse_150 dataset. It performs semantic segmentation for scene parsing: every pixel of an input image is assigned one of the dataset's semantic categories.

Evaluation Results:

The model achieved the following results on the evaluation dataset:

  • Loss: 1.8435
  • Mean IoU: 0.0881
  • Mean Accuracy: 0.1619
  • Overall Accuracy: 0.6663

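Per-category IoU and per-category accuracy values are available but sparse, which indicates that performance varies widely across categories. The gap between overall accuracy (0.6663) and mean IoU (0.0881) suggests that a small number of frequent categories account for most of the correctly labeled pixels.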
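To make the relationship between these metrics concrete, here is a minimal sketch that computes overall accuracy, mean accuracy, and mean IoU from a confusion matrix. The 3-class matrix is made-up illustration data, not output from this model, and the function name `segmentation_metrics` is hypothetical. It shows how one dominant class can keep overall accuracy high while mean IoU stays low.

```python
def segmentation_metrics(confusion):
    """Compute metrics from a square confusion matrix.

    confusion[i][j] = number of pixels with ground-truth class i
    that were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(n))

    accuracies, ious = [], []
    for c in range(n):
        tp = confusion[c][c]
        gt_pixels = sum(confusion[c])                        # TP + FN
        pred_pixels = sum(confusion[r][c] for r in range(n)) # TP + FP
        union = gt_pixels + pred_pixels - tp                 # TP + FP + FN
        # Skip classes that never occur so they do not distort the mean.
        if gt_pixels > 0:
            accuracies.append(tp / gt_pixels)
        if union > 0:
            ious.append(tp / union)

    return {
        "overall_accuracy": correct / total,
        "mean_accuracy": sum(accuracies) / len(accuracies),
        "mean_iou": sum(ious) / len(ious),
    }


# Made-up, imbalanced example: class 0 dominates the pixel count.
example = [
    [90, 5, 5],
    [8, 2, 0],
    [9, 0, 1],
]
metrics = segmentation_metrics(example)
print(metrics)
```

In this toy case the overall accuracy is well above the mean IoU because the two rare classes are mostly mislabeled as the frequent one, which is the same pattern the reported numbers hint at.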

Model Description

SegFormer-B0 pairs a hierarchical Mix Transformer encoder (MiT-B0, the nvidia/mit-b0 base model) with a lightweight all-MLP decode head. Unlike a plain Vision Transformer (ViT), the encoder produces feature maps at several resolutions, which the decode head fuses into the final segmentation map.
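The hierarchical features can be illustrated by computing the feature-map shape each encoder stage would produce. This is a sketch only: the kernel sizes, strides, and channel widths below are what I understand the published mit-b0 configuration to be, and should be verified against the checkpoint's config.json. The helper names are hypothetical.

```python
# Assumed mit-b0 encoder settings (verify against the checkpoint config).
PATCH_SIZES = (7, 3, 3, 3)       # kernel size of each overlapping patch embedding
STRIDES = (4, 2, 2, 2)           # downsampling factor of each stage
HIDDEN_SIZES = (32, 64, 160, 256)  # channels produced by each stage


def conv_out(size, kernel, stride):
    """Output length of a convolution with padding = kernel // 2."""
    padding = kernel // 2
    return (size + 2 * padding - kernel) // stride + 1


def stage_shapes(height, width):
    """Return (channels, height, width) for each of the four encoder stages."""
    shapes = []
    for kernel, stride, channels in zip(PATCH_SIZES, STRIDES, HIDDEN_SIZES):
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
        shapes.append((channels, height, width))
    return shapes


for stage, shape in enumerate(stage_shapes(512, 512), start=1):
    print(f"stage {stage}: {shape}")
```

Under these assumptions a 512x512 input yields features at 1/4, 1/8, 1/16, and 1/32 of the input resolution, giving the decoder both fine detail and coarse context.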

More detailed model descriptions, including architectural adjustments or preprocessing requirements, are needed.

Intended Uses & Limitations

  • Use Cases: Suitable for scene parsing and segmentation tasks in environments with diverse visual categories.
  • Limitations: Performance varies significantly between categories, as the low mean IoU (0.0881) and mean accuracy (0.1619) show. The model may struggle with underrepresented classes and with categories that are visually similar to others.
  • Further details on intended domains and limitations are needed.
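For the use cases above, a minimal inference sketch with the transformers `image-segmentation` pipeline might look like the following. The function names are hypothetical, and the Hub repository id is left as a parameter because this card does not state it. Running `segment_image` requires transformers, PyTorch, and Pillow, plus network access to download the checkpoint.

```python
def segment_image(image_path, model_id):
    """Run semantic segmentation on one image.

    Returns the pipeline output: a list of dicts, each with a "label",
    a "score", and a "mask" (a PIL image marking that label's pixels).
    """
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    segmenter = pipeline("image-segmentation", model=model_id)
    return segmenter(image_path)


def predicted_labels(results):
    """Return the distinct category labels found in a pipeline result."""
    return sorted({entry["label"] for entry in results})
```

A call such as `segment_image("street.jpg", "<namespace>/segformer-b0-scene-parse-150")` would return one entry per predicted category, where both the image path and the namespace are placeholders to replace. Given the per-category variability noted above, it is worth inspecting the masks for rare categories before relying on them.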

Training and Evaluation Data

The model was trained on the scene_parse_150 dataset, which consists of diverse visual scenes with 150 unique semantic categories. Further information on dataset specifics and any preprocessing steps is needed.

Training Procedure

Hyperparameters:

  • Learning Rate: 6e-05
  • Training Batch Size: 2
  • Evaluation Batch Size: 2
  • Seed: 42
  • Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
  • Learning Rate Scheduler: Linear
  • Number of Epochs: 50
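The hyperparameters above can be written down as a configuration sketch. The keys below are real `transformers.TrainingArguments` argument names, but mapping the listed batch sizes to the per-device arguments is an assumption, as is a warmup of zero steps (the card does not mention warmup). The `linear_lr` helper is a hypothetical illustration of what a linear schedule does; it is not the library's implementation.

```python
# Hyperparameters from the list above, keyed by TrainingArguments names.
# Assumption: the listed batch sizes are per-device values.
HYPERPARAMETERS = {
    "learning_rate": 6e-5,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
}


def linear_lr(step, total_steps, base_lr=HYPERPARAMETERS["learning_rate"],
              warmup_steps=0):
    """Learning rate at `step` for a linear schedule that decays to zero.

    Assumption: no warmup unless `warmup_steps` is given.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Illustrative only: the real number of optimizer steps is not reported.
for step in (0, 250, 500, 1000):
    print(step, linear_lr(step, total_steps=1000))
```

With transformers installed, the dictionary could be passed as `TrainingArguments(output_dir="out", **HYPERPARAMETERS)`, where `"out"` is a placeholder directory.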

Training Results:

The model was trained for 50 epochs. Convergence behavior, training duration, and hardware environment are not reported here; adding them would make the results easier to interpret and reproduce.

Framework Versions:

  • Transformers 4.44.2
  • PyTorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1