tags:
- vision
- image-segmentation
datasets:
- segments/sidewalk-semantic
finetuned_from:
- nvidia/mit-b5
widget:
- src: >-
https://segmentsai-prod.s3.eu-west-2.amazonaws.com/assets/admin-tobias/439f6843-80c5-47ce-9b17-0b2a1d54dbeb.jpg
example_title: Brugge
SegFormer (b5-sized) model fine-tuned on sidewalk-semantic dataset.
SegFormer model fine-tuned on SegmentsAI sidewalk-semantic
. It was introduced in the paper SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers by Xie et al. and first released in this repository.
Model description
SegFormer consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head to achieve great results on semantic segmentation benchmarks such as ADE20K and Cityscapes. The hierarchical Transformer is first pre-trained on ImageNet-1k, after which a decode head is added and fine-tuned altogether on a downstream dataset.
Code and Notebook
Here is how to use this model to classify an image of the sidewalk dataset:
from transformers import SegformerFeatureExtractor, SegformerForImageClassification
from PIL import Image
import requests
url = "https://segmentsai-prod.s3.eu-west-2.amazonaws.com/assets/admin-tobias/439f6843-80c5-47ce-9b17-0b2a1d54dbeb.jpg"
image = Image.open(requests.get(url, stream=True).raw)
feature_extractor = SegformerFeatureExtractor.from_pretrained("zoheb/mit-b5-finetuned-sidewalk-semantic")
model = SegformerForImageClassification.from_pretrained("zoheb/mit-b5-finetuned-sidewalk-semantic")
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits
# model predicts one of the 35 Sidewalk classes
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
You can go through its detailed notebook here.
For more code examples, refer to the documentation.
License
The license for this model can be found here.
BibTeX entry and citation info
@article{DBLP:journals/corr/abs-2105-15203,
author = {Enze Xie and
Wenhai Wang and
Zhiding Yu and
Anima Anandkumar and
Jose M. Alvarez and
Ping Luo},
title = {SegFormer: Simple and Efficient Design for Semantic Segmentation with
Transformers},
journal = {CoRR},
volume = {abs/2105.15203},
year = {2021},
url = {https://arxiv.org/abs/2105.15203},
eprinttype = {arXiv},
eprint = {2105.15203},
timestamp = {Wed, 02 Jun 2021 11:46:42 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2105-15203.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}