UperNet, Swin Transformer large-sized backbone
This model uses the UperNet framework for semantic segmentation with a large-sized Swin Transformer backbone. UperNet was introduced in the paper Unified Perceptual Parsing for Scene Understanding by Xiao et al.
The combination of UperNet with a Swin Transformer backbone was introduced in the paper Swin Transformer: Hierarchical Vision Transformer using Shifted Windows.
Disclaimer: The team releasing UperNet + Swin Transformer did not write a model card for this model, so this card was written by the Hugging Face team.
Model description
UperNet is a framework for semantic segmentation. It consists of several components, including a backbone, a Feature Pyramid Network (FPN) and a Pyramid Pooling Module (PPM).
Any visual backbone can be plugged into the UperNet framework. The framework predicts a semantic label per pixel.
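To make "a semantic label per pixel" concrete, here is a minimal NumPy sketch of the final step of a segmentation head. It assumes the head has already produced a score array of shape (num_classes, height, width). The function name `logits_to_label_map` and the toy values are illustrative and are not part of UperNet or any library.

```python
import numpy as np


def logits_to_label_map(logits: np.ndarray) -> np.ndarray:
    """Turn per-class scores of shape (C, H, W) into an (H, W) map of class ids."""
    if logits.ndim != 3:
        raise ValueError("expected an array of shape (num_classes, height, width)")
    # For every pixel, keep the index of the highest-scoring class.
    return logits.argmax(axis=0)


# Toy input: 3 classes on a 2x2 image (values chosen by hand for illustration).
logits = np.zeros((3, 2, 2), dtype=np.float32)
logits[0, 0, 0] = 5.0  # top-left pixel favours class 0
logits[1, 0, 1] = 5.0  # top-right pixel favours class 1
logits[2, 1, 0] = 5.0  # bottom-left pixel favours class 2
logits[2, 1, 1] = 5.0  # bottom-right pixel favours class 2

label_map = logits_to_label_map(logits)
```

Here `label_map` has the same height and width as the input scores, with one integer class id per pixel.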
Intended uses & limitations
You can use the raw model for semantic segmentation. See the model hub to look for fine-tuned versions (with various backbones) on a task that interests you.
How to use
For code examples, see the documentation.
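As a starting point, the following is a hedged sketch of inference with the transformers library, not an official recipe. It rests on two assumptions. First, the Hub id `openmmlab/upernet-swin-large` is assumed, since this card does not state the checkpoint name. Second, the checkpoint's image processor is assumed to provide `post_process_semantic_segmentation`, as SegFormer-style processors do. The helper name `segment_image` is illustrative.

```python
CHECKPOINT = "openmmlab/upernet-swin-large"  # assumed Hub id; not stated in this card


def segment_image(image, checkpoint=CHECKPOINT):
    """Return an (H, W) tensor of predicted class ids for a PIL RGB image."""
    # Heavy dependencies are imported lazily, so defining this helper stays cheap.
    import torch
    from transformers import AutoImageProcessor, UperNetForSemanticSegmentation

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = UperNetForSemanticSegmentation.from_pretrained(checkpoint)
    model.eval()

    # Resize and normalize the image into a batch of size one.
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Resize the logits to the original image size and take the per-pixel argmax.
    # PIL reports size as (width, height); target_sizes expects (height, width).
    return processor.post_process_semantic_segmentation(
        outputs, target_sizes=[image.size[::-1]]
    )[0]
```

A call might look like `segment_image(Image.open("scene.jpg").convert("RGB"))` with `Image` imported from PIL, where `scene.jpg` is a placeholder path. The first call downloads the checkpoint weights, so it needs network access.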