---
library_name: transformers
license: apache-2.0
base_model: SmilingWolf/wd-swinv2-tagger-v3
inference: false
tags:
- wd-tagger
- optimum
---

# WD SwinV2 Tagger v3 with 🤗 transformers
Converted from [SmilingWolf/wd-swinv2-tagger-v3](https://huggingface.co/SmilingWolf/wd-swinv2-tagger-v3) to the 🤗 transformers library format.
## Example

[![](https://camo.githubusercontent.com/f5e0d0538a9c2972b5d413e0ace04cecd8efd828d133133933dfffec282a4e1b/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667)](https://colab.research.google.com/gist/p1atdev/d420d9fcd5c8ea66d9e10918fc330741/wd-swinv2-tagger-v3-hf-pipe.ipynb)

### Installation

```bash
pip install transformers
```

### Pipeline

```py
from transformers import pipeline

pipe = pipeline(
    "image-classification",
    model="p1atdev/wd-swinv2-tagger-v3-hf",
    trust_remote_code=True,
)

print(pipe("sample.webp", top_k=15))
#[{'label': '1girl', 'score': 0.9973934888839722},
# {'label': 'solo', 'score': 0.9719744324684143},
# {'label': 'dress', 'score': 0.9539461135864258},
# {'label': 'hat', 'score': 0.9511678218841553},
# {'label': 'outdoors', 'score': 0.9438753128051758},
# ...
```
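If you prefer a probability cutoff over a fixed `top_k`, you can filter the pipeline output by score. This is only a sketch: it assumes the pipeline returns the standard `image-classification` list of `{'label', 'score'}` dicts (as in the output above), and the `0.35` threshold is an arbitrary example value.

```py
from transformers import pipeline

pipe = pipeline(
    "image-classification",
    model="p1atdev/wd-swinv2-tagger-v3-hf",
    trust_remote_code=True,
)

# request every label, then keep only tags above the cutoff
all_tags = pipe("sample.webp", top_k=pipe.model.config.num_labels)
confident = {tag["label"]: tag["score"] for tag in all_tags if tag["score"] > 0.35}
print(confident)
```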
### AutoModel

```py
from PIL import Image

import numpy as np
import torch

from transformers import (
    AutoImageProcessor,
    AutoModelForImageClassification,
)

MODEL_NAME = "p1atdev/wd-swinv2-tagger-v3-hf"

model = AutoModelForImageClassification.from_pretrained(
    MODEL_NAME,
)
processor = AutoImageProcessor.from_pretrained(MODEL_NAME, trust_remote_code=True)

image = Image.open("sample.webp")
inputs = processor.preprocess(image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs.to(model.device, model.dtype))
    logits = torch.sigmoid(outputs.logits[0])  # take the first logits

# get probabilities
results = {model.config.id2label[i]: logit.float() for i, logit in enumerate(logits)}
results = {
    k: v
    for k, v in sorted(results.items(), key=lambda item: item[1], reverse=True)
    if v > 0.35  # 35% threshold
}
print(results)  # rating tags and character tags are also included
#{'1girl': tensor(0.9974),
# 'solo': tensor(0.9720),
# 'dress': tensor(0.9539),
# 'hat': tensor(0.9512),
# 'outdoors': tensor(0.9439),
# ...
```
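The `inputs.to(model.device, model.dtype)` call above lets the same code run when the model is moved to a GPU or loaded in half precision. A minimal sketch of that, falling back to CPU when no CUDA device is available; `torch_dtype` and `.to(device)` are standard transformers/PyTorch usage rather than anything specific to this model:

```py
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

MODEL_NAME = "p1atdev/wd-swinv2-tagger-v3-hf"

# use fp16 on GPU, fall back to fp32 on CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

model = AutoModelForImageClassification.from_pretrained(MODEL_NAME, torch_dtype=dtype).to(device)
processor = AutoImageProcessor.from_pretrained(MODEL_NAME, trust_remote_code=True)

inputs = processor.preprocess(Image.open("sample.webp"), return_tensors="pt")

with torch.no_grad():
    # cast pixel values to the model's device and dtype before the forward pass
    probs = torch.sigmoid(model(**inputs.to(model.device, model.dtype)).logits[0].float())
```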
### Accelerate with 🤗 Optimum

The ONNX Runtime version is roughly 30% faster and about 50% smaller than the transformers version, at the cost of slightly degraded accuracy.
```bash
pip install optimum[onnxruntime]
```

```diff
-from transformers import pipeline
+from optimum.pipelines import pipeline

pipe = pipeline(
    "image-classification",
    model="p1atdev/wd-swinv2-tagger-v3-hf",
    trust_remote_code=True,
)

print(pipe("sample.webp", top_k=15))
#[{'label': '1girl', 'score': 0.9966088533401489},
# {'label': 'solo', 'score': 0.9740601778030396},
# {'label': 'dress', 'score': 0.9618403911590576},
# {'label': 'hat', 'score': 0.9563733339309692},
# {'label': 'outdoors', 'score': 0.945336639881134},
# ...
```
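Instead of the pipeline, the ONNX model can also be loaded directly with `ORTModelForImageClassification`. This is only a sketch: it assumes the repository ships exported ONNX weights (if it does not, `export=True` tells Optimum to convert on the fly) and reuses the custom image processor with `trust_remote_code=True` as above.

```py
import torch
from PIL import Image
from optimum.onnxruntime import ORTModelForImageClassification
from transformers import AutoImageProcessor

MODEL_NAME = "p1atdev/wd-swinv2-tagger-v3-hf"

# pass export=True if the repository contains no ONNX file yet
model = ORTModelForImageClassification.from_pretrained(MODEL_NAME)
processor = AutoImageProcessor.from_pretrained(MODEL_NAME, trust_remote_code=True)

inputs = processor.preprocess(Image.open("sample.webp"), return_tensors="pt")
probs = torch.sigmoid(model(**inputs).logits[0])

results = {model.config.id2label[i]: p.item() for i, p in enumerate(probs)}
print(sorted(results.items(), key=lambda kv: kv[1], reverse=True)[:15])
```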
## Labels

All rating tags have the prefix `rating:` and all character tags have the prefix `character:` (a sketch for splitting them out by prefix follows the list below).

- Rating tags: `rating:general`, `rating:sensitive`, ...
- Character tags: `character:frieren`, `character:hatsune miku`, ...
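
As a self-contained example, the snippet below splits a tag-to-probability dict (such as `results` from the AutoModel example) into rating, character, and general tags using these prefixes; the sample tags and scores are illustrative only.

```py
def split_tags(results: dict[str, float]):
    """Group predicted tags by the `rating:` / `character:` prefixes."""
    ratings = {k.removeprefix("rating:"): v for k, v in results.items() if k.startswith("rating:")}
    characters = {k.removeprefix("character:"): v for k, v in results.items() if k.startswith("character:")}
    general = {k: v for k, v in results.items() if not k.startswith(("rating:", "character:"))}
    return ratings, characters, general


# illustrative values only
results = {"1girl": 0.99, "rating:general": 0.92, "character:hatsune miku": 0.88}
ratings, characters, general = split_tags(results)
print(ratings)     # {'general': 0.92}
print(characters)  # {'hatsune miku': 0.88}
print(general)     # {'1girl': 0.99}
```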