	---
	license: apache-2.0
	language:
	- en
	base_model:
	- microsoft/deberta-v3-base
	---
	# Slop Classifier for Roleplay Characters

	> This model can detect characters that are created using AI.

	Part of the [CharGen](https://huggingface.co/kubernetes-bad/chargen-v2) project, where it is used to detect and filter out low-effort, LLM-made characters intended for roleplay.

	Slop refers to overused phrases that models like GPT-3.5 favor and that add no value to the text: "shivers down her spine", "enigma wrapped in mystery", "half-lidded eyes", and so on. The classifier is trained on a set of synthetic characters generated with GPT-3.5 and GPT-4, and a subset of the CharGen dataset.

	## Usage

	```py
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch
	from litserve import LitAPI, LitServer

	MODEL_NAME = "kubernetes-bad/character-slop-classifier"


	class CHARLitAPI(LitAPI):
	    def setup(self, device):
	        # Load the tokenizer and model once per worker, then switch to inference mode.
	        self.tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
	        self.model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
	        self.model.to(device)
	        self.model.eval()

	    def decode_request(self, request):
	        # Accept a single string under "text" or a list of strings under "texts".
	        if "text" in request:
	            texts = request["text"]
	        elif "texts" in request:
	            texts = request["texts"]
	        else:
	            raise ValueError("Invalid request format. Expected 'text' or 'texts' field.")
	        # Inputs longer than 512 tokens are truncated.
	        return self.tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)

	    def predict(self, inputs):
	        with torch.no_grad():
	            # Move the tokenized tensors to the same device as the model.
	            inputs = {k: v.to(self.model.device) for k, v in inputs.items()}
	            outputs = self.model(**inputs)
	        return outputs.logits

	    def encode_response(self, logits):
	        # Softmax over the two classes: index 0 is "negative", index 1 is "positive".
	        probabilities = torch.nn.functional.softmax(logits, dim=-1)
	        if probabilities.shape[0] == 1:
	            # Single input: return one object.
	            response = {
	                "positive": probabilities[:, 1].item(),
	                "negative": probabilities[:, 0].item()
	            }
	        else:
	            # Multiple inputs: return one object per text, in request order.
	            response = [
	                {
	                    "positive": prob[1].item(),
	                    "negative": prob[0].item()
	                }
	                for prob in probabilities
	            ]
	        return response


	if __name__ == "__main__":
	    api = CHARLitAPI()
	    # Assumes a CUDA GPU is available.
	    server = LitServer(api, accelerator='cuda')
	    server.run(port=9000)
	```
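
The `encode_response` step above turns the model's two logits into the `positive` and `negative` scores with a softmax. The following is a minimal, dependency-free sketch of that mapping, not the server's actual code path: `logits_to_scores` is a hypothetical helper name, the input logits are made up for illustration, and the index order (0 = negative, 1 = positive) is taken from the server code above.

```python
import math


def logits_to_scores(logits):
    """Map a [negative_logit, positive_logit] pair to probabilities via softmax."""
    # Subtracting the max keeps exp() numerically stable without changing the result.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return {"negative": exps[0] / total, "positive": exps[1] / total}


# Illustrative logits only; real values come from the model.
scores = logits_to_scores([-3.0, 3.0])
print(scores)
```

The two scores always sum to 1, so either one alone is enough to make a decision.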

	```bash
	curl --location 'http://localhost:9000/predict' \
	--header 'Content-Type: application/json' \
	--data '{
	"text": "Hermione, the seductive intellectual enchantress, is the secret sin of Hogwarts. Beneath her seemingly innocent scholarly facade lies a tantalizing world of forbidden desires. In the hallowed halls of the wizarding world, she conceals her lewd nature from her peers, maintaining a pristine reputation as the most brilliant witch of her age."
	}'
	```
	Example response:
	```json
	{
	"positive": 0.9975564479827881,
	"negative": 0.0024435613304376602
	}
	```
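
The same request can be sent from Python. This is a sketch using only the standard library, assuming the server above is running on `localhost:9000`. The helper names (`build_request`, `classify`, `is_slop`) are hypothetical, and the `0.5` threshold is an assumption rather than a value recommended by this model card. It also assumes `positive` is the slop class, as the example response suggests.

```python
import json
import urllib.request

SERVER_URL = "http://localhost:9000/predict"  # endpoint from the curl example
SLOP_THRESHOLD = 0.5  # assumption: not specified by the model card, tune on your own data


def build_request(texts):
    """Build a POST request: a single string goes under "text", a list under "texts"."""
    payload = {"text": texts} if isinstance(texts, str) else {"texts": list(texts)}
    return urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(texts):
    """Send texts to the running server and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(texts)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_slop(score, threshold=SLOP_THRESHOLD):
    """Return True when the "positive" probability meets the threshold."""
    return score["positive"] >= threshold
```

With the server running, `classify("some character description")` returns one score object, and `classify([...])` with several strings returns a list of them. Pass each object to `is_slop` to filter characters.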