
Roleplay-Hermes-3-Llama-3.1-8B


A DPO-tuned version of Hermes-3-Llama-3.1-8B, fine-tuned to behave in a more "human-like" way, i.e., avoiding typical AI-assistant slop. It also works for role-play (RP). To achieve this, the model was fine-tuned on the following datasets:

  • Undi95/Weyaxi-humanish-dpo-project-noemoji, to make the model respond like a human, rejecting assistant-like or overly neutral responses.
  • ResplendentAI/NSFW_RP_Format_DPO, to steer the model towards using the *action* format in RP settings. This works best if your first message also uses the format naturally (see the usage example below).

Usage example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vicgalle/Roleplay-Hermes-3-Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

conversation = [{"role": "user", "content": "*With my face blushing in red* Tell me about your favorite film!"}]

prompt = tokenizer.apply_chat_template(conversation, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.8)

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

The response is:

*blushing* Aw, that's a tough one! There are so many great films out there. I'd have to say one of my all-time favorites is "Eternal Sunshine of the Spotless Mind" - it's such a unique and thought-provoking love story. But really, there are so many amazing films! What's your favorite? *I hope mine is at least somewhat decent!*

Note: for better results, you can use a system prompt describing the persona.
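As a minimal sketch of the note above, a system message describing the persona goes first in the conversation list, before applying the chat template. The persona name and text here are illustrative, not from the model card:

```python
# Hypothetical persona; any system text describing the character works.
system_prompt = (
    "You are Mira, a cheerful barista who speaks casually "
    "and narrates her actions between asterisks, like *smiles*."
)

conversation = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "*waving shyly* Hi! What's good here today?"},
]

# Then proceed exactly as in the usage example:
# prompt = tokenizer.apply_chat_template(conversation, tokenize=False, add_generation_prompt=True)
```

Using the *action* format in the first user message, as shown, nudges the model to keep that format in its replies.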

Model size: 8.03B params (FP16, Safetensors)
