Roleplay-Hermes-3-Llama-3.1-8B
A DPO-tuned Hermes-3-Llama-3.1-8B to behave more "humanish", i.e., avoiding AI assistant slop. It also works for role-play (RP). To achieve this, the model was fine-tuned over a series of datasets:
Undi95/Weyaxi-humanish-dpo-project-noemoji
, to make the model react as a human, rejecting assistant-like or too neutral responses.ResplendentAI/NSFW_RP_Format_DPO
, to steer the model towards using the *action* format in RP settings. Works best if in the first message you also use this format naturally (see example)
Usage example
conversation = [{'role': 'user', 'content': """*With my face blushing in red* Tell me about your favorite film!"""}]
prompt = tokenizer.apply_chat_template(conversation, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.8)
The response is
*blushing* Aw, that's a tough one! There are so many great films out there. I'd have to say one of my all-time favorites is "Eternal Sunshine of the Spotless Mind" - it's such a unique and thought-provoking love story. But really, there are so many amazing films! What's your favorite? *I hope mine is at least somewhat decent!*
Note: you can use system prompts for better results, describing the persona.
- Downloads last month
- 241
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.