
ExLlamaV2 quant (exl2 / 4.25 bpw) made with ExLlamaV2 v0.0.21

Other EXL2 quants:

| Quant (bpw) | Model Size | lm_head bits |
| ----------- | ---------- | ------------ |
| 2.2         | 3250 MB    | 6            |
| 2.5         | 3478 MB    | 6            |
| 3.0         | 3894 MB    | 6            |
| 3.5         | 4311 MB    | 6            |
| 3.75        | 4518 MB    | 6            |
| 4.0         | 4727 MB    | 6            |
| 4.25        | 4935 MB    | 6            |
| 5.0         | 5556 MB    | 6            |
| 6.0         | 6497 MB    | 8            |
| 6.5         | 6893 MB    | 8            |
| 8.0         | 8125 MB    | 8            |
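
These quants load directly with the exllamav2 library (or through frontends built on it). A minimal loading sketch, assuming a recent exllamav2 release with the dynamic generator API and a local download of one of these quants; the path is a placeholder:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

# Placeholder path to a local download of this repo
model_dir = "/path/to/mlabonne_Daredevil-8B-4_25bpw_exl2"

config = ExLlamaV2Config(model_dir)
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # allocate the cache as layers load
model.load_autosplit(cache)                # split layers across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="What is a large language model?", max_new_tokens=200))
```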

# Daredevil-8B

tl;dr: It looks like a successful merge

Daredevil-8B is a merge of the following models using LazyMergekit:

- nbeerbower/llama-3-stella-8B
- Hastagaras/llama-3-8b-okay
- nbeerbower/llama-3-gutenberg-8B
- openchat/openchat-3.6-8b-20240522
- Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
- cstr/llama3-8b-spaetzle-v20
- mlabonne/ChimeraLlama-3-8B-v3
- flammenai/Mahou-1.1-llama3-8B
- KingNish/KingNish-Llama3-8b

## πŸ”Ž Applications

It is a highly functional censored model. You might want to add `<end_of_turn>` as an additional stop string.
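
For instance, with a recent transformers release (which supports `stop_strings` in `generate()`), the extra stop string can be wired in as in the sketch below; this is an illustration, not taken from the upstream card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlabonne/Daredevil-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Chat-format the prompt, then generate with the extra stop string
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is a large language model?"}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    stop_strings=["<end_of_turn>"],  # the additional stop string suggested above
    tokenizer=tokenizer,             # generate() needs the tokenizer for stop_strings
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```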

## ⚑ Quantization
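
The EXL2 quants listed at the top of this card were made with ExLlamaV2's `convert.py` script. A minimal sketch of how a quant like this one is produced, with placeholder paths (`-b` sets the bits per weight, `-hb` the lm_head bits, matching the table above):

```python
# Sketch: quantize the FP16 model to 4.25 bpw with a 6-bit lm_head.
# Paths are placeholders; run from a checkout of the exllamav2 repo.
!python convert.py -i /path/to/Daredevil-8B -o /path/to/work_dir -cf /path/to/Daredevil-8B-4.25bpw-exl2 -b 4.25 -hb 6
```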

πŸ† Evaluation

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
| --- | --- | --- | --- | --- | --- |
| mlabonne/Daredevil-8B | 55.87 | 44.13 | 73.52 | 59.05 | 46.77 |
| mlabonne/ChimeraLlama-3-8B | 51.58 | 39.12 | 71.81 | 52.4 | 42.98 |
| meta-llama/Meta-Llama-3-8B-Instruct | 51.34 | 41.22 | 69.86 | 51.65 | 42.64 |
| meta-llama/Meta-Llama-3-8B | 45.42 | 31.1 | 69.95 | 43.91 | 36.7 |

## 🌳 Model family tree

*(image: Daredevil-8B model family tree)*

## 🧩 Configuration

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # No parameters necessary for base model
  - model: nbeerbower/llama-3-stella-8B
    parameters:
      density: 0.6
      weight: 0.16
  - model: Hastagaras/llama-3-8b-okay
    parameters:
      density: 0.56
      weight: 0.1
  - model: nbeerbower/llama-3-gutenberg-8B
    parameters:
      density: 0.6
      weight: 0.18
  - model: openchat/openchat-3.6-8b-20240522
    parameters:
      density: 0.56
      weight: 0.12
  - model: Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
    parameters:
      density: 0.58
      weight: 0.18
  - model: cstr/llama3-8b-spaetzle-v20
    parameters:
      density: 0.56
      weight: 0.08
  - model: mlabonne/ChimeraLlama-3-8B-v3
    parameters:
      density: 0.56
      weight: 0.08
  - model: flammenai/Mahou-1.1-llama3-8B
    parameters:
      density: 0.55
      weight: 0.05
  - model: KingNish/KingNish-Llama3-8b
    parameters:
      density: 0.55
      weight: 0.05
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
dtype: bfloat16
```
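
To reproduce the merge, this config can be fed to mergekit's CLI. A minimal sketch, with the output directory as a placeholder:

```python
!pip install -qU mergekit
# config.yaml holds the configuration above; output lands in ./Daredevil-8B
!mergekit-yaml config.yaml ./Daredevil-8B --copy-tokenizer
```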

## πŸ’» Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/Daredevil-8B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build a chat-formatted prompt from the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Text-generation pipeline, sharded across available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
