Midnight-Miqu-70B-v1.5 - EXL2 3.5bpw

This is a 3.5bpw EXL2 quant of sophosympatheia/Midnight-Miqu-70B-v1.5.

Details about the model and its merge can be found on the model page linked above.

Prompt Templates

Please see sophosympatheia/Midnight-Miqu-70B-v1.5 for SillyTavern presets and templates.

Further details on prompting this model will also appear in the model discussions.

Tavern Card

Included is a Tavern-format character card for chat, created by Midnight Miqu v1.5. The card was produced with a character creator helper bot in three steps: a single prompt for the base card, a second prompt asking for specific conversation examples, and a final request for a text-to-image portrait prompt. That the model could faithfully follow the character creator bot through all three steps demonstrates a fairly high level of intelligence.

An ageless, charismatic archmage named Elrondel stands before you, captured mid-laugh in a dimly lit study. His tall, slender frame is adorned with midnight blue robes that billow around him, intricate silver runes glinting in the candlelight. His stark white hair flows down to his shoulders, framing a sharp face with piercing sapphire eyes that seem to look right through you. A knowing smile plays upon his lips, hinting at countless secrets hidden beneath his pointed goatee.

In one hand, he holds a smoldering pipe filled with exotic herbs, casting a mysterious haze around his form. The other hand rests upon Stormbreaker, a colossal staff that towers above most men, crackling with latent lightning energy. Various rings adorn his long, dexterous fingers, each holding its own arcane power. Around his neck hangs a dragon-eye amulet, the eye itself appearing to follow your every move with fierce scrutiny.

The room is cluttered with tomes and scrolls, floating in midair as if held by invisible hands, creating a maelstrom of knowledge. Behind him, a crystal ball reflects swirling images of distant lands and times, while a cauldron bubbles with unknown concoctions on the hearth of an ancient fireplace. The scene exudes an air of enigma and might, leaving you both awestruck and slightly intimidated in the presence of this legendary figure from Thaylonia.

Perplexity Scoring

Below are the perplexity scores for the EXL2 models. A lower score is better.

Quant Level	Perplexity Score
5.0	5.1226
4.5	5.1590
4.0	5.1772
3.5	5.3030
3.0	5.4156
2.75	5.8717
2.5	5.7236
2.25	6.4102
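
To make the table easier to read, the scores can be expressed as a percentage increase over the 5.0bpw quant. The following is only a minimal sketch: the `scores` variable and the `perplexity_deltas` function are hypothetical names introduced for illustration, and the values are copied from the table above.

```shell
#!/bin/bash
# Perplexity scores copied from the table above: quant level, then score.
scores="5.0 5.1226
4.5 5.1590
4.0 5.1772
3.5 5.3030
3.0 5.4156
2.75 5.8717
2.5 5.7236
2.25 6.4102"

# Hypothetical helper: print each quant level with its perplexity increase
# relative to the first row (5.0bpw), as a percentage.
perplexity_deltas() {
  echo "$scores" | awk '
    NR == 1 { base = $2 }
    { printf "%s\t%.2f%%\n", $1, ($2 - base) / base * 100 }'
}

perplexity_deltas
```

Note that in the table the 2.75bpw score (5.8717) is higher than the 2.5bpw score (5.7236), so the increase is not strictly monotonic as the bit rate drops.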

EQ Bench

Here are the EQ Bench scores for the EXL2 quants using the Alpaca, ChatML, Mistral, Vicuna-v0 and Vicuna-v1.1 prompt templates. A higher score is better.

Quant Size	Alpaca	ChatML	Mistral	Vicuna-v0	Vicuna-v1.1
5.0	77.38	76.25	77.67	78.83	77.86
4.5	75.45	74.76	76.06	76.39	76.28
4.0	77.99	75.25	77.18	76.84	76.08
3.5	73.47	71.83	72.60	72.00	74.77
3.0	71.46	70.33	71.06	72.75	72.21
2.75	76.41	72.76	75.99	76.06	77.19
2.5	74.61	74.78	75.58	74.20	75.55
2.25	72.76	71.28	72.89	72.81	71.91

Perplexity Script

This was the script used for perplexity testing.

#!/bin/bash

# Activate the conda environment
source ~/miniconda3/etc/profile.d/conda.sh
conda activate exllamav2

# Set the model name and bit size
MODEL_NAME="miqu-1-70b-sf"
BIT_PRECISIONS=(5.0 4.5 4.0 3.5 3.0 2.75 2.5 2.25)

# Print the markdown table header
echo "| Quant Level | Perplexity Score |"
echo "|-------------|------------------|"

for BIT_PRECISION in "${BIT_PRECISIONS[@]}"
do
  MODEL_DIR="models/${MODEL_NAME}_exl2_${BIT_PRECISION}bpw"
  if [ -d "$MODEL_DIR" ]; then
    output=$(python test_inference.py -m "$MODEL_DIR" -gs 22,24 -ed data/wikitext/wikitext-2-v1.parquet)
    score=$(echo "$output" | grep -oP 'Evaluation perplexity: \K[\d.]+')
    echo "| $BIT_PRECISION | $score |"
  fi
done
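
The score extraction step in the script above can be tried in isolation. This is a sketch under assumptions: the sample text is hypothetical and only mimics the `Evaluation perplexity: <number>` line that the script's regular expression expects, and `extract_score` is a helper name introduced here. It relies on GNU grep for the `-P` (PCRE) option, as the original script does.

```shell
#!/bin/bash
# Hypothetical sample output; the real test_inference.py output is not shown
# in this card and likely contains additional lines.
sample_output="(other log lines)
 -- Evaluation perplexity: 5.3030"

# Hypothetical helper: \K drops the matched label so only the number is printed.
extract_score() {
  grep -oP 'Evaluation perplexity: \K[\d.]+' <<< "$1"
}

score=$(extract_score "$sample_output")
# Fall back to "n/a" if no score was found, so the table row is never empty.
echo "| 3.5 | ${score:-n/a} |"
```

The `n/a` fallback is an addition for illustration; the original script prints an empty cell when the pattern does not match.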

Quant Details

This is the script used for quantization.

#!/bin/bash

# Activate the conda environment
source ~/miniconda3/etc/profile.d/conda.sh
conda activate exllamav2

# Define variables
MODEL_DIR="models/Midnight-Miqu-70B-v1.5"
OUTPUT_DIR="exl2_midnightv15-70b"
MEASUREMENT_FILE="measurements/midnight70b-v15.json"
TEMPLATE_FILE="models/Midnight-Miqu-70B-v1.5/README-TEMPLATE.md"
CARD_FILE="models/Midnight-Miqu-70B-v1.5/Elrondel.card.png"

BIT_PRECISIONS=(6.0 5.0 4.5 4.0 3.5 3.0 2.75 2.5 2.25)

for BIT_PRECISION in "${BIT_PRECISIONS[@]}"
do
    CONVERTED_FOLDER="models/Midnight-Miqu-70B-v1.5_exl2_${BIT_PRECISION}bpw"
    UPLOAD_FOLDER="Dracones/Midnight-Miqu-70B-v1.5_exl2_${BIT_PRECISION}bpw"

    if [ -d "$CONVERTED_FOLDER" ]; then
        echo "Skipping $BIT_PRECISION as $CONVERTED_FOLDER already exists."
        continue
    fi

    # -f avoids an error when the working directory does not exist yet
    rm -rf "$OUTPUT_DIR"
    mkdir -p "$OUTPUT_DIR"
    mkdir -p "$CONVERTED_FOLDER"
    
    python convert.py -i "$MODEL_DIR" -o "$OUTPUT_DIR" -nr -m "$MEASUREMENT_FILE" -b "$BIT_PRECISION" -cf "$CONVERTED_FOLDER"

    cp "$TEMPLATE_FILE" "$CONVERTED_FOLDER/README.md"
    sed -i "s/X.XX/$BIT_PRECISION/g" "$CONVERTED_FOLDER/README.md"
    cp "$CARD_FILE" "$CONVERTED_FOLDER"

    /home/mmealman/miniconda3/bin/huggingface-cli upload "$UPLOAD_FOLDER" "$CONVERTED_FOLDER" .

done
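
The README templating step can be checked on its own. In the pattern `s/X.XX/.../`, the unescaped dot matches any character, so the sketch below escapes it to replace only the literal placeholder. The template line and the `fill_template` helper are hypothetical, since the contents of README-TEMPLATE.md are not shown here, and `sed -i` without a suffix argument assumes GNU sed, as in the script above.

```shell
#!/bin/bash
# Hypothetical helper: replace the literal placeholder X.XX in a file
# with the given bits-per-weight value.
fill_template() {
  local bpw="$1" file="$2"
  # The dot is escaped so that only a literal "X.XX" is matched.
  sed -i "s/X\.XX/${bpw}/g" "$file"
}

# Hypothetical template line used only for this demonstration.
tmp=$(mktemp)
echo "This is a X.XXbpw EXL2 quant" > "$tmp"
fill_template 3.5 "$tmp"
cat "$tmp"
rm -f "$tmp"
```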
