LoneStriker
/

Air-Striker-Mixtral-8x7B-Instruct-ZLoss-GGUF

Text Generation

Model card Files Files and versions Community

Edit model card

Air-Striker-Mixtral-8x7B-Instruct-ZLoss

Experimental model, trained using config and Transformers/Axolotl forks provided by Doctor-Shotgun

Model was fine-tuned from Mixtral-8x7B-v0.1 with airoboros-3.2 dataset, for 4 epochs, ChatML prompt format at 8K context length.

Additionally, model was then merged with Mixtral-8x7B-Instruct-v0.1:

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the linear merge method.

Models Merged

The following models were included in the merge:

mistralai/Mixtral-8x7B-Instruct-v0.1
LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
    parameters:
      weight: 0.5
  - model: LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16

Downloads last month: 46

GGUF

Model size

46.7B params

Architecture

llama

2-bit

3-bit

4-bit

5-bit

6-bit

Inference Examples

Text Generation

Inference API (serverless) has been turned off for this model.

Dataset used to train LoneStriker/Air-Striker-Mixtral-8x7B-Instruct-ZLoss-GGUF