---
inference: false
language:
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - mixtral
  - mergekit
  - merge
license: apache-2.0
datasets:
  - jondurbin/airoboros-3.2
---

# Air-Striker-Mixtral-8x7B-Instruct-ZLoss

Experimental model, trained using a config and the Transformers/Axolotl forks provided by Doctor-Shotgun.

The model was fine-tuned from Mixtral-8x7B-v0.1 on the airoboros-3.2 dataset for 4 epochs, using the ChatML prompt format at 8K context length.
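For reference, ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of building such a prompt (the `chatml_prompt` helper is hypothetical, written here for illustration; it is not part of any library):

```python
# Illustrative sketch of the ChatML prompt format mentioned above.
# The helper below is hypothetical, not part of transformers or axolotl.

def chatml_prompt(messages):
    """Render a list of {role, content} dicts as a ChatML string."""
    out = []
    for m in messages:
        out.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave the assistant turn open so the model continues from here.
    out.append("<|im_start|>assistant\n")
    return "\n".join(out)

prompt = chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

In practice the same string can be produced via the tokenizer's chat template, if one is shipped with the model.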

The fine-tuned model was then merged with Mixtral-8x7B-Instruct-v0.1:


This is a merge of pre-trained language models created using mergekit.

## Merge Details

### Merge Method

This model was merged using the linear merge method.

### Models Merged

The following models were included in the merge:

- [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
- [LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss](https://huggingface.co/LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
    parameters:
      weight: 0.5
  - model: LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16
```
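For intuition, a linear merge with these weights is simply a per-parameter weighted average of the two checkpoints. A toy sketch in plain Python (the "state dicts" below are made-up scalars standing in for real tensors; the actual merge is performed by mergekit on full checkpoints):

```python
# Toy illustration of a linear merge: each parameter of the output is a
# weighted average of the corresponding parameter in the input models.
# The tiny dicts below are hypothetical stand-ins for real state dicts.

def linear_merge(state_dicts, weights):
    """Merge parameter dicts as sum_i(weight_i * param_i) per entry."""
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for sd, w in zip(state_dicts, weights))
    return merged

model_a = {"layer.weight": 1.0}  # stand-in for Mixtral-8x7B-Instruct-v0.1
model_b = {"layer.weight": 3.0}  # stand-in for Air-Striker-Mixtral-8x7B-ZLoss
merged = linear_merge([model_a, model_b], [0.5, 0.5])
print(merged)  # → {'layer.weight': 2.0}
```

With equal weights of 0.5, the result is the midpoint of the two models' parameters, which is what the YAML config above requests.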