This is a pruned version of cognitivecomputations/dolphin-2.6-mistral-7b-dpo, reduced from 7.24B to 5.93B parameters (about 82% of the original size).
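
As a rough check on these figures, the parameter count can be recomputed from the architecture. The sketch below assumes the published mistralai/Mistral-7B-v0.1 dimensions (hidden size 4096, MLP size 14336, 32 attention heads with 8 key/value heads, vocabulary of 32000, untied embeddings, no biases) and that 6 of the 32 layers are removed, as described in the steps below. The Dolphin fine-tune may differ by a few vocabulary entries, which should not change the rounded totals.

```python
# Assumed dimensions, taken from the base Mistral-7B-v0.1 config
# (not read from the Dolphin checkpoint itself).
HIDDEN = 4096
INTERMEDIATE = 14336
NUM_HEADS = 32
NUM_KV_HEADS = 8
VOCAB = 32000
HEAD_DIM = HIDDEN // NUM_HEADS  # 128


def params_per_layer() -> int:
    """Parameters in one decoder layer (attention + MLP + two RMSNorms)."""
    kv_dim = NUM_KV_HEADS * HEAD_DIM  # grouped-query attention: 1024
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q/o plus k/v projections
    mlp = 3 * HIDDEN * INTERMEDIATE  # gate, up and down projections
    norms = 2 * HIDDEN  # input and post-attention RMSNorm weights
    return attention + mlp + norms


def total_params(num_layers: int) -> int:
    """Parameters in the whole model for a given number of decoder layers."""
    embeddings = 2 * VOCAB * HIDDEN  # input embeddings plus untied lm_head
    final_norm = HIDDEN
    return num_layers * params_per_layer() + embeddings + final_norm


full = total_params(32)
pruned = total_params(32 - 6)
print(f"{full / 1e9:.2f}B -> {pruned / 1e9:.2f}B ({pruned / full:.0%})")
# prints: 7.24B -> 5.93B (82%)
```

Each removed layer accounts for roughly 218M parameters, so six layers come to about 1.31B.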

Steps to replicate:

Use laserQlora.ipynb from cognitivecomputations/laserRMT to determine which layers should be eliminated.

In the notebook, replace model_name = "mistralai/Mistral-7B-v0.1" with model_name = "cognitivecomputations/dolphin-2.6-mistral-7b-dpo". I ran the script only for self_attn.v_proj, so also set layer_types=["self_attn.v_proj"].

Order the layers by SNR, descending, and eliminate the top-ranked ones using mergekit. The threshold for elimination is up to you, depending on how many layers you want to remove. I decided to remove 6 layers (indexes: 3, 5, 16, 18, 19, 24).
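
The ranking step can be sketched in a few lines of Python. This is only an illustration: the function name layers_to_remove and the SNR values are made up, and the notebook's actual output format is not assumed here. It expects you to have collected one SNR value per layer index yourself.

```python
def layers_to_remove(snr_by_layer: dict[int, float], n_remove: int) -> list[int]:
    """Return the indexes of the n_remove layers with the highest SNR."""
    # Sort layer indexes by their SNR value, highest first.
    ranked = sorted(snr_by_layer, key=snr_by_layer.get, reverse=True)
    # Keep the top n_remove and return them in ascending layer order.
    return sorted(ranked[:n_remove])


# Hypothetical SNR values for a 5-layer toy model, NOT measured results.
example_snr = {0: 1.2, 1: 0.4, 2: 3.1, 3: 2.7, 4: 0.9}
print(layers_to_remove(example_snr, n_remove=2))  # [2, 3]
```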

Here is the mergekit config. Each layer_range excludes its end index, so the six layers listed above are the ones skipped:

slices:
  - sources:
    - model: "cognitivecomputations/dolphin-2.6-mistral-7b-dpo"
      layer_range: [0, 3]
  - sources:
    - model: "cognitivecomputations/dolphin-2.6-mistral-7b-dpo"
      layer_range: [4, 5]
  - sources:
    - model: "cognitivecomputations/dolphin-2.6-mistral-7b-dpo"
      layer_range: [6, 16]
  - sources:
    - model: "cognitivecomputations/dolphin-2.6-mistral-7b-dpo"
      layer_range: [17, 18]
  - sources:
    - model: "cognitivecomputations/dolphin-2.6-mistral-7b-dpo"
      layer_range: [20, 24]
  - sources:
    - model: "cognitivecomputations/dolphin-2.6-mistral-7b-dpo"
      layer_range: [25, 32]
merge_method: passthrough
dtype: bfloat16
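
For a different set of layers, the layer_range values can be derived rather than written by hand. The sketch below uses a hypothetical helper, build_layer_ranges, and treats each range as [start, end), which is how the config above skips layers 3, 5, 16, 18, 19 and 24. It only computes the ranges; turning them into YAML slices is left to you.

```python
def build_layer_ranges(num_layers: int, removed: list[int]) -> list[list[int]]:
    """Return [start, end) ranges covering every layer except the removed ones."""
    ranges = []
    start = 0
    for idx in sorted(set(removed)):
        if idx > start:
            # Keep the run of layers before this removed index.
            ranges.append([start, idx])
        # Resume just after the removed layer.
        start = idx + 1
    if start < num_layers:
        ranges.append([start, num_layers])
    return ranges


ranges = build_layer_ranges(32, [3, 5, 16, 18, 19, 24])
print(ranges)
# [[0, 3], [4, 5], [6, 16], [17, 18], [20, 24], [25, 32]]
print(sum(end - start for start, end in ranges))  # 26 layers kept
```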

Running mergekit with this configuration produces this model (dolphin-2.6-mistral-7b-dpo-5.93B).
