Llama3-Evolve-v3 / README.md
metadata
base_model: []
library_name: transformers
tags:
  - mergekit
  - merge

final_merge

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the DARE TIES merge method, with /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_our_data_rope_3108389863 as the base model.
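
For readers unfamiliar with the method, the following is a minimal pure-Python sketch of the idea behind DARE TIES on flat lists of numbers. It is an illustration under simplifying assumptions, not mergekit's implementation: the function names `dare_prune` and `dare_ties_merge` are hypothetical, tensors are replaced by plain lists, and normalization is assumed to divide by the summed weights of the contributions that agree with the elected sign.

```python
import random


def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density`, rescale survivors by 1/density."""
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, densities, weights, normalize=True, seed=0):
    """Merge fine-tuned parameter lists into `base` (hypothetical helper, simplified)."""
    rng = random.Random(seed)

    # Build one weighted, pruned task vector (model minus base) per fine-tuned model.
    weighted_deltas = []
    for model, density, weight in zip(finetuned, densities, weights):
        delta = [m - b for m, b in zip(model, base)]
        pruned = dare_prune(delta, density, rng)
        weighted_deltas.append([weight * p for p in pruned])

    merged = []
    for i, b in enumerate(base):
        column = [d[i] for d in weighted_deltas]
        # TIES step: elect a sign per parameter from the summed weighted deltas.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Keep only contributions whose sign agrees with the elected one.
        agreeing = [(c, w) for c, w in zip(column, weights) if c * sign > 0]
        update = sum(c for c, _ in agreeing)
        if normalize:
            total_weight = sum(w for _, w in agreeing)
            update = update / total_weight if total_weight else 0.0
        merged.append(b + update)
    return merged


# Toy example with density 1.0, so nothing is dropped and the result is deterministic.
base = [0.0, 1.0]
model_a = [1.0, 1.0]
model_b = [-0.5, 3.0]
result = dare_ties_merge(base, [model_a, model_b], [1.0, 1.0], [0.5, 0.5])
print(result)  # [1.0, 3.0]
```

In the toy example, the first parameter receives conflicting updates; the positive sign wins, so only `model_a` contributes to it. With a density below 1.0, entries are dropped at random and the output depends on the seed.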

Models Merged

The following models were included in the merge:

  • /content/drive/MyDrive/evolve-merge-v3/input_models/llama3_sft_many_chat_3038212730
  • /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_not_our_data_rope_1173759365

Configuration

The following YAML configuration was used to produce this model:

base_model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_our_data_rope_3108389863
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 8]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_our_data_rope_3108389863
    parameters:
      density: 1.0
      weight: 0.42824024801355876
  - layer_range: [0, 8]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_not_our_data_rope_1173759365
    parameters:
      density: 1.0
      weight: 0.4207395136362646
  - layer_range: [0, 8]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llama3_sft_many_chat_3038212730
    parameters:
      density: 0.9260851266272763
      weight: 0.540457958041171
- sources:
  - layer_range: [8, 16]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_our_data_rope_3108389863
    parameters:
      density: 0.9833172640328212
      weight: 0.268067688260763
  - layer_range: [8, 16]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_not_our_data_rope_1173759365
    parameters:
      density: 0.9285327276319923
      weight: 0.21816783096494735
  - layer_range: [8, 16]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llama3_sft_many_chat_3038212730
    parameters:
      density: 0.7425251863806608
      weight: 0.3916952231138195
- sources:
  - layer_range: [16, 24]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_our_data_rope_3108389863
    parameters:
      density: 0.9471168686712123
      weight: 0.394119335607417
  - layer_range: [16, 24]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_not_our_data_rope_1173759365
    parameters:
      density: 1.0
      weight: 0.1896990456842347
  - layer_range: [16, 24]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llama3_sft_many_chat_3038212730
    parameters:
      density: 1.0
      weight: 0.37101085991686256
- sources:
  - layer_range: [24, 32]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_our_data_rope_3108389863
    parameters:
      density: 1.0
      weight: 0.4567962692223412
  - layer_range: [24, 32]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_not_our_data_rope_1173759365
    parameters:
      density: 1.0
      weight: 0.1946028047831424
  - layer_range: [24, 32]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llama3_sft_many_chat_3038212730
    parameters:
      density: 0.9040231973058573
      weight: 0.7157785215240597
- sources:
  - layer_range: [32, 40]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_our_data_rope_3108389863
    parameters:
      density: 0.9456870656985836
      weight: 0.3411963802463747
  - layer_range: [32, 40]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llma3_manydata_not_our_data_rope_1173759365
    parameters:
      density: 0.6103424321359935
      weight: 0.2728333755777217
  - layer_range: [32, 40]
    model: /content/drive/MyDrive/evolve-merge-v3/input_models/llama3_sft_many_chat_3038212730
    parameters:
      density: 1.0
      weight: 0.6645674564408244
tokenizer_source: base
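
As a quick sanity check on the configuration above, the sketch below verifies that the five slices tile the layer stack without gaps or overlaps. The layer ranges are copied from the YAML; the helper name `check_contiguous` is hypothetical and is not part of mergekit.

```python
def check_contiguous(layer_ranges):
    """Return the total layer count if the ranges tile [0, N) with no gaps or overlaps."""
    expected_start = 0
    for start, end in layer_ranges:
        if end <= start:
            raise ValueError(f"empty or reversed range: [{start}, {end}]")
        if start != expected_start:
            raise ValueError(f"expected a slice starting at {expected_start}, got {start}")
        expected_start = end
    return expected_start


# layer_range values from the five slices in the configuration above.
slice_ranges = [(0, 8), (8, 16), (16, 24), (24, 32), (32, 40)]
total_layers = check_contiguous(slice_ranges)
print(total_layers)  # 40
```

Each slice spans 8 layers, so the merged model covers layers 0 through 39.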