|
--- |
|
base_model: |
|
- google/gemma-2-2b-it |
|
tags: |
|
- merge |
|
- mergekit |
|
- lazymergekit |
|
- google/gemma-2-2b-it |
|
--- |
|
|
|
# gemma-instruct-merge-test_ |
|
|
|
gemma-instruct-merge-test_ is a TIES merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing), with [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) as the base model:
|
* [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) |
|
|
|
## 🧩 Configuration |
|
|
|
```yaml
models:
  - model: google/gemma-2-2b
  - model: google/gemma-2-2b-it
    parameters:
      density:
        - filter: model.layers.1.self_attn.q_proj
          value: 0.00539
        - filter: model.layers.2.self_attn.q_proj
          value: 0.03843
        - filter: model.layers.6.self_attn.q_proj
          value: 0.03716
        - filter: model.layers.24.self_attn.q_proj
          value: 0.04552
        - filter: model.layers.25.self_attn.q_proj
          value: 0.03919
        - filter: model.layers.0.self_attn.k_proj
          value: 0.00592
        - filter: model.layers.2.self_attn.k_proj
          value: 0.02603
        - filter: model.layers.3.self_attn.k_proj
          value: 0.07283
        - filter: model.layers.8.self_attn.k_proj
          value: 0.08753
        - filter: model.layers.10.self_attn.k_proj
          value: 0.07783
        - filter: model.layers.23.self_attn.k_proj
          value: 0.05987
        - filter: model.layers.24.self_attn.k_proj
          value: 0.02903
        - filter: model.layers.25.self_attn.k_proj
          value: 0.08715
        - filter: model.layers.0.self_attn.v_proj
          value: 0.03025
        - filter: model.layers.2.self_attn.v_proj
          value: 0.00286
        - filter: model.layers.3.self_attn.v_proj
          value: 0.09155
        - filter: model.layers.6.self_attn.v_proj
          value: 0.06811
        - filter: model.layers.7.self_attn.v_proj
          value: 0.01334
        - filter: model.layers.10.self_attn.v_proj
          value: 0.04
        - filter: model.layers.13.self_attn.v_proj
          value: 0.09347
        - filter: model.layers.24.self_attn.v_proj
          value: 0.07956
        - filter: model.layers.0.self_attn.o_proj
          value: 0.03265
        - filter: model.layers.2.self_attn.o_proj
          value: 0.06134
        - filter: model.layers.4.self_attn.o_proj
          value: 0.07924
        - filter: model.layers.6.self_attn.o_proj
          value: 0.09982
        - filter: model.layers.7.self_attn.o_proj
          value: 0.02826
        - filter: model.layers.8.self_attn.o_proj
          value: 0.03906
        - filter: model.layers.19.self_attn.o_proj
          value: 0.09507
        - filter: model.layers.23.self_attn.o_proj
          value: 0.00282
        - filter: model.layers.24.self_attn.o_proj
          value: 0.09864
        - filter: model.layers.25.self_attn.o_proj
          value: 0.00961
        - filter: model.layers.1.mlp.gate_proj
          value: 0.08775
        - filter: model.layers.2.mlp.gate_proj
          value: 0.0001
        - filter: model.layers.6.mlp.gate_proj
          value: 0.06577
        - filter: model.layers.12.mlp.gate_proj
          value: 0.02651
        - filter: model.layers.13.mlp.gate_proj
          value: 0.04687
        - filter: model.layers.15.mlp.gate_proj
          value: 0.03147
        - filter: model.layers.16.mlp.gate_proj
          value: 0.05726
        - filter: model.layers.17.mlp.gate_proj
          value: 0.04511
        - filter: model.layers.23.mlp.gate_proj
          value: 0.08641
        - filter: model.layers.1.mlp.up_proj
          value: 0.06887
        - filter: model.layers.6.mlp.up_proj
          value: 0.07411
        - filter: model.layers.7.mlp.up_proj
          value: 0.05424
        - filter: model.layers.12.mlp.up_proj
          value: 0.08044
        - filter: model.layers.13.mlp.up_proj
          value: 0.0021
        - filter: model.layers.14.mlp.up_proj
          value: 0.26389
        - filter: model.layers.15.mlp.up_proj
          value: 0.06886
        - filter: model.layers.23.mlp.up_proj
          value: 0.02931
        - filter: model.layers.0.mlp.down_proj
          value: 0.06756
        - filter: model.layers.1.mlp.down_proj
          value: 0.03746
        - filter: model.layers.2.mlp.down_proj
          value: 0.09104
        - filter: model.layers.3.mlp.down_proj
          value: 0.06643
        - filter: model.layers.4.mlp.down_proj
          value: 0.05003
        - filter: model.layers.5.mlp.down_proj
          value: 0.0406
        - filter: model.layers.6.mlp.down_proj
          value: 0.01609
        - filter: model.layers.7.mlp.down_proj
          value: 0.09629
        - filter: model.layers.8.mlp.down_proj
          value: 0.08912
        - filter: model.layers.10.mlp.down_proj
          value: 0.04635
        - filter: model.layers.11.mlp.down_proj
          value: 0.0099
        - filter: model.layers.12.mlp.down_proj
          value: 0.03487
        - filter: model.layers.13.mlp.down_proj
          value: 0.04977
        - filter: model.layers.14.mlp.down_proj
          value: 0.00393
        - filter: model.layers.15.mlp.down_proj
          value: 0.00748
        - filter: model.layers.16.mlp.down_proj
          value: 0.06696
        - filter: model.layers.17.mlp.down_proj
          value: 0.02067
        - filter: model.layers.19.mlp.down_proj
          value: 0.009
        - filter: model.layers.20.mlp.down_proj
          value: 0.0215
        - filter: model.layers.21.mlp.down_proj
          value: 0.04196
        - filter: model.layers.22.mlp.down_proj
          value: 0.06326
        - filter: model.layers.25.mlp.down_proj
          value: 0.04905
      weight:
        - value: 1
merge_method: ties
base_model: google/gemma-2-2b
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: union
|
``` |
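The merge can be reproduced by saving the YAML above (e.g. as `config.yaml`) and running it through [mergekit](https://github.com/arcee-ai/mergekit). Below is a minimal sketch using mergekit's Python entry point; the option names follow mergekit's README and may differ across versions, and the output path is just a placeholder.

```python
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the merge recipe shown above (assumed saved as config.yaml).
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the TIES merge and write the result to a local directory (placeholder path).
run_merge(
    merge_config,
    out_path="./gemma-instruct-merge-test_",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU if one is available
        copy_tokenizer=True,             # copy the tokenizer into the output directory
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```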
|
|
|
## 💻 Usage |
|
|
|
```python
# Install dependencies (notebook syntax; drop the leading "!" in a regular shell)
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "choprahetarth/gemma-instruct-merge-test_"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline and sample a response
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
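
If you prefer loading the model directly rather than going through a pipeline, a minimal sketch in the style of the upstream gemma-2-2b-it examples is shown below; it uses bfloat16 (the dtype the merge was written out in), and the sampling settings are purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "choprahetarth/gemma-instruct-merge-test_"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

# Format the conversation with the model's chat template and generate a reply.
messages = [{"role": "user", "content": "What is a large language model?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```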