---
base_model:
- google/gemma-2-2b-it
tags:
- merge
- mergekit
- lazymergekit
- google/gemma-2-2b-it
---

# gemma-instruct-merge-test_

gemma-instruct-merge-test_ is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it)

## 🧩 Configuration

```yaml
models:
  - model: google/gemma-2-2b
  - model: google/gemma-2-2b-it
    parameters:
      density:
        - filter: model.layers.1.self_attn.q_proj
          value: 0.00539
        - filter: model.layers.2.self_attn.q_proj
          value: 0.03843
        - filter: model.layers.6.self_attn.q_proj
          value: 0.03716
        - filter: model.layers.24.self_attn.q_proj
          value: 0.04552
        - filter: model.layers.25.self_attn.q_proj
          value: 0.03919
        - filter: model.layers.0.self_attn.k_proj
          value: 0.00592
        - filter: model.layers.2.self_attn.k_proj
          value: 0.02603
        - filter: model.layers.3.self_attn.k_proj
          value: 0.07283
        - filter: model.layers.8.self_attn.k_proj
          value: 0.08753
        - filter: model.layers.10.self_attn.k_proj
          value: 0.07783
        - filter: model.layers.23.self_attn.k_proj
          value: 0.05987
        - filter: model.layers.24.self_attn.k_proj
          value: 0.02903
        - filter: model.layers.25.self_attn.k_proj
          value: 0.08715
        - filter: model.layers.0.self_attn.v_proj
          value: 0.03025
        - filter: model.layers.2.self_attn.v_proj
          value: 0.00286
        - filter: model.layers.3.self_attn.v_proj
          value: 0.09155
        - filter: model.layers.6.self_attn.v_proj
          value: 0.06811
        - filter: model.layers.7.self_attn.v_proj
          value: 0.01334
        - filter: model.layers.10.self_attn.v_proj
          value: 0.04
        - filter: model.layers.13.self_attn.v_proj
          value: 0.09347
        - filter: model.layers.24.self_attn.v_proj
          value: 0.07956
        - filter: model.layers.0.self_attn.o_proj
          value: 0.03265
        - filter: model.layers.2.self_attn.o_proj
          value: 0.06134
        - filter: model.layers.4.self_attn.o_proj
          value: 0.07924
        - filter: model.layers.6.self_attn.o_proj
          value: 0.09982
        - filter: model.layers.7.self_attn.o_proj
          value: 0.02826
        - filter: model.layers.8.self_attn.o_proj
          value: 0.03906
        - filter: model.layers.19.self_attn.o_proj
          value: 0.09507
        - filter: model.layers.23.self_attn.o_proj
          value: 0.00282
        - filter: model.layers.24.self_attn.o_proj
          value: 0.09864
        - filter: model.layers.25.self_attn.o_proj
          value: 0.00961
        - filter: model.layers.1.mlp.gate_proj
          value: 0.08775
        - filter: model.layers.2.mlp.gate_proj
          value: 0.0001
        - filter: model.layers.6.mlp.gate_proj
          value: 0.06577
        - filter: model.layers.12.mlp.gate_proj
          value: 0.02651
        - filter: model.layers.13.mlp.gate_proj
          value: 0.04687
        - filter: model.layers.15.mlp.gate_proj
          value: 0.03147
        - filter: model.layers.16.mlp.gate_proj
          value: 0.05726
        - filter: model.layers.17.mlp.gate_proj
          value: 0.04511
        - filter: model.layers.23.mlp.gate_proj
          value: 0.08641
        - filter: model.layers.1.mlp.up_proj
          value: 0.06887
        - filter: model.layers.6.mlp.up_proj
          value: 0.07411
        - filter: model.layers.7.mlp.up_proj
          value: 0.05424
        - filter: model.layers.12.mlp.up_proj
          value: 0.08044
        - filter: model.layers.13.mlp.up_proj
          value: 0.0021
        - filter: model.layers.14.mlp.up_proj
          value: 0.26389
        - filter: model.layers.15.mlp.up_proj
          value: 0.06886
        - filter: model.layers.23.mlp.up_proj
          value: 0.02931
        - filter: model.layers.0.mlp.down_proj
          value: 0.06756
        - filter: model.layers.1.mlp.down_proj
          value: 0.03746
        - filter: model.layers.2.mlp.down_proj
          value: 0.09104
        - filter: model.layers.3.mlp.down_proj
          value: 0.06643
        - filter: model.layers.4.mlp.down_proj
          value: 0.05003
        - filter: model.layers.5.mlp.down_proj
          value: 0.0406
        - filter: model.layers.6.mlp.down_proj
          value: 0.01609
        - filter: model.layers.7.mlp.down_proj
          value: 0.09629
        - filter: model.layers.8.mlp.down_proj
          value: 0.08912
        - filter: model.layers.10.mlp.down_proj
          value: 0.04635
        - filter: model.layers.11.mlp.down_proj
          value: 0.0099
        - filter: model.layers.12.mlp.down_proj
          value: 0.03487
        - filter: model.layers.13.mlp.down_proj
          value: 0.04977
        - filter: model.layers.14.mlp.down_proj
          value: 0.00393
        - filter: model.layers.15.mlp.down_proj
          value: 0.00748
        - filter: model.layers.16.mlp.down_proj
          value: 0.06696
        - filter: model.layers.17.mlp.down_proj
          value: 0.02067
        - filter: model.layers.19.mlp.down_proj
          value: 0.009
        - filter: model.layers.20.mlp.down_proj
          value: 0.0215
        - filter: model.layers.21.mlp.down_proj
          value: 0.04196
        - filter: model.layers.22.mlp.down_proj
          value: 0.06326
        - filter: model.layers.25.mlp.down_proj
          value: 0.04905
      weight:
        - value: 1
merge_method: ties
base_model: google/gemma-2-2b
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: union
```

## 💻 Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "choprahetarth/gemma-instruct-merge-test_"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the merged model into a text-generation pipeline.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a response.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
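
To reproduce the merge itself rather than just run inference, the configuration above can be saved to a YAML file and passed to mergekit outside of the LazyMergekit notebook. The snippet below is a minimal sketch using mergekit's Python API; the file name `config.yaml`, the output directory, and the option values are illustrative assumptions, not part of this card.

```python
# Minimal sketch (not the exact LazyMergekit invocation): apply the merge config with mergekit.
# Assumes the YAML block above has been saved to "config.yaml"; paths and options are
# illustrative assumptions, not taken from this card.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./gemma-instruct-merge-test_",  # assumed output directory
    options=MergeOptions(
        cuda=False,           # set True to run the merge on a GPU
        copy_tokenizer=True,  # write a tokenizer alongside the merged weights
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```

The resulting directory can then be loaded with `transformers` exactly as in the usage example above, or uploaded to the Hub.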