metadata
language:
- ja
tags:
- merge
- mergekit
- lazymergekit
- SakanaAI/EvoLLM-JP-A-v1-7B
- stabilityai/japanese-stablelm-base-gamma-7b
base_model:
- SakanaAI/EvoLLM-JP-A-v1-7B
- stabilityai/japanese-stablelm-base-gamma-7b
Hinoki-Sak-Sta-slerp-7B
Hinoki-Sak-Sta-slerp-7B is a merge of the following models, made with Maxime Labonne's LazyMergekit, which is powered by Arcee AI's MergeKit:
- SakanaAI/EvoLLM-JP-A-v1-7B
- stabilityai/japanese-stablelm-base-gamma-7b
🧩 Configuration
slices:
  - sources:
      - model: SakanaAI/EvoLLM-JP-A-v1-7B
        layer_range: [0, 32]
      - model: stabilityai/japanese-stablelm-base-gamma-7b
        layer_range: [0, 32]
merge_method: slerp
base_model: SakanaAI/EvoLLM-JP-A-v1-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
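To make the `slerp` method and the `t` lists above more concrete, here is a minimal pure-Python sketch of spherical linear interpolation between two weight vectors, plus one plausible way the five anchor values could be spread over the 32 layers. It is an illustration, not MergeKit's actual code: the helper names `slerp` and `layer_t` are hypothetical, weights are treated as flat lists of floats, and the linear spreading of anchors across layers is an assumption. In this sketch `t = 0` returns the first (base) vector and `t = 1` returns the second.

```python
import math

# Anchor values copied from the configuration above.
SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]
MLP_T = [1, 0.5, 0.7, 0.3, 0]


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length weight vectors.

    t = 0 returns v0 (the base model's weights), t = 1 returns v1.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1 + eps)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 1.0 - 1e-6:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def layer_t(anchors, layer_idx, num_layers):
    """Assumed schedule: spread the anchors evenly over the layers and
    interpolate linearly between neighbouring anchors."""
    if len(anchors) == 1 or num_layers <= 1:
        return float(anchors[0])
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = min(int(pos), len(anchors) - 2)
    frac = pos - lo
    return (1.0 - frac) * anchors[lo] + frac * anchors[lo + 1]


if __name__ == "__main__":
    # Print the assumed per-layer interpolation factors for both filters.
    for layer in range(32):
        attn_t = layer_t(SELF_ATTN_T, layer, 32)
        mlp_t = layer_t(MLP_T, layer, 32)
        print(f"layer {layer:2d}: self_attn t={attn_t:.3f}  mlp t={mlp_t:.3f}")
```

Under these assumptions, the attention weights of the first layers stay close to SakanaAI/EvoLLM-JP-A-v1-7B and drift toward stabilityai/japanese-stablelm-base-gamma-7b in the last layers, while the MLP weights follow the opposite schedule; all other tensors use the fixed `t = 0.5`.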
💻 Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "AkimfromParis/Hinoki-Sak-Sta-slerp-7B"

# Load the tokenizer and model; dtype and device placement are picked automatically.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")
model.eval()

requests = [
    "大谷翔平選手について教えてください",  # "Please tell me about Shohei Ohtani"
]

# Prompt template: a system preamble followed by the USER/ASSISTANT turn.
system_message = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {user_input} ASSISTANT:"

for req in requests:
    input_req = system_message.format(user_input=req)
    input_ids = tokenizer.encode(input_req, return_tensors="pt").to(device=model.device)
    tokens = model.generate(
        input_ids,
        max_new_tokens=1024,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens, skipping the prompt.
    out = tokenizer.decode(tokens[0][len(input_ids[0]):], skip_special_tokens=True)
    print("USER:\n" + req)
    print("ASSISTANT:\n" + out)
    print()
Citation
@article{goddard2024arcee,
  title={Arcee's MergeKit: A Toolkit for Merging Large Language Models},
  author={Goddard, Charles and Siriwardhana, Shamane and Ehghaghi, Malikeh and Meyers, Luke and Karpukhin, Vlad and Benedict, Brian and McQuade, Mark and Solawetz, Jacob},
  journal={arXiv preprint arXiv:2403.13257},
  year={2024}
}
arxiv.org/abs/2403.13257