shizhi-twilight-7B (試製-暮光-7B)
shizhi-twilight-7B is a merge of the following models using LazyMergekit:
- MediaTek-Research/Breeze-7B-Instruct-v0_1
- argilla/CapybaraHermes-2.5-Mistral-7B
This is an experiment to check whether high-quality fine-tuning on one language (English) can be transferred to another (Mandarin) by leveraging the SLERP merge method.
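For context, SLERP (spherical linear interpolation) blends two models' weights along a great circle rather than a straight line, which tends to preserve the scale and geometry of each parent's parameters better than plain averaging. Below is a minimal, hypothetical sketch of the operation on a single pair of weight tensors; the function name and the simplified normalisation are illustrative assumptions, not mergekit's actual implementation.

import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Work on flattened float copies; normalise only to measure the angle.
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    # Angle between the two weight vectors on the unit hypersphere.
    omega = torch.acos(torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1.0 - t) * a + t * b
    # Great-circle interpolation applied to the original (unnormalised) weights.
    out = (torch.sin((1.0 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape).to(a.dtype)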
🧩 Configuration
slices:
  - sources:
      - model: MediaTek-Research/Breeze-7B-Instruct-v0_1
        layer_range: [0, 32]
      - model: argilla/CapybaraHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: MediaTek-Research/Breeze-7B-Instruct-v0_1
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
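The five-value t lists above are anchor points that mergekit spreads across the 32 layers, so self-attention and MLP blocks blend the two parents in mirror-image proportions at different depths, while everything else uses the flat 0.5. The snippet below is a rough, hypothetical approximation of how such a list could expand into per-layer interpolation weights; the helper name and the even linear spacing are assumptions, not mergekit's exact code.

import numpy as np

def expand_anchors(anchors, num_layers=32):
    # Space the anchor points evenly from the first to the last layer,
    # then interpolate linearly to get one t value per layer.
    xs = np.linspace(0, num_layers - 1, num=len(anchors))
    return np.interp(np.arange(num_layers), xs, anchors)

self_attn_t = expand_anchors([0, 0.5, 0.3, 0.7, 1])  # per-layer t for attention blocks
mlp_t = expand_anchors([1, 0.5, 0.7, 0.3, 0])        # the mirror image for MLP blocks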
💻 Usage
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "lipcut/shizhi-twilight-7B"
messages = [{"role": "user", "content": "什麼是大型語言模型?"}]

# Build the prompt with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the merged model for text generation.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a response with temperature and nucleus sampling.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])