Oumuamua-7b-base
This is a merge of pre-trained language models created using mergekit.
Output example
Input text
日本で最も高い山の名前は
(English: "The name of the highest mountain in Japan is")
Output text
日本で最も高い山の名前は、富士山。
その標高は3776メートル。
世界でも20位以内に入る高さを誇る。
その富士山の麓にあるのが、静岡県富士市。
富士市は、富士山の麓にあるため、観光地としても有名である。
富士山の麓にあることから、富士市は観光地としても有名である。
富士山を眺めることができるスポットが多く、特に富士市の中心部から見る富士山は、その美しさから「日本一の眺望」と言われている。
(English: "The name of the highest mountain in Japan is Mount Fuji. Its elevation is 3,776 meters, a height that ranks within the top 20 in the world. At the foot of Mount Fuji lies Fuji City in Shizuoka Prefecture. Because it sits at the foot of Mount Fuji, Fuji City is also famous as a tourist destination. Being at the foot of Mount Fuji, Fuji City is also famous as a tourist destination. There are many spots from which Mount Fuji can be viewed, and the view from central Fuji City in particular is said to be 'the finest view in Japan' for its beauty.")
Test environment
This model was tested using text-generation-webui. I used the min_p preset, as well as the Null preset with temperature=0.3, for generation.
Usage
Use the base model
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "nitky/Oumuamua-7b-base"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Plain completion prompt; this is a base model, so no chat template is applied.
prompt = "日本で最も高い山の名前は"

input_ids = tokenizer.encode(
    prompt,
    add_special_tokens=False,
    return_tensors="pt",
)

tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=256,
    do_sample=True,
    temperature=0.3,
)

out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
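To mirror the text-generation-webui min_p preset mentioned in the test environment above, recent versions of transformers also accept a min_p sampling parameter in generate. A sketch (the 0.05 cutoff is text-generation-webui's preset default and an assumption here, not a value stated for this model):

tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=256,
    do_sample=True,
    temperature=0.3,
    min_p=0.05,  # assumed cutoff, matching webui's min_p preset default; tune to taste
)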
Merge Details
Merge Method
This model was merged using the Model Stock merge method, using Mistral-7B-v0.1-VE-Swallow-MS (the vocabulary-expanded intermediate defined in the configuration below) as the base.
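For intuition: Model Stock (Jang et al., 2024) interpolates each tensor between the average of the merged models and the base model, with a ratio derived from the angle between their task vectors. A minimal per-tensor sketch of the published formula (an illustration only, not mergekit's actual implementation):

import torch
import torch.nn.functional as F

def model_stock(base: torch.Tensor, finetuned: list[torch.Tensor]) -> torch.Tensor:
    """Merge one tensor from N >= 2 fine-tuned models with its base anchor."""
    # Task vectors: how each fine-tuned model moved away from the base.
    vecs = [(w - base).flatten() for w in finetuned]
    n = len(vecs)
    # Average pairwise cosine similarity between task vectors.
    pairs = [F.cosine_similarity(vecs[i], vecs[j], dim=0)
             for i in range(n) for j in range(i + 1, n)]
    cos = torch.stack(pairs).mean()
    # Interpolation ratio from the paper: near-orthogonal task vectors
    # (cos -> 0) pull the merge back toward the base weights.
    t = n * cos / (1 + (n - 1) * cos)
    avg = torch.stack(finetuned).mean(dim=0)
    return t * avg + (1 - t) * base

Here N = 2 (tokyotech-llm/Swallow-MS-7b-v0.1 and the Oumuamua-7b-base-preset intermediate), so a single pairwise cosine per tensor decides how far the merge moves away from the base.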
Models Merged
The following models were included in the merge:
- tokyotech-llm/Swallow-MS-7b-v0.1
- mistralai/Mistral-7B-v0.1
- nitky/Flavor-7b
- stabilityai/japanese-stablelm-base-gamma-7b
Configuration
The following YAML configuration was used to produce this model:
merge_method: task_arithmetic
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: Mistral-7B-v0.1-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: nitky/Flavor-7b # private model
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: Flavor-7b-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: stabilityai/japanese-stablelm-base-gamma-7b
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: japanese-stablelm-base-gamma-7b-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight: 1.0
  - model: Flavor-7b-VE-Swallow-MS
    parameters:
      weight: 0.5
  - model: japanese-stablelm-base-gamma-7b-VE-Swallow-MS
    parameters:
      weight: -0.5
dtype: bfloat16
name: Oumuamua-7b-base-preset
---
merge_method: model_stock
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
  - model: Oumuamua-7b-base-preset
dtype: bfloat16
name: Oumuamua-7b-base
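Reading the pipeline top to bottom: the first three task_arithmetic stages are vocabulary expansion (VE) steps. Because the weight is 1.0 on embed_tokens and 0 everywhere else, each donor model keeps its own transformer weights but takes its embeddings (and, via tokenizer_source, its tokenizer) from Swallow-MS-7b-v0.1. The fourth stage builds Oumuamua-7b-base-preset by adding the Swallow-MS task vector at weight 1.0 and half of Flavor-7b's, while subtracting half of japanese-stablelm-base-gamma-7b's; the final model_stock stage then blends that preset with Swallow-MS over the shared VE base. A minimal per-tensor sketch of the task_arithmetic rule (a hypothetical helper for intuition, not mergekit's code; it ignores the embedding-matrix resizing that tokenizer_source triggers):

import torch

def task_arithmetic(base: torch.Tensor,
                    models: list[tuple[torch.Tensor, float]]) -> torch.Tensor:
    # result = base + sum_i weight_i * (model_i - base)
    merged = base.clone()
    for tensor, weight in models:
        merged = merged + weight * (tensor - base)
    return merged

# Per the VE configs above, for each donor model:
#   embed_tokens:      base + 1.0 * (swallow - base) -> Swallow-MS embeddings
#   all other tensors: base + 0 * (...)              -> donor weights unchanged

Assuming each YAML document above is saved to its own file, the stages can be run in order with mergekit's mergekit-yaml CLI, feeding each intermediate output directory into the later stages.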