---
base_model:
- tokyotech-llm/Swallow-70b-NVE-instruct-hf
- dreamgen/opus-v0.5-70b
- GOAT-AI/GOAT-70B-Storytelling
- Doctor-Shotgun/lzlv-limarpv3-l2-70b
- alac/Waxwing-Storytelling-70B-LoRA
tags:
- mergekit
- merge
language:
- en
- ja
library_name: transformers
pipeline_tag: text-generation
license: llama2
model_type: llama
---
# Swallow-70b-NVE-RP

**Important Notice:** For personal and academic use only.

## Description

This model is suited to role-playing and storytelling; it is not a good choice for multi-turn chat. It was created for personal and academic use only.

This merge uses only fine-tuned Llama 2 models, but the commercial-use licensing of some of them is unclear. If there is a licensing problem, the rights holder should contact me directly. No license changes will be made in response to contact from anyone other than the rights holder.

## Test environment

This model was tested using [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main). I used the `simple-1` preset and the `Null preset` for generation.

### Recommendation

Use the `simple-1` settings:
- temperature: 0.7
- top_p: 0.9
- repetition_penalty: 1.15
- top_k: 20

### Tested `temperature` Range

- temperature: 0.3 - 1.0

### Tested `repetition_penalty` Range

- repetition_penalty: 1.0 - 1.15

## Prompt template

### Swallow Style (Alpaca format)

```
以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。リクエストを適切に完了するための回答を記述してください。

### 指示:
{instruction}

### 応答:

```

Although not fully tested, the prompt formats of [Doctor-Shotgun/lzlv-limarpv3-l2-70b](https://huggingface.co/Doctor-Shotgun/lzlv-limarpv3-l2-70b) and [alac/Waxwing-Storytelling-70B-LoRA](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA) can also be used.

## Use the instruct model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "nitky/Swallow-70b-NVE-RP"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    device_map="auto",
    load_in_4bit=True,
)

PROMPT_DICT = {
    "prompt_input": (
        "以下に、あるタスクを説明する指示があり、それに付随する入力が更なる文脈を提供しています。"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n### 入力:\n{input}\n\n### 応答:"
    ),
    "prompt_no_input": (
        "以下に、あるタスクを説明する指示があります。"
        "リクエストを適切に完了するための回答を記述してください。\n\n"
        "### 指示:\n{instruction}\n\n### 応答:"
    ),
}

def create_prompt(instruction, input=None):
    """
    Generates a prompt based on the given instruction and an optional input.
    If input is provided, it uses the 'prompt_input' template from PROMPT_DICT.
    If no input is provided, it uses the 'prompt_no_input' template.

    Args:
        instruction (str): The instruction describing the task.
        input (str, optional): Additional input providing context for the task. Default is None.

    Returns:
        str: The generated prompt.
    """
    if input:
        # Use the 'prompt_input' template when additional input is provided
        return PROMPT_DICT["prompt_input"].format(instruction=instruction, input=input)
    else:
        # Use the 'prompt_no_input' template when no additional input is provided
        return PROMPT_DICT["prompt_no_input"].format(instruction=instruction)

# Example usage
instruction_example = "以下のトピックに関する詳細な情報を提供してください。"
input_example = "東京工業大学の主なキャンパスについて教えてください"
prompt = create_prompt(instruction_example, input_example)

input_ids = tokenizer.encode(
    prompt,
    add_special_tokens=False,
    return_tensors="pt"
)

tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.15,
    top_k=20,
    do_sample=True,
)

out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
```
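Note that the example above passes `load_in_4bit=True` directly to `from_pretrained`, which requires the `bitsandbytes` package. In newer versions of `transformers`, 4-bit loading is usually configured through a `BitsAndBytesConfig` object instead; the following untested sketch shows that variant and is otherwise interchangeable with the loading code above.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Untested sketch: equivalent 4-bit loading configured via BitsAndBytesConfig
# (requires the bitsandbytes package).
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "nitky/Swallow-70b-NVE-RP",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    device_map="auto",
    quantization_config=quantization_config,
)
```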
""" if input: # Use the 'prompt_input' template when additional input is provided return PROMPT_DICT["prompt_input"].format(instruction=instruction, input=input) else: # Use the 'prompt_no_input' template when no additional input is provided return PROMPT_DICT["prompt_no_input"].format(instruction=instruction) # Example usage instruction_example = "以下のトピックに関する詳細な情報を提供してください。" input_example = "東京工業大学の主なキャンパスについて教えてください" prompt = create_prompt(instruction_example, input_example) input_ids = tokenizer.encode( prompt, add_special_tokens=False, return_tensors="pt" ) tokens = model.generate( input_ids.to(device=model.device), max_new_tokens=200, temperature=0.7, top_p=0.9, repetition_penalty=1.15, top_k=20, do_sample=True, ) out = tokenizer.decode(tokens[0], skip_special_tokens=True) print(out) ``` ## Merge Details ### Merge Method This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) and the SLERP merge method using [tokyotech-llm/Swallow-70b-NVE-instruct-hf](https://huggingface.co/tokyotech-llm/Swallow-70b-NVE-instruct-hf) as a base. ### Models Merged The following models were included in the merge: * [GOAT-AI/GOAT-70B-Storytelling](https://huggingface.co/GOAT-AI/GOAT-70B-Storytelling) * [dreamgen/opus-v0.5-70b](https://huggingface.co/dreamgen/opus-v0.5-70b) * [Doctor-Shotgun/lzlv-limarpv3-l2-70b](Doctor-Shotgun/lzlv-limarpv3-l2-70b) * [LoRA] [alac/Waxwing-Storytelling-70B-LoRA](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA) ### Configuration The command example: ```bash # please change the path and options according to your environment mergekit-mega --cuda --lora-merge-cache ~/text-generation-webui/loras/models--alac--Waxwing-Storytelling-70B-LoRA Swallow-70b-NVE-RP.yml ~/text-generation-webui/models ``` The following YAML configuration was used to produce this model: ```yaml models: - model: tokyotech-llm/Swallow-70b-NVE-instruct-hf # no parameters necessary for base model - model: GOAT-AI/GOAT-70B-Storytelling # storytelling parameters: density: 1 weight: 0.25 - model: dreamgen/opus-v0.5-70b # creative roleplay parameters: density: 1 weight: 0.25 merge_method: dare_ties base_model: tokyotech-llm/Swallow-70b-NVE-instruct-hf dtype: bfloat16 name: Swallow-70b-NVE-RP-base --- models: - model: tokyotech-llm/Swallow-70b-NVE-instruct-hf # no parameters necessary for base model - model: Doctor-Shotgun/lzlv-limarpv3-l2-70b # roleplay configuration parameters: density: 1 weight: 0.25 merge_method: dare_ties base_model: tokyotech-llm/Swallow-70b-NVE-instruct-hf dtype: bfloat16 name: Swallow-70b-NVE-RP-flavor --- slices: - sources: - model: Swallow-70b-NVE-RP-base layer_range: [0, 80] - model: Swallow-70b-NVE-RP-flavor layer_range: [0, 80] merge_method: slerp base_model: Swallow-70b-NVE-RP-base parameters: t: - filter: self_attn value: [0, 0.5, 0.3, 0.7, 1] - filter: mlp value: [1, 0.5, 0.7, 0.3, 0] - value: 0.5 # fallback for rest of tensors dtype: bfloat16 name: Swallow-70b-NVE-RP ```