Henk717 committed on
Commit
3eea904
1 Parent(s): b7b7bc9

Update README.md

  1. README.md +153 -5
README.md CHANGED
@@ -1,12 +1,10 @@
  ---
- license: llama2
  ---
- This is the GGML version of LaMA 2 Holomax 13B for use with [Koboldcpp](https://koboldai.org/cpp) for the original HF compatible release please check [KoboldAI/LLaMA2-13B-Holomax](https://huggingface.co/KoboldAI/LLaMA2-13B-Holomax)
-
  # LLaMA 2 Holomax 13B - The writers version of Mythomax

  This is an expansion merge to the well praised Mythomax model from Gryphe (60%) using MrSeeker's KoboldAI Holodeck model (40%)
- The goal of this model is to enhance story writing capabilities while preserving the desirable traits of the Mythomax model.

  Testers found that this model passes the InteracTV benchmark, was useful for story writing, chatting and text adventures using Instruction mode.
  Preservation of factual knowledge has not been tested since we expect the original to be better in those use cases as this merge was focussed on fiction.
@@ -38,4 +36,154 @@ Instruction goes here

  ### Response:
  ```
- But if you have a different preferred format that works on one of the models above it will likely still work.
  ---
+ license: other
  ---

  # LLaMA 2 Holomax 13B - The writers version of Mythomax

  This is an expansion merge to the well praised Mythomax model from Gryphe (60%) using MrSeeker's KoboldAI Holodeck model (40%)
+ The goal of this model is to enhance story writing capabilities while preserving the desirable traits of the Mythomax model as much as possible (It does limit chat reply length).

  Testers found that this model passes the InteracTV benchmark, was useful for story writing, chatting and text adventures using Instruction mode.
  Preservation of factual knowledge has not been tested since we expect the original to be better in those use cases as this merge was focussed on fiction.

  ### Response:
  ```
+ But if you have a different preferred format that works on one of the models above it will likely still work.
+
+ ## License
+ After publishing the model, we were informed that one of the upstream origin models was uploaded under the AGPLv3. It is currently unknown what effect this has on this model, because all weights have been modified and none of the original weights are intact.
+ At the time of publishing (and of writing this message), both merged models, Holodeck and Mythomax, were licensed under Llama2, so the Llama2 license applies to this model.
+ However, Holodeck contains a non-commercial clause and may only be used for research or private use, while Limarp is licensed under the AGPLv3.
+ The AGPLv3 conflicts with the commercial usage restrictions of the Llama2 license, so we assume this aspect does not apply and that the authors intended for the commercial usage restrictions to remain in place.
+ As a result, we have decided to leave the model available for public download on the assumption that all involved authors intend for it to be licensed with commercial restrictions / llama2 restrictions in place, but with the further rights and freedoms the AGPLv3 grants a user.
+
+ If HF informs us that this assumption is incorrect and requests that we take this model down, we will republish the model in the form of the original merging script that was used to create the end result.
+ To comply with the AGPLv3 aspect, the "source" of this model is as follows (because this model is made on a binary level, we can only provide the script that created the model):
+ ```
+ import json
+ import os
+ import shutil
+ import subprocess
+ from tkinter.filedialog import askdirectory, askopenfilename
+
+ import torch
+ from colorama import Fore, Style, init
+ from transformers import (AutoModel, AutoModelForCausalLM, AutoTokenizer,
+                           LlamaConfig, LlamaForCausalLM, LlamaTokenizer,
+                           PreTrainedTokenizer, PreTrainedTokenizerFast)
+
+ newline = '\n'
+ def clear_console():
+     if os.name == "nt": # For Windows
+         subprocess.call("cls", shell=True)
+     else: # For Linux and macOS
+         subprocess.call("clear", shell=True)
+
+ clear_console()
+ print(f"{Fore.YELLOW}Starting script, please wait...{Style.RESET_ALL}")
+
+ #mixer output settings
+ blend_ratio = 0.4 #setting to 0 gives first model, and 1 gives second model
+ fp16 = False #perform operations in fp16. Saves memory, but CPU inference will not be possible.
+ always_output_fp16 = True #if true, will output fp16 even if operating in fp32
+ max_shard_size = "10000MiB" #set output shard size
+ force_cpu = True #only use cpu
+ load_sharded = True #load both models shard by shard
+
+ print(f"Blend Ratio set to: {Fore.GREEN}{blend_ratio}{Style.RESET_ALL}")
+ print(f"Operations in fp16 is: {Fore.GREEN}{fp16}{Style.RESET_ALL}")
+ print(f"Save Result in fp16: {Fore.GREEN}{always_output_fp16}{Style.RESET_ALL}")
+ print(f"CPU RAM Only: {Fore.GREEN}{force_cpu}{Style.RESET_ALL}{newline}")
+
+ #test generation settings, only for fp32
+ deterministic_test = True #determines if outputs are always the same
+ test_prompt = "" #test prompt for generation. only for fp32. set to empty string to skip generating.
+ test_max_length = 32 #test generation length
+
+
+ blend_ratio_b = 1.0 - blend_ratio
+
+ def get_model_info(model):
+     with torch.no_grad():
+         outfo = ""
+         cntent = 0
+         outfo += "\n==============================\n"
+         for name, para in model.named_parameters():
+             cntent += 1
+             outfo += ('{}: {}'.format(name, para.shape))+"\n"
+         outfo += ("Num Entries: " + str(cntent))+"\n"
+         outfo += ("==============================\n")
+         return outfo
+
+ def merge_models(model1,model2):
+     with torch.no_grad():
+         tensornum = 0
+         for p1, p2 in zip(model1.parameters(), model2.parameters()):
+             p1 *= blend_ratio
+             p2 *= blend_ratio_b
+             p1 += p2
+             tensornum += 1
+             print("Merging tensor "+str(tensornum))
+             pass
+
+ def read_index_filenames(sourcedir):
+     index = json.load(open(sourcedir + '/pytorch_model.bin.index.json','rt'))
+     fl = []
+     for k,v in index['weight_map'].items():
+         if v not in fl:
+             fl.append(v)
+     return fl
+
+ print("Opening file dialog, please select FIRST model directory...")
+ model_path1 = "Gryphe/MythoMax-L2-13b"
+ print(f"First Model is: {model_path1}")
+ print("Opening file dialog, please select SECOND model directory...")
+ model_path2 = "KoboldAI/LLAMA2-13B-Holodeck-1"
+ print(f"Second Model is: {model_path2}")
+ print("Opening file dialog, please select OUTPUT model directory...")
+ model_path3 = askdirectory(title="Select Output Directory of merged model")
+ print(f"Merged Save Directory is: {model_path3}{newline}")
+ if not model_path1 or not model_path2:
+     print("\nYou must select two directories containing models to merge and one output directory. Exiting.")
+     exit()
+
+ with torch.no_grad():
+     if fp16:
+         torch.set_default_dtype(torch.float16)
+     else:
+         torch.set_default_dtype(torch.float32)
+
+     device = torch.device("cuda") if (torch.cuda.is_available() and not force_cpu) else torch.device("cpu")
+     print(device)
+
+     print("Loading Model 1...")
+     model1 = AutoModelForCausalLM.from_pretrained(model_path1) #,torch_dtype=torch.float16
+     model1 = model1.to(device)
+     model1.eval()
+     print("Model 1 Loaded. Dtype: " + str(model1.dtype))
+     print("Loading Model 2...")
+     model2 = AutoModelForCausalLM.from_pretrained(model_path2) #,torch_dtype=torch.float16
+     model2 = model2.to(device)
+     model2.eval()
+     print("Model 2 Loaded. Dtype: " + str(model2.dtype))
+
+     # Saving for posterity reasons, handy for troubleshooting if model result is broken
+     # #ensure both models have the exact same layout
+     # m1_info = get_model_info(model1)
+     # m2_info = get_model_info(model2)
+     # if m1_info != m2_info:
+     #     print("Model 1 Info: " + m1_info)
+     #     print("Model 2 Info: " + m2_info)
+     #     print("\nERROR:\nThe two selected models are not compatible! They must have identical structure!")
+     #     exit()
+
+     print("Merging models...")
+     merge_models(model1,model2)
+
+     if model_path3:
+         print("Saving new model...")
+         if always_output_fp16 and not fp16:
+             model1.half()
+         model1.save_pretrained(model_path3, max_shard_size=max_shard_size)
+         print("\nSaved to: " + model_path3)
+         print("\nCopying files to: " + model_path3)
+         files_to_copy = ["tokenizer.model", "special_tokens_map.json", "tokenizer_config.json", "vocab.json", "merges.txt"]
+         for filename in files_to_copy:
+             src_path = os.path.join(model_path1, filename)
+             dst_path = os.path.join(model_path3, filename)
+             try:
+                 shutil.copy2(src_path, dst_path)
+             except FileNotFoundError:
+                 print("\nFile " + filename + " not found in" + model_path1 + ". Skipping.")
+     else:
+         print("\nOutput model was not saved as no output path was selected.")
+     print("\nScript Completed.")
+ ```
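
The merge step in the script above is a per-tensor weighted average. The sketch below reproduces that arithmetic on plain Python lists instead of torch tensors, so it runs without the models. The names `blend`, `first_model`, and `second_model` are illustrative and do not appear in the script. As the script's `merge_models` is written, `blend_ratio` scales the first model's parameters and `1 - blend_ratio` scales the second's, and the sketch follows that convention.

```python
from typing import List


def blend(first: List[float], second: List[float], ratio: float) -> List[float]:
    """Return ratio * first + (1 - ratio) * second, element by element.

    Mirrors the in-place sequence `p1 *= ratio; p2 *= 1 - ratio; p1 += p2`
    but returns a new list instead of modifying its inputs.
    """
    if len(first) != len(second):
        raise ValueError("both inputs must have the same number of elements")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    return [ratio * a + (1.0 - ratio) * b for a, b in zip(first, second)]


# Hypothetical toy values standing in for one parameter tensor of each model.
first_model = [1.0, 2.0, 4.0]
second_model = [3.0, 0.0, 8.0]

# An even blend is the plain average of the two inputs.
print(blend(first_model, second_model, 0.5))  # [2.0, 1.0, 6.0]

# A ratio of 1.0 keeps only the first input; 0.0 keeps only the second.
print(blend(first_model, second_model, 1.0))  # [1.0, 2.0, 4.0]
print(blend(first_model, second_model, 0.0))  # [3.0, 0.0, 8.0]
```

Returning a new list keeps the inputs reusable, whereas the script updates the first model in place to avoid holding a third copy of the weights in memory.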