KoboldAI/LLaMA2-13B-Holomax-GGML

Henk717 committed on Aug 17, 2023

Commit 3eea904 • 1 Parent(s): b7b7bc9

Update README.md

README.md +153 -5

README.md CHANGED

@@ -1,12 +1,10 @@
 ---
-license: llama2
 ---
-This is the GGML version of LaMA 2 Holomax 13B for use with [Koboldcpp](https://koboldai.org/cpp) for the original HF compatible release please check [KoboldAI/LLaMA2-13B-Holomax](https://huggingface.co/KoboldAI/LLaMA2-13B-Holomax)
 # LLaMA 2 Holomax 13B - The writers version of Mythomax
 This is an expansion merge to the well praised Mythomax model from Gryphe (60%) using MrSeeker's KoboldAI Holodeck model (40%)
-The goal of this model is to enhance story writing capabilities while preserving the desirable traits of the Mythomax model.
 Testers found that this model passes the InteracTV benchmark, was useful for story writing, chatting and text adventures using Instruction mode.
 Preservation of factual knowledge has not been tested since we expect the original to be better in those use cases as this merge was focussed on fiction.
@@ -38,4 +36,154 @@ Instruction goes here
 ### Response:
 ```
-But if you have a different preferred format that works on one of the models above it will likely still work.

 ---
+license: other
 ---
 # LLaMA 2 Holomax 13B - The writers version of Mythomax
 This is an expansion merge to the well praised Mythomax model from Gryphe (60%) using MrSeeker's KoboldAI Holodeck model (40%)
+The goal of this model is to enhance story writing capabilities while preserving the desirable traits of the Mythomax model as much as possible (It does limit chat reply length).
 Testers found that this model passes the InteracTV benchmark, was useful for story writing, chatting and text adventures using Instruction mode.
 Preservation of factual knowledge has not been tested since we expect the original to be better in those use cases as this merge was focussed on fiction.
 ### Response:
 ```
+But if you have a different preferred format that works on one of the models above it will likely still work.
+## License
+After publishing the model we were informed that one of the origin models upstream was uploaded under the AGPLv3. It is currently unknown what effect this has on this model, because all weights have been modified and none of the original weights are intact.
+At the moment of publishing (and of writing this message) both merged models, Holodeck and Mythomax, were licensed under Llama2; therefore the Llama2 license applies to this model.
+However, Holodeck contains a non-commercial clause and may only be used for research or private use, while Limarp is licensed under the AGPLv3.
+The AGPLv3 conflicts with the commercial usage restrictions of the Llama2 license; therefore we assume this aspect does not apply and that the authors intended for commercial usage restrictions to be permitted.
+As a result we have decided to leave the model available for public download on the assumption that all involved authors intend for it to be licensed with commercial restrictions / llama2 restrictions in place, but with the further rights and freedoms the AGPLv3 grants a user.
+If HF informs us that this assumption is incorrect and requests us to take this model down, we will republish the model in the form of the original merging script that was used to create the end result.
+To comply with the AGPLv3 aspect the "source" of this model is as follows (Because this model is made on a binary level, we can only provide the script that created the model):
+```
+import json
+import os
+import shutil
+import subprocess
+from tkinter.filedialog import askdirectory, askopenfilename
+import torch
+from colorama import Fore, Style, init
+from transformers import (AutoModel, AutoModelForCausalLM, AutoTokenizer,
+                          LlamaConfig, LlamaForCausalLM, LlamaTokenizer,
+                          PreTrainedTokenizer, PreTrainedTokenizerFast)
+newline = '\n'
+def clear_console():
+    if os.name == "nt":  # For Windows
+        subprocess.call("cls", shell=True)
+    else:  # For Linux and macOS
+        subprocess.call("clear", shell=True)
+clear_console()
+print(f"{Fore.YELLOW}Starting script, please wait...{Style.RESET_ALL}")
+#mixer output settings
+blend_ratio = 0.4           #weight of the first model; the second model gets 1 - blend_ratio (1 keeps only the first model, 0 only the second)
+fp16 = False                 #perform operations in fp16. Saves memory, but CPU inference will not be possible.
+always_output_fp16 = True   #if true, will output fp16 even if operating in fp32
+max_shard_size = "10000MiB"  #set output shard size
+force_cpu = True            #only use cpu
+load_sharded = True         #load both models shard by shard
+print(f"Blend Ratio set to: {Fore.GREEN}{blend_ratio}{Style.RESET_ALL}")
+print(f"Operations in fp16 is: {Fore.GREEN}{fp16}{Style.RESET_ALL}")
+print(f"Save Result in fp16: {Fore.GREEN}{always_output_fp16}{Style.RESET_ALL}")
+print(f"CPU RAM Only: {Fore.GREEN}{force_cpu}{Style.RESET_ALL}{newline}")
+#test generation settings, only for fp32
+deterministic_test = True   #determines if outputs are always the same
+test_prompt = ""    #test prompt for generation. only for fp32. set to empty string to skip generating.
+test_max_length = 32        #test generation length
+blend_ratio_b = 1.0 - blend_ratio
+def get_model_info(model):
+    with torch.no_grad():
+        outfo = ""
+        cntent = 0
+        outfo += "\n==============================\n"
+        for name, para in model.named_parameters():
+            cntent += 1
+            outfo += ('{}: {}'.format(name, para.shape))+"\n"
+        outfo += ("Num Entries: " + str(cntent))+"\n"
+        outfo += ("==============================\n")
+        return outfo
+def merge_models(model1,model2):
+    with torch.no_grad():
+        tensornum = 0
+        for p1, p2 in zip(model1.parameters(), model2.parameters()):
+            p1 *= blend_ratio
+            p2 *= blend_ratio_b
+            p1 += p2
+            tensornum += 1
+            print("Merging tensor "+str(tensornum))
+def read_index_filenames(sourcedir):
+    index = json.load(open(sourcedir + '/pytorch_model.bin.index.json','rt'))
+    fl = []
+    for k,v in index['weight_map'].items():
+        if v not in fl:
+            fl.append(v)
+    return fl
+print("Opening file dialog, please select FIRST model directory...")
+model_path1 = "Gryphe/MythoMax-L2-13b"
+print(f"First Model is: {model_path1}")
+print("Opening file dialog, please select SECOND model directory...")
+model_path2 = "KoboldAI/LLAMA2-13B-Holodeck-1"
+print(f"Second Model is: {model_path2}")
+print("Opening file dialog, please select OUTPUT model directory...")
+model_path3 = askdirectory(title="Select Output Directory of merged model")
+print(f"Merged Save Directory is: {model_path3}{newline}")
+if not model_path1 or not model_path2:
+    print("\nYou must select two directories containing models to merge and one output directory. Exiting.")
+    exit()
+with torch.no_grad():
+    if fp16:
+        torch.set_default_dtype(torch.float16)
+    else:
+        torch.set_default_dtype(torch.float32)
+    device = torch.device("cuda") if (torch.cuda.is_available() and not force_cpu) else torch.device("cpu")
+    print(device)
+    print("Loading Model 1...")
+    model1 = AutoModelForCausalLM.from_pretrained(model_path1) #,torch_dtype=torch.float16
+    model1 = model1.to(device)
+    model1.eval()
+    print("Model 1 Loaded. Dtype: " + str(model1.dtype))
+    print("Loading Model 2...")
+    model2 = AutoModelForCausalLM.from_pretrained(model_path2) #,torch_dtype=torch.float16
+    model2 = model2.to(device)
+    model2.eval()
+    print("Model 2 Loaded. Dtype: " + str(model2.dtype))
+#   Saving for posterity reasons, handy for troubleshooting if model result is broken
+#    #ensure both models have the exact same layout
+#    m1_info = get_model_info(model1)
+#    m2_info = get_model_info(model2)
+#    if m1_info != m2_info:
+#        print("Model 1 Info: " + m1_info)
+#        print("Model 2 Info: " + m2_info)
+#        print("\nERROR:\nThe two selected models are not compatible! They must have identical structure!")
+#        exit()
+    print("Merging models...")
+    merge_models(model1,model2)
+    if model_path3:
+        print("Saving new model...")
+        if always_output_fp16 and not fp16:
+            model1.half()
+        model1.save_pretrained(model_path3, max_shard_size=max_shard_size)
+        print("\nSaved to: " + model_path3)
+        print("\nCopying files to: " + model_path3)
+        files_to_copy = ["tokenizer.model", "special_tokens_map.json", "tokenizer_config.json", "vocab.json", "merges.txt"]
+        for filename in files_to_copy:
+            src_path = os.path.join(model_path1, filename)
+            dst_path = os.path.join(model_path3, filename)
+            try:
+                shutil.copy2(src_path, dst_path)
+            except FileNotFoundError:
+                print("\nFile " + filename + " not found in " + model_path1 + ". Skipping.")
+    else:
+        print("\nOutput model was not saved as no output path was selected.")
+    print("\nScript Completed.")
+```
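
The arithmetic inside `merge_models` can be checked in isolation. The following is a minimal sketch, not part of the commit: `blend_parameters` is a hypothetical helper, and plain Python lists stand in for the model tensors. It mirrors the loop as written in the script, which scales the first model's parameters by `blend_ratio`, scales the second model's parameters by `1 - blend_ratio`, and then adds them.

```python
def blend_parameters(first, second, blend_ratio):
    """Element-wise blend_ratio * first + (1 - blend_ratio) * second.

    Hypothetical helper for illustration only; it mirrors the in-place
    update in merge_models but returns a new list instead.
    """
    if len(first) != len(second):
        raise ValueError("parameter lists must have the same length")
    second_weight = 1.0 - blend_ratio  # same role as blend_ratio_b in the script
    return [blend_ratio * a + second_weight * b for a, b in zip(first, second)]


# Stand-in values chosen so that each model's share is easy to read off.
first_model = [1.0, 1.0, 1.0]   # plays the role of the first model's weights
second_model = [0.0, 0.0, 0.0]  # plays the role of the second model's weights

merged = blend_parameters(first_model, second_model, blend_ratio=0.4)
print(merged)  # each entry equals the share contributed by the first model
```

With these stand-in values every merged entry is simply the first model's share, so a setting of `blend_ratio = 0.4` can be compared directly against the mix that was intended.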