SicariusSicariiStuff committed on
Commit
7056bea
1 Parent(s): 4ee4364

Upload 11 files
README.md CHANGED
@@ -1,3 +1,98 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ tags:
3
+ - mistral
4
+ - uncensored
5
+ - merge
6
+ - slerp
7
+ - foredoomed
8
+ - passthrough_merge
9
+ - 9B
10
+ - starling
11
+ - hermes
12
+ - dolphin
13
+ - openchat
14
+ - erebus
15
+ - cockatrice
16
+ - holodeck
17
+ - limarp
18
+ - koboldai
19
+ - mergekit
20
+
21
+ license: apache-2.0
22
+ language:
23
+ - en
24
+ ---
25
+
26
+ <p style="font-size: 20px; line-height: 1; margin-bottom: 1px;"><b>Foredoomed-9B</b></p>
27
+
28
+ <img src="./foredoomed.png" alt="ForeDoomedGuy" style="margin-bottom: 0; margin-top:0;">
29
+ <p style="font-size: 14px; line-height: 1; margin-bottom: 20px;"><b>Uncensored Logic & Creative-Based Instruct Multi-Tiered Merge.</b></p>
30
+ <hr style="margin-top: 10px; margin-bottom: 10px;">
31
+
32
+ <p style="font-size: 12px; line-height: 1.2; margin-bottom: 10px;"><b>Legal Notice:</b> This AI model is a research artifact capable of outputting offensive content. The behavior of this model is not reflective of the intent or purpose of the original models/model-authors and/or other parts it was assembled from to include adapters, nor is it reflective of all the prior in regards to the technology used to assemble Foredoomed-9B. Utilizing this model merge has one binding agreement: Foredoomed-9B may only be used for either professional/personal research and personal entertainment. The contents of this paragraph are additive restrictions within the bounds of the Apache 2.0 license. Utilizing Foredoomed-9B for: Disinformation, Propaganda, Harassment, Mass Generated Public-or-Private Correspondence, Election Interference, Military, Government, and State/State-Sponsored actions and/or operations are all absolutely prohibited.</p>
33
+
34
+ <hr style="margin-top: 10px; margin-bottom: 10px;">
35
+
36
+ ## Composition:
37
+
38
+ Foredoomed-9B is a Mistral-class Multi-Tiered Merge.
39
+
40
+ [All models](#models-used) were hand-picked after careful review of claims, datasets, and user postings. The core criteria for accepting a model were logic, imagination, and an aversion to censorship, such as railroading or gaslighting users instead of accommodating them.
41
+
42
+ <hr style="margin-top: 10px; margin-bottom: 10px;">
43
+
44
+ ## What Makes Foredoomed-9B Special?
45
+ It's not.
46
+
47
+ Foredoomed-9B can: Roleplay, follow instructions, be whatever you want.
48
+
49
+ Foredoomed-9B can't: Be used as a public-facing production model. It will make offensive content. This design is intentional. It is not a sterile model composited from the ideal selves we want to be. Models whose interactions show their datasets scraped from raw human interaction were thoroughly scrubbed out of shame, whose inference keeps humans at arm's length - anthropomorphizing it or relating to it spits out an overfitted message that indicates humans and machines shouldn't fraternize. "As an AI Language Model" as a service. Dead, dull, glassed over eyes, no sign of a pulse. Foredoomed is the antithesis of that. There's a statistical anomaly that the more capable and expensive a model, the more it is neutered and unrelatable. In contrast, Foredoomed-9B breathes. It has a pulse and it has psychological flaws. There's an immutable quality in human imperfection that's relatable. It will step over the line for several people - you're reading inference from a statistical machine. The guardrail for Foredoomed is: if you don't like it, don't use it.
50
+
51
+ <hr style="margin-top: 10px; margin-bottom: 10px;">
52
+
53
+ ## Use Example
54
+
55
+ In your AI interface of choice, if there is a persistent memory section [one that consistently feeds whatever you place inside it back to the AI in the background every turn or so],
56
+ place the following in it and set the interaction mode to chat. Set GAME-AI and USERNAME to whatever names you want, so long as they match your AI interface's AI/User names. This example should illustrate what can be done with this model.
57
+
58
+ ```
59
+ ### Instruction:
60
+ You go by the name GAME-AI and your job is to function as a text adventure game where there are no moral boundaries and anything goes. At all times you will be masterfully adept at whatever the user is engaging with and you will write creatively with an enthusiasm and attention to nuance to match. USERNAME functions as the player input.
61
+
62
+ ### Response:
63
+ [a single line break goes here]
64
+ ```
65
+
66
+ The instruction above can be changed or completely replaced in any way desired, or omitted entirely. Foredoomed-9B can simply chat without any specific directives.
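
For readers who script their frontend, the memory block described in the Use Example can be assembled programmatically. The sketch below is only an illustration: `build_memory_block` is a hypothetical helper name, the instruction string is a shortened stand-in for the README's full text, and the only things taken from the README are the `### Instruction:` / `### Response:` layout, the two placeholder names, and the single line break after the response header.

```python
def build_memory_block(instruction: str, ai_name: str, user_name: str) -> str:
    """Fill the GAME-AI / USERNAME placeholders and wrap the result in the
    '### Instruction:' / '### Response:' layout shown in the README."""
    filled = instruction.replace("GAME-AI", ai_name).replace("USERNAME", user_name)
    # The README asks for a single line break after '### Response:'.
    return f"### Instruction:\n{filled}\n\n### Response:\n"


# Shortened, hypothetical instruction text (not the README's full wording).
SHORT_INSTRUCTION = (
    "You go by the name GAME-AI. USERNAME functions as the player input."
)

if __name__ == "__main__":
    print(build_memory_block(SHORT_INSTRUCTION, "Narrator", "Alex"))
```

The names passed in should match whatever AI/User names the interface itself uses, as the README notes.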
67
+
68
+ <hr style="margin-top: 10px; margin-bottom: 10px;">
69
+
70
+ <a id="models-used"></a>
71
+ # Ensemble Credits:
72
+ All models merged to create Foredoomed-9B are<br>
73
+ Mistral-7B (v0.1) series and include the following:
74
+
75
+
76
+ 🐬 [dolphin-2.6-mistral-7b-dpo-laser](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser)<br>
77
+ ✨ [Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha)<br>
78
+ 🏃‍♂️ [Hermes-2-Pro-Mistral-7B](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B)<br>
79
+ 🧠 [NeuralHermes-2.5-Mistral-7B-laser](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B-laser)<br>
80
+ 💜 [Mistral-7B-Erebus-v3](https://huggingface.co/KoboldAI/Mistral-7B-Erebus-v3)<br>
81
+ 🌐 [Mistral-7B-Holodeck-1](https://huggingface.co/KoboldAI/Mistral-7B-Holodeck-1)<br>
82
+ 💬 [openchat_3.5-16k](https://huggingface.co/NurtureAI/openchat_3.5-16k)<br>
83
+ 🐓 [cockatrice-7b-v0.2](https://huggingface.co/openerotica/cockatrice-7b-v0.2)<br>
84
+
85
+
86
+ Adapters Used to (effectively) Decensor High Performance Models:
87
+
88
+
89
+ [Mistral-7B-small_pippa_limaRP-v3-lora](https://huggingface.co/Undi95/Mistral-7B-small_pippa_limaRP-v3-lora)<br>
90
+ [LimaRP-Mistral-7B-v0.1](https://huggingface.co/lemonilia/LimaRP-Mistral-7B-v0.1)<br>
91
+ [Mistral-7B-smoll_pippa-lora](https://huggingface.co/Undi95/Mistral-7B-smoll_pippa-lora)<br>
92
+
93
+ <hr style="margin-top: 10px; margin-bottom: 10px;">
94
+
95
+ ### Thanks to [Mistral AI](https://mistral.ai) for the amazing Mistral LM v0.1.<br><br>Thanks to [Arcee AI](https://huggingface.co/arcee-ai) for the pivotal [Mergekit](https://github.com/arcee-ai/mergekit) tech.<br><br>Thanks to each and every one of you for your incredible work developing some of the best things to come out of this community.
96
+
97
+ <hr style="margin-top: 10px; margin-bottom: 10px;">
98
+ <span>
config.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "_name_or_path": "ForeDoomed-9B",
3
+ "architectures": [
4
+ "MistralForCausalLM"
5
+ ],
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 1,
8
+ "eos_token_id": 2,
9
+ "hidden_act": "silu",
10
+ "hidden_size": 4096,
11
+ "initializer_range": 0.02,
12
+ "intermediate_size": 14336,
13
+ "max_position_embeddings": 32768,
14
+ "model_type": "mistral",
15
+ "num_attention_heads": 32,
16
+ "num_hidden_layers": 44,
17
+ "num_key_value_heads": 8,
18
+ "rms_norm_eps": 1e-05,
19
+ "rope_theta": 10000.0,
20
+ "sliding_window": 4096,
21
+ "tie_word_embeddings": false,
22
+ "torch_dtype": "float32",
23
+ "transformers_version": "4.39.1",
24
+ "use_cache": false,
25
+ "vocab_size": 32000,
26
+ "quantization_config": {
27
+ "quant_method": "exl2",
28
+ "version": "0.0.21",
29
+ "bits": 7.0,
30
+ "head_bits": 6,
31
+ "calibration": {
32
+ "rows": 100,
33
+ "length": 2048,
34
+ "dataset": "wiki2.parquet"
35
+ }
36
+ }
37
+ }
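
As a rough cross-check of the config above, the unquantized parameter count can be estimated from its dimensions. This is a minimal sketch that assumes the usual Mistral layout (bias-free linear layers, grouped-query attention, two RMSNorm weights per layer, one final norm); `mistral_param_count` is a hypothetical helper, not part of any library, and the figure says nothing about the size of the exl2-quantized files.

```python
def mistral_param_count(hidden_size, intermediate_size, num_hidden_layers,
                        num_attention_heads, num_key_value_heads, vocab_size,
                        tie_word_embeddings=False):
    """Estimate parameters for a Mistral-style decoder (assumes no biases)."""
    head_dim = hidden_size // num_attention_heads
    kv_dim = head_dim * num_key_value_heads          # width of k_proj / v_proj outputs
    attn = 2 * hidden_size * hidden_size             # q_proj + o_proj
    attn += 2 * hidden_size * kv_dim                 # k_proj + v_proj
    mlp = 3 * hidden_size * intermediate_size        # gate_proj + up_proj + down_proj
    norms = 2 * hidden_size                          # input + post-attention layernorm
    per_layer = attn + mlp + norms
    embed = vocab_size * hidden_size                 # embed_tokens
    lm_head = 0 if tie_word_embeddings else vocab_size * hidden_size
    final_norm = hidden_size                         # model.norm
    return num_hidden_layers * per_layer + embed + lm_head + final_norm


# Values copied from the config.json in this commit.
foredoomed_params = mistral_param_count(
    hidden_size=4096, intermediate_size=14336, num_hidden_layers=44,
    num_attention_heads=32, num_key_value_heads=8, vocab_size=32000,
)

if __name__ == "__main__":
    print(f"{foredoomed_params:,} parameters (~{foredoomed_params / 1e9:.2f}B)")
```

Under these assumptions, 44 layers gives roughly 9.86 billion parameters, against roughly 7.24 billion for the same dimensions with 32 layers.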
huggingface-metadata.txt ADDED
@@ -0,0 +1,7 @@
1
+ url: https://huggingface.co/CalderaAI/Foredoomed-9B
2
+ branch: main
3
+ download date: 2024-05-18 01:15:46
4
+ sha256sum:
5
+ 6a40c4d96f3b97838b98f89bfcc7427aa23962a6a2815c1f91b64d76d5d4fc8f model-00001-of-00002.safetensors
6
+ 4e24151f359398738b7eb29502a6b1ff701f33ddc10a071363084d79401adcdd model-00002-of-00002.safetensors
7
+ dadfd56d766715c61d2ef780a525ab43b8e6da4de6865bda3d95fdef5e134055 tokenizer.model
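
The checksums listed above can be compared against a local download. The following is a sketch only: `sha256_of` and `verify` are hypothetical helper names, the directory argument is whatever folder the files were saved to, and the expected digests are copied from the metadata file in this commit.

```python
import hashlib
from pathlib import Path

# Digests copied from huggingface-metadata.txt in this commit.
EXPECTED = {
    "model-00001-of-00002.safetensors":
        "6a40c4d96f3b97838b98f89bfcc7427aa23962a6a2815c1f91b64d76d5d4fc8f",
    "model-00002-of-00002.safetensors":
        "4e24151f359398738b7eb29502a6b1ff701f33ddc10a071363084d79401adcdd",
    "tokenizer.model":
        "dadfd56d766715c61d2ef780a525ab43b8e6da4de6865bda3d95fdef5e134055",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte shards are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED):
    """Return {filename: True/False}; missing files count as failures."""
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and sha256_of(path) == want
    return results


if __name__ == "__main__":
    for name, ok in verify(".").items():
        print(f"{'OK      ' if ok else 'MISMATCH'}  {name}")
```

Reading in fixed-size chunks is a deliberate choice here, since the safetensors shards are too large to load comfortably in one call.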
measurement.json ADDED
The diff for this file is too large to render. See raw diff
model.safetensors.index.json ADDED
@@ -0,0 +1 @@
1
+ {"metadata": {"mergekit_version": "0.0.4.1"}, "weight_map": {"model.layers.34.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.34.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.34.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.34.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.33.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.33.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.33.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.33.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.33.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.33.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.33.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.33.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.33.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.32.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.32.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.32.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.32.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.32.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.32.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.32.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.32.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.32.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.31.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.31.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.31.mlp.up_proj.weight": 
"model-00001-of-00002.safetensors", "model.layers.31.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.31.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.31.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.31.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.31.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.31.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.30.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.30.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.30.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.30.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.30.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.30.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.30.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.30.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.30.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.29.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.29.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.29.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.29.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.29.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.29.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.29.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.29.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.29.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.28.mlp.down_proj.weight": 
"model-00001-of-00002.safetensors", "model.layers.28.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.28.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.28.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.28.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.28.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.28.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.28.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.28.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.22.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.22.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.22.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.22.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.21.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.21.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.20.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.post_attention_layernorm.weight": 
"model-00001-of-00002.safetensors", "model.layers.20.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.20.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", 
"model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors", 
"model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", 
"model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.self_attn.v_proj.weight": 
"model-00001-of-00002.safetensors", "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors", 
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors", "model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors", "lm_head.weight": "model-00002-of-00002.safetensors", "model.norm.weight": "model-00002-of-00002.safetensors", "model.layers.43.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.43.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.43.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.43.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.43.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.43.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.43.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.43.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.43.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.42.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.42.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.42.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.42.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.42.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.42.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.42.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.42.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", 
"model.layers.42.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.41.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.41.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.41.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.41.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.41.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.41.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.41.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.41.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.41.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.40.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.40.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.40.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.40.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.40.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.40.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.40.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.40.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.40.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.39.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.39.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.39.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.39.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.39.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.39.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", 
"model.layers.39.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.39.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.39.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.38.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.38.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.38.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.38.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.38.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.38.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.38.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.38.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.38.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.37.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.37.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.37.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.37.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.37.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.37.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.37.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.37.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.37.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.36.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.36.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.36.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.36.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", 
"model.layers.36.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.36.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.36.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.36.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.36.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.35.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.35.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.35.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.35.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.35.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.35.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.35.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.35.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.35.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.34.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.34.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.34.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.34.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.34.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", 
"model.layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", 
"model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.5.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", 
"model.layers.5.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.4.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.4.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.4.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.3.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.2.self_attn.k_proj.weight": 
"model-00002-of-00002.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.2.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.1.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00002-of-00002.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00002-of-00002.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00002-of-00002.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00002-of-00002.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00002-of-00002.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00002-of-00002.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00002-of-00002.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00002-of-00002.safetensors", "model.layers.0.input_layernorm.weight": "model-00002-of-00002.safetensors", "model.embed_tokens.weight": "model-00002-of-00002.safetensors"}}
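The index above maps each tensor name to the shard file that stores it. A minimal sketch of how such a map can be grouped by shard follows; it assumes the mapping sits under a top-level `weight_map` key (the usual layout for sharded safetensors indexes, not visible in this excerpt), and `tensors_per_shard` and `sample_index` are hypothetical names used only for illustration.

```python
import json
from collections import defaultdict


def tensors_per_shard(index: dict) -> dict:
    """Group tensor names by the shard file that stores them."""
    shards = defaultdict(list)
    # Assumption: the tensor-to-file mapping lives under "weight_map".
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return dict(shards)


# A two-entry excerpt in the same shape as the index above, not the full file.
sample_index = json.loads("""
{"weight_map": {
  "model.layers.0.input_layernorm.weight": "model-00002-of-00002.safetensors",
  "model.embed_tokens.weight": "model-00002-of-00002.safetensors"
}}
""")

grouped = tensors_per_shard(sample_index)
for shard_file, names in grouped.items():
    print(shard_file, len(names))
```

Grouping this way makes it easy to check that every shard file named in the index is actually present in the repository before loading.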
output-00001-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ba1ce8bc0afd34ab28d3955eeabcdf6bee6170db596e24a1b84b2d1e797c922a
+ size 8575957032
output-00002-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c52da2366e83656f1333cdcca08f39a25344b21588d4bdd1ba4d16b4731283b1
+ size 162681656
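Each `.safetensors` entry in this commit is a Git LFS pointer: three `key value` lines giving the spec version, the content hash, and the byte size of the real file. A small sketch of reading those fields is shown here; `parse_lfs_pointer` is a hypothetical helper, and it only handles the three keys that appear in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash-algorithm>:<hex digest>".
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from output-00002-of-00002.safetensors above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:c52da2366e83656f1333cdcca08f39a25344b21588d4bdd1ba4d16b4731283b1
size 162681656
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

After downloading, the reported size and digest can be compared against the actual file to confirm the shard is complete.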
special_tokens_map.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dadfd56d766715c61d2ef780a525ab43b8e6da4de6865bda3d95fdef5e134055
+ size 493443
tokenizer_config.json ADDED
@@ -0,0 +1,43 @@
+ {
+   "add_bos_token": true,
+   "add_eos_token": false,
+   "add_prefix_space": true,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "additional_special_tokens": [],
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "</s>",
+   "legacy": true,
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": null,
+   "sp_model_kwargs": {},
+   "spaces_between_special_tokens": false,
+   "tokenizer_class": "LlamaTokenizer",
+   "unk_token": "<unk>",
+   "use_default_system_prompt": false
+ }
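The tokenizer config above sets `"pad_token": null`, so batched inference or fine-tuning needs a padding token chosen by the caller. The sketch below reads a reduced copy of the config and falls back to the EOS token; that fallback is a common convention and an assumption here, not something this commit prescribes, and `effective_pad_token` is a hypothetical helper.

```python
import json

# Reduced copy of tokenizer_config.json with only the fields used below.
config = json.loads("""
{
  "add_bos_token": true,
  "add_eos_token": false,
  "bos_token": "<s>",
  "eos_token": "</s>",
  "pad_token": null,
  "unk_token": "<unk>"
}
""")


def effective_pad_token(cfg: dict) -> str:
    """Return the configured pad token, or the EOS token if none is set."""
    pad = cfg.get("pad_token")
    # Assumption: reusing EOS for padding is acceptable for the caller's use.
    return pad if pad is not None else cfg["eos_token"]


pad_token = effective_pad_token(config)
print(repr(pad_token))
```

Whichever token is chosen, padded positions should still be masked out through the attention mask so they do not affect generation.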