Commit d3a86e2 by TheDrummer (1 parent: 0cd2ae2)

Create README.md

  1. README.md +59 -0
README.md ADDED
@@ -0,0 +1,59 @@
What is the 39B upscale?

```yaml
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 41]
    model: unsloth/Mistral-Small-Instruct-2409
- sources:
  - layer_range: [19, 41]
    model: unsloth/Mistral-Small-Instruct-2409
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [19, 41]
    model: unsloth/Mistral-Small-Instruct-2409
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [41, 55]
    model: unsloth/Mistral-Small-Instruct-2409
```

- Layers 0 to 18 are original
- Layers 19 to 41 are duplicated, zeroed out, and inserted in the middle twice
- Layers 42 to 54 are original
- The `down_proj` and `o_proj` weights in the duplicated layers have been zeroed, so the model will require healing (further training) to 'unignore' the added layers
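
To illustrate why zeroing the output projections makes the added layers invisible, here is a minimal sketch of a toy residual sub-layer. It is not the Mistral architecture: `residual_block`, `inner_proj`, `out_proj`, and the tiny weights are hypothetical stand-ins, with `out_proj` playing the role of `o_proj` or `down_proj`. The point is only that when the last projection before the residual add is scaled by 0.0, the block returns its input unchanged.

```python
def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def residual_block(hidden, inner_proj, out_proj):
    """Toy residual sub-layer: hidden + out_proj @ relu(inner_proj @ hidden)."""
    inner = [max(0.0, v) for v in matvec(inner_proj, hidden)]
    update = matvec(out_proj, inner)
    return [h + u for h, u in zip(hidden, update)]


# Hypothetical toy weights, chosen only for illustration.
hidden = [0.5, -1.0, 2.0]
inner_proj = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]]   # maps 3 -> 2
out_proj = [[1.0, -2.0], [0.5, 0.5], [-1.5, 0.25]]  # maps 2 -> 3

# Mimic `scale: 0.0` applied to the output projection only.
scale = 0.0
zeroed_out_proj = [[scale * w for w in row] for row in out_proj]

changed = residual_block(hidden, inner_proj, out_proj)
unchanged = residual_block(hidden, inner_proj, zeroed_out_proj)

print("normal block:", changed)
print("zeroed block:", unchanged)
```

With the zeroed projection the residual update is all zeros, so the block passes `hidden` straight through. The other weights in the duplicated layers keep their original values (`value: 1.0`), which is presumably what gives healing something to start from.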

```
[    Unique    ] [   Duplicated   ] [    Unique    ]
0 ----------- 18 19 ------------ 41 42 ---------- 54
     34.5%             41.8%             23.7%
```
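
The percentages in the diagram can be approximately reproduced from the labelled layer spans. This is a small sketch that assumes the spans are inclusive and that the shares are taken over the 55 layers numbered 0 to 54; the dictionary keys are made-up labels. Computed this way, the last share rounds to 23.6 rather than 23.7, so the diagram's figure may have been nudged so the three values sum to 100.

```python
# Inclusive layer spans as labelled in the diagram (assumption: both ends count).
spans = {
    "unique_head": (0, 18),
    "duplicated": (19, 41),
    "unique_tail": (42, 54),
}

# Number of layers in each span and in total.
counts = {name: end - start + 1 for name, (start, end) in spans.items()}
total = sum(counts.values())

# Share of each span, as a percentage rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name in spans:
    print(f"{name}: {counts[name]} layers, {shares[name]}%")
```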

## Control Sample A (Nemo & Rocinante, similar training)
*Note the layer sequence and other labels here, since they will be unreadable in the 39B plots.*
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/EZN8Ci2_vAGmdq0WUyrpN.png)

## Control Sample B (Small & Cydonia, similar training)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/xdH_7fy9HuhSzaSE2-h4X.png)

## Tunguska 39B trained with One Epoch vs. its base
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/X3-bHyQg03-QvZFvOhGp7.png)

## Tunguska 39B trained with Two Epochs vs. its base
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/-dRSeXmPXdE3_g67iKT0K.png)

## Tunguska 39B: One Epoch vs. Two Epochs
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/cjKf37TrSJHmq0S0_PZyE.png)