pranavajay committed on
Commit 9f86643
1 Parent(s): 1bf0836

Upload log.txt with huggingface_hub

Files changed (1)
  1. log.txt +1160 -0
log.txt ADDED
@@ -0,0 +1,1160 @@
+ Tensor Name: context_embedder.bias, Size: torch.Size([2150])
+ Tensor Name: context_embedder.weight, Size: torch.Size([2150, 4096])
+ Tensor Name: norm_out.linear.bias, Size: torch.Size([4300])
+ Tensor Name: norm_out.linear.weight, Size: torch.Size([4300, 3072])
+ Tensor Name: proj_out.bias, Size: torch.Size([44])
+ Tensor Name: proj_out.weight, Size: torch.Size([44, 3072])
+ Tensor Name: single_transformer_blocks.0.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.0.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.0.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.0.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.0.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.0.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.0.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.0.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.0.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.0.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.0.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.0.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.0.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.0.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.1.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.1.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.1.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.1.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.1.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.1.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.1.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.1.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.1.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.1.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.1.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.1.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.1.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.1.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.10.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.10.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.10.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.10.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.10.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.10.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.10.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.10.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.10.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.10.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.10.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.10.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.10.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.10.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.11.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.11.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.11.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.11.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.11.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.11.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.11.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.11.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.11.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.11.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.11.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.11.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.11.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.11.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.12.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.12.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.12.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.12.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.12.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.12.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.12.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.12.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.12.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.12.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.12.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.12.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.12.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.12.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.13.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.13.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.13.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.13.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.13.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.13.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.13.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.13.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.13.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.13.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.13.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.13.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.13.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.13.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.14.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.14.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.14.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.14.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.14.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.14.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.14.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.14.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.14.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.14.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.14.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.14.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.14.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.14.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.15.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.15.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.15.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.15.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.15.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.15.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.15.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.15.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.15.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.15.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.15.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.15.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.15.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.15.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.16.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.16.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.16.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.16.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.16.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.16.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.16.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.16.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.16.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.16.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.16.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.16.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.16.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.16.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.17.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.17.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.17.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.17.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.17.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.17.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.17.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.17.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.17.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.17.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.17.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.17.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.17.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.17.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.18.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.18.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.18.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.18.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.18.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.18.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.18.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.18.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.18.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.18.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.18.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.18.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.18.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.18.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.19.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.19.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.19.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.19.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.19.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.19.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.19.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.19.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.19.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.19.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.19.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.19.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.19.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.19.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.2.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.2.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.2.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.2.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.2.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.2.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.2.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.2.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.2.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.2.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.2.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.2.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.2.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.2.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.20.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.20.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.20.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.20.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.20.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.20.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.20.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.20.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.20.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.20.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.20.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.20.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.20.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.20.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.21.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.21.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.21.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.21.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.21.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.21.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.21.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.21.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.21.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.21.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.21.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.21.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.21.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.21.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.22.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.22.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.22.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.22.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.22.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.22.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.22.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.22.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.22.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.22.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.22.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.22.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.22.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.22.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.23.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.23.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.23.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.23.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.23.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.23.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.23.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.23.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.23.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.23.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.23.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.23.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.23.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.23.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.24.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.24.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.24.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.24.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.24.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.24.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.24.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.24.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.24.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.24.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.24.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.24.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.24.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.24.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.25.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.25.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.25.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.25.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.25.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.25.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.25.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.25.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.25.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.25.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.25.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.25.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.25.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.25.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.26.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.26.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.26.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.26.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.26.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.26.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.26.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.26.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.26.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.26.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.26.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.26.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.26.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.26.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.27.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.27.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.27.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.27.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.27.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.27.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.27.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.27.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.27.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.27.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.27.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.27.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.27.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.27.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.28.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.28.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.28.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.28.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.28.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.28.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.28.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.28.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.28.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.28.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.28.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.28.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.28.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.28.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.29.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.29.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.29.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.29.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.29.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.29.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.29.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.29.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.29.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.29.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.29.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.29.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.29.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.29.proj_out.weight, Size: torch.Size([2150, 15360])
329
+ Tensor Name: single_transformer_blocks.3.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.3.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.3.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.3.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.3.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.3.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.3.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.3.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.3.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.3.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.3.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.3.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.3.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.3.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.30.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.30.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.30.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.30.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.30.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.30.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.30.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.30.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.30.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.30.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.30.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.30.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.30.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.30.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.31.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.31.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.31.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.31.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.31.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.31.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.31.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.31.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.31.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.31.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.31.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.31.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.31.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.31.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.32.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.32.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.32.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.32.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.32.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.32.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.32.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.32.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.32.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.32.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.32.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.32.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.32.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.32.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.33.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.33.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.33.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.33.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.33.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.33.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.33.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.33.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.33.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.33.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.33.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.33.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.33.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.33.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.34.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.34.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.34.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.34.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.34.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.34.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.34.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.34.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.34.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.34.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.34.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.34.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.34.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.34.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.35.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.35.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.35.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.35.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.35.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.35.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.35.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.35.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.35.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.35.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.35.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.35.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.35.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.35.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.36.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.36.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.36.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.36.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.36.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.36.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.36.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.36.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.36.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.36.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.36.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.36.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.36.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.36.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.37.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.37.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.37.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.37.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.37.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.37.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.37.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.37.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.37.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.37.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.37.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.37.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.37.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.37.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.4.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.4.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.4.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.4.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.4.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.4.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.4.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.4.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.4.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.4.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.4.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.4.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.4.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.4.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.5.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.5.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.5.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.5.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.5.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.5.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.5.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.5.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.5.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.5.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.5.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.5.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.5.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.5.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.6.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.6.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.6.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.6.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.6.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.6.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.6.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.6.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.6.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.6.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.6.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.6.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.6.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.6.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.7.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.7.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.7.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.7.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.7.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.7.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.7.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.7.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.7.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.7.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.7.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.7.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.7.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.7.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.8.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.8.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.8.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.8.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.8.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.8.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.8.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.8.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.8.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.8.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.8.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.8.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.8.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.8.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: single_transformer_blocks.9.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.9.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: single_transformer_blocks.9.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.9.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.9.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.9.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.9.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.9.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: single_transformer_blocks.9.norm.linear.bias, Size: torch.Size([6451])
+ Tensor Name: single_transformer_blocks.9.norm.linear.weight, Size: torch.Size([6451, 3072])
+ Tensor Name: single_transformer_blocks.9.proj_mlp.bias, Size: torch.Size([8601])
+ Tensor Name: single_transformer_blocks.9.proj_mlp.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: single_transformer_blocks.9.proj_out.bias, Size: torch.Size([2150])
+ Tensor Name: single_transformer_blocks.9.proj_out.weight, Size: torch.Size([2150, 15360])
+ Tensor Name: time_text_embed.guidance_embedder.linear_1.bias, Size: torch.Size([2150])
+ Tensor Name: time_text_embed.guidance_embedder.linear_1.weight, Size: torch.Size([2150, 256])
+ Tensor Name: time_text_embed.guidance_embedder.linear_2.bias, Size: torch.Size([2150])
+ Tensor Name: time_text_embed.guidance_embedder.linear_2.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: time_text_embed.text_embedder.linear_1.bias, Size: torch.Size([2150])
+ Tensor Name: time_text_embed.text_embedder.linear_1.weight, Size: torch.Size([2150, 768])
+ Tensor Name: time_text_embed.text_embedder.linear_2.bias, Size: torch.Size([2150])
+ Tensor Name: time_text_embed.text_embedder.linear_2.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: time_text_embed.timestep_embedder.linear_1.bias, Size: torch.Size([2150])
+ Tensor Name: time_text_embed.timestep_embedder.linear_1.weight, Size: torch.Size([2150, 256])
+ Tensor Name: time_text_embed.timestep_embedder.linear_2.bias, Size: torch.Size([2150])
+ Tensor Name: time_text_embed.timestep_embedder.linear_2.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.0.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.0.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.0.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.0.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.0.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.0.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.0.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.0.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.0.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.0.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.0.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.0.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.0.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.0.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.0.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.0.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.0.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.0.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.0.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.0.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.0.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.0.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.0.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.1.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.1.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.1.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.1.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.1.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.1.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.1.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.1.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.1.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.1.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.1.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.1.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.1.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.1.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.1.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.1.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.1.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.1.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.1.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.1.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.1.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.1.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.1.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.10.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.10.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.10.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.10.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.10.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.10.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.10.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.10.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.10.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.10.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.10.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.10.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.10.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.10.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.10.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.10.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.10.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.10.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.10.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.10.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.10.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.10.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.10.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.11.attn.add_k_proj.bias, Size: torch.Size([2150])
648
+ Tensor Name: transformer_blocks.11.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
649
+ Tensor Name: transformer_blocks.11.attn.add_q_proj.bias, Size: torch.Size([2150])
650
+ Tensor Name: transformer_blocks.11.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
651
+ Tensor Name: transformer_blocks.11.attn.add_v_proj.bias, Size: torch.Size([2150])
652
+ Tensor Name: transformer_blocks.11.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
653
+ Tensor Name: transformer_blocks.11.attn.norm_added_k.weight, Size: torch.Size([89])
654
+ Tensor Name: transformer_blocks.11.attn.norm_added_q.weight, Size: torch.Size([89])
655
+ Tensor Name: transformer_blocks.11.attn.norm_k.weight, Size: torch.Size([89])
656
+ Tensor Name: transformer_blocks.11.attn.norm_q.weight, Size: torch.Size([89])
657
+ Tensor Name: transformer_blocks.11.attn.to_add_out.bias, Size: torch.Size([2150])
658
+ Tensor Name: transformer_blocks.11.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
659
+ Tensor Name: transformer_blocks.11.attn.to_k.bias, Size: torch.Size([2150])
660
+ Tensor Name: transformer_blocks.11.attn.to_k.weight, Size: torch.Size([2150, 3072])
661
+ Tensor Name: transformer_blocks.11.attn.to_out.0.bias, Size: torch.Size([2150])
662
+ Tensor Name: transformer_blocks.11.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
663
+ Tensor Name: transformer_blocks.11.attn.to_q.bias, Size: torch.Size([2150])
664
+ Tensor Name: transformer_blocks.11.attn.to_q.weight, Size: torch.Size([2150, 3072])
665
+ Tensor Name: transformer_blocks.11.attn.to_v.bias, Size: torch.Size([2150])
666
+ Tensor Name: transformer_blocks.11.attn.to_v.weight, Size: torch.Size([2150, 3072])
667
+ Tensor Name: transformer_blocks.11.ff.net.0.proj.bias, Size: torch.Size([8601])
668
+ Tensor Name: transformer_blocks.11.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
669
+ Tensor Name: transformer_blocks.11.ff.net.2.bias, Size: torch.Size([2150])
670
+ Tensor Name: transformer_blocks.11.ff.net.2.weight, Size: torch.Size([2150, 12288])
671
+ Tensor Name: transformer_blocks.11.ff_context.net.0.proj.bias, Size: torch.Size([8601])
672
+ Tensor Name: transformer_blocks.11.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.11.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.11.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.11.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.11.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.11.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.11.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.12.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.12.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.12.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.12.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.12.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.12.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.12.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.12.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.12.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.12.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.12.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.12.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.12.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.12.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.12.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.12.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.12.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.12.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.12.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.12.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.12.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.12.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.12.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.13.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.13.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.13.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.13.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.13.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.13.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.13.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.13.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.13.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.13.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.13.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.13.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.13.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.13.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.13.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.13.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.13.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.13.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.13.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.13.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.13.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.13.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.13.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.14.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.14.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.14.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.14.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.14.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.14.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.14.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.14.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.14.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.14.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.14.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.14.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.14.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.14.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.14.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.14.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.14.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.14.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.14.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.14.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.14.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.14.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.14.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.15.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.15.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.15.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.15.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.15.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.15.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.15.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.15.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.15.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.15.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.15.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.15.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.15.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.15.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.15.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.15.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.15.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.15.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.15.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.15.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.15.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.15.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.15.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.16.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.16.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.16.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.16.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.16.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.16.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.16.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.16.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.16.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.16.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.16.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.16.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.16.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.16.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.16.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.16.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.16.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.16.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.16.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.16.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.16.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.16.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.16.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.17.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.17.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.17.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.17.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.17.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.17.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.17.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.17.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.17.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.17.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.17.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.17.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.17.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.17.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.17.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.17.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.17.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.17.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.17.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.17.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.17.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.17.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.17.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.18.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.18.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.18.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.18.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.18.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.18.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.18.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.18.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.18.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.18.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.18.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.18.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.18.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.18.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.18.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.18.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.18.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.18.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.18.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.18.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.18.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.18.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.18.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.2.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.2.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.2.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.2.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.2.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.2.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.2.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.2.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.2.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.2.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.2.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.2.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.2.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.2.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.2.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.2.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.2.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.2.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.2.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.2.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.2.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.2.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.2.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.3.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.3.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.3.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.3.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.3.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.3.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.3.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.3.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.3.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.3.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.3.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.3.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.3.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.3.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.3.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.3.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.3.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.3.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.3.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.3.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.3.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.3.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.3.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.4.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.4.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.4.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.4.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.4.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.4.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.4.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.4.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.4.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.4.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.4.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.4.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.4.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.4.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.4.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.4.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.4.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.4.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.4.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.4.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.4.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.4.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.4.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.5.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.5.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.5.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.5.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.5.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.5.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.5.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.5.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.5.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.5.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.5.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.5.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.5.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.5.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.5.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.5.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.5.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.5.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.5.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.5.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.5.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.5.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.5.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.6.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.6.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.6.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.6.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.6.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.6.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.6.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.6.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.6.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.6.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.6.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.6.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.6.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.6.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.6.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.6.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.6.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.6.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.6.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.6.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.6.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.6.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.6.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.7.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.7.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.7.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.7.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.7.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.7.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.7.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.7.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.7.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.7.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.7.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.7.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.7.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.7.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.7.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.7.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.7.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.7.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.7.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.7.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.7.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.7.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.7.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.8.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.8.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.8.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.8.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.8.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.8.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.8.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.8.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.8.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.8.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.8.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.8.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.8.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.8.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.8.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.8.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.8.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.8.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.8.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.8.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.8.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.8.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.8.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.9.attn.add_k_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.attn.add_k_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.9.attn.add_q_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.attn.add_q_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.9.attn.add_v_proj.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.attn.add_v_proj.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.9.attn.norm_added_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.9.attn.norm_added_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.9.attn.norm_k.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.9.attn.norm_q.weight, Size: torch.Size([89])
+ Tensor Name: transformer_blocks.9.attn.to_add_out.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.attn.to_add_out.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.9.attn.to_k.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.attn.to_k.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.9.attn.to_out.0.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.attn.to_out.0.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.9.attn.to_q.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.attn.to_q.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.9.attn.to_v.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.attn.to_v.weight, Size: torch.Size([2150, 3072])
+ Tensor Name: transformer_blocks.9.ff.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.9.ff.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.9.ff.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.ff.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.9.ff_context.net.0.proj.bias, Size: torch.Size([8601])
+ Tensor Name: transformer_blocks.9.ff_context.net.0.proj.weight, Size: torch.Size([8601, 3072])
+ Tensor Name: transformer_blocks.9.ff_context.net.2.bias, Size: torch.Size([2150])
+ Tensor Name: transformer_blocks.9.ff_context.net.2.weight, Size: torch.Size([2150, 12288])
+ Tensor Name: transformer_blocks.9.norm1.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.9.norm1.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: transformer_blocks.9.norm1_context.linear.bias, Size: torch.Size([12902])
+ Tensor Name: transformer_blocks.9.norm1_context.linear.weight, Size: torch.Size([12902, 3072])
+ Tensor Name: x_embedder.bias, Size: torch.Size([2150])
+ Tensor Name: x_embedder.weight, Size: torch.Size([2150, 64])