Commit a886873 by TheDrummer (1 parent: f6eabc1): Update README.md
Files changed (1): README.md (+1, −1)
README.md CHANGED
@@ -91,7 +91,7 @@ WIP
 - The duplicated layers on all layer types (except one) are extra sensitive. `post_attention_layernorm` interestingly had some changes in the upscale's duplicated layers, unlike Cydonia where latter layers were completely unchanged.
 - The duplicated layers in `o_proj` are less sensitive for some reason.
 
-# Further Experiments
+# Further Experimentation
 Given how the duplicated layers seem to have a stabilizing effect, it begs the question: What if we duplicate only ONE layer? What about five layers?
 - Will fewer empty layers dampen the stabilizing effect?
 - Will the few empty layers get 'filled' quickly? Will the 600MB dataset be enough?
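For context on what the single-layer experiment could look like in practice: upscales of this kind are commonly built with mergekit's `passthrough` merge, where listing a `layer_range` twice duplicates those layers. Below is a minimal sketch of a one-layer duplication; the model name and layer indices are hypothetical placeholders, not the recipe actually used for this model.

```yaml
# Hypothetical mergekit config: duplicate ONE decoder layer (layer 19)
# of a 40-layer base model. Model name and indices are illustrative only.
slices:
  - sources:
      - model: org/base-model    # placeholder, not the real base
        layer_range: [0, 20]     # layers 0-19
  - sources:
      - model: org/base-model
        layer_range: [19, 20]    # layer 19 again: the duplicated layer
  - sources:
      - model: org/base-model
        layer_range: [20, 40]    # the remaining layers
merge_method: passthrough
dtype: bfloat16
```

Duplicating five layers instead would simply widen the repeated range (e.g. `layer_range: [15, 20]`), after which finetuning on the dataset would show whether fewer duplicated layers still produce the stabilizing effect described above.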