TheDrummer committed
Commit 6d67d36 • 1 Parent(s): 488e7d0
Update README.md
README.md CHANGED
@@ -126,4 +126,5 @@ Given how the duplicated layers seem to have a stabilizing effect, it begs the q
 
 ### Can you replicate this effect on normal models by freezing layers?
 
-### We've so far hypothesized that training 'slowly fills' the duplicated layers. If we intentionally undercook, will the duplicated layers look *underfilled* or can you fill it up with a few steps? In other words,
+### We've so far hypothesized that training 'slowly fills' the duplicated layers. If we intentionally undercook, will the duplicated layers look *underfilled* or can you fill it up with a few steps? In other words, can a single/few updates to the model reconnect the duplicated layers?
+- Are we really repairing the 'neurons' step-by-step, or have they been significantly rearranged by the first (few?) steps?
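
As a rough illustration of the "freezing layers" question above, here is a minimal PyTorch sketch of freezing a contiguous block of decoder layers in a normal model before fine-tuning, so only the remaining layers receive gradient updates. The model name and layer indices are placeholders for illustration, not taken from this repo.

```python
# Hypothetical sketch: freeze a block of decoder layers in a normal model
# before fine-tuning, to see whether it mimics the stabilizing effect that
# duplicated layers appear to have. Model name and layer range are placeholders.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # placeholder model
    torch_dtype=torch.bfloat16,
)

FROZEN = range(16, 24)  # hypothetical block of layers to freeze

for idx, layer in enumerate(model.model.layers):
    if idx in FROZEN:
        for p in layer.parameters():
            p.requires_grad = False

# Only the still-trainable parameters are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-5)
```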
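For the "filled slowly vs. rearranged in the first few steps" question, one way to probe it is to snapshot the duplicated layers before training and compare them to the same layers after a handful of updates. A minimal sketch, assuming two checkpoints and known (here hypothetical) duplicated-layer indices:

```python
# Hypothetical probe: compare a duplicated layer's weights before training and
# after a few optimizer steps. Checkpoint paths and layer indices are
# assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM

before = AutoModelForCausalLM.from_pretrained("path/to/merged-untrained")
after = AutoModelForCausalLM.from_pretrained("path/to/merged-few-steps")

DUPLICATED = [20, 21, 22, 23]  # hypothetical indices of the duplicated layers

for idx in DUPLICATED:
    p_before = torch.cat([p.flatten() for p in before.model.layers[idx].parameters()])
    p_after = torch.cat([p.flatten() for p in after.model.layers[idx].parameters()])
    cos = torch.nn.functional.cosine_similarity(p_before, p_after, dim=0)
    rel = (p_after - p_before).norm() / p_before.norm()
    # Cosine near 1 with a tiny relative change would point to gradual 'filling';
    # a large early change would suggest the layers are rearranged almost immediately.
    print(f"layer {idx}: cosine={cos.item():.4f} relative_change={rel.item():.4f}")
```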