Update README.md
README.md
CHANGED
@@ -19,6 +19,14 @@ Following https://github.com/KellerJordan/modded-nanogpt for fun (learning).
 - 4 seconds per step, total 3200 steps
 - Checkpoint saved every 320 steps
 
+## Training loss
+
+To experimentally check the neural scaling law:
+
+![baseline/analysis/loss_plot2.png](baseline/analysis/loss_plot2.png)
+
+(Fitted line: `log y = -0.11 * log x + 0.9` where x is step (0 to 3200) and y is the training loss)
+
 ## Demo
 
 Available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo
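
Note on the fitted line in the added README text: a line of the form `log y = a * log x + b` can be obtained with an ordinary least-squares fit in log-log space. The sketch below is only an illustration of that idea, not the script used in this repo. The function names `fit_power_law` and `predict_loss` are hypothetical, the natural logarithm is an assumption (the README does not state the base), and step 0 is dropped because `log 0` is undefined.

```python
import numpy as np


def fit_power_law(steps, losses):
    """Least-squares fit of log(loss) = slope * log(step) + intercept.

    Assumes natural logarithms; points with step <= 0 or loss <= 0 are
    skipped because their logarithm is undefined.
    """
    steps = np.asarray(steps, dtype=float)
    losses = np.asarray(losses, dtype=float)
    mask = (steps > 0) & (losses > 0)
    slope, intercept = np.polyfit(np.log(steps[mask]), np.log(losses[mask]), 1)
    return slope, intercept


def predict_loss(step, slope, intercept):
    """Evaluate the fitted power law: loss = exp(intercept) * step ** slope."""
    return np.exp(intercept) * np.asarray(step, dtype=float) ** slope


# Synthetic data generated from the README's quoted coefficients,
# used here only to show that the fit recovers them.
steps = np.arange(1, 3201, dtype=float)
losses = np.exp(0.9) * steps ** -0.11
slope, intercept = fit_power_law(steps, losses)
print(f"log y = {slope:.2f} * log x + {intercept:.2f}")
```

With real data, `steps` and `losses` would come from the training log; a slope near `-0.11` would match the line quoted in the README.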