Commit 74bef6c by victormiller
Parent(s): 751dbe2
Update README.md
README.md CHANGED
@@ -1,3 +1,33 @@
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
---
1 |
---
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
+
---
|
5 |
+
license: apache-2.0
|
6 |
+
---
|
7 |
+
# LLM360 Research Suite: K2 Loss Spike 2
|
8 |
+
During the first K2 training phase, we encountered two loss spikes. This repo contains 8 checkpoints that capture the training dynamics during the loss spikes.
|
9 |
+
|
10 |
+
<img src="k2_spike_1.png" alt="k2 spike 1"/>
|
11 |
+
|
12 |
+
# Purpose
|
13 |
+
Loss spikes are still a poorly understood phenomenon. By making these spikes and the associated training details available, we hope others will use these artifacts to further the world's knowledge on this topic.
|
14 |
+
|
15 |
+
## All Checkpoints
|
16 |
+
| Checkpoints | |
|
17 |
+
| ----------- | ----------- |
|
18 |
+
| [Checkpoint 186](https://huggingface.co/LLM360/K2-Spike-2/tree/spike_ckpt_186) | [Checkpoint 194](https://huggingface.co/LLM360/K2-Spike-2/tree/spike_ckpt_194) |
|
19 |
+
| [Checkpoint 188](https://huggingface.co/LLM360/K2-Spike-2/tree/spike_ckpt_188) | [Checkpoint 196](https://huggingface.co/LLM360/K2-Spike-2/tree/spike_ckpt_196) |
|
20 |
+
| [Checkpoint 190](https://huggingface.co/LLM360/K2-Spike-2/tree/spike_ckpt_190) | [Checkpoint 198](https://huggingface.co/LLM360/K2-Spike-2/tree/spike_ckpt_198) |
|
21 |
+
| [Checkpoint 192](https://huggingface.co/LLM360/K2-Spike-2/tree/spike_ckpt_192) | [Checkpoint 200](https://huggingface.co/LLM360/K2-Spike-2/tree/spike_ckpt_200) |
|
22 |
+
|
23 |
+
|
24 |
+
To list all checkpoint branches in a local clone of this repo, run `git branch -a`.
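
Each checkpoint in the table lives on its own branch, so a branch name can be passed as a revision when loading. The following is a minimal sketch, not an official loading recipe. The helper names `spike_revisions`, `tree_url`, and `load_checkpoint` are hypothetical, and it assumes the checkpoints are stored in a format that `transformers` can load with `AutoModelForCausalLM`.

```python
# Repo ID and branch naming are taken from the checkpoint links in the table.
REPO_ID = "LLM360/K2-Spike-2"


def spike_revisions(start=186, stop=200, step=2):
    """Return the branch names listed in the table: spike_ckpt_186 ... spike_ckpt_200."""
    return [f"spike_ckpt_{n}" for n in range(start, stop + 1, step)]


def tree_url(revision, repo_id=REPO_ID):
    """Build the browser URL for one checkpoint branch."""
    return f"https://huggingface.co/{repo_id}/tree/{revision}"


def load_checkpoint(revision, repo_id=REPO_ID):
    """Load one checkpoint by branch name.

    Assumption: the branch holds weights in a transformers-compatible layout.
    The import is local so the helpers above work without transformers installed.
    """
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)


if __name__ == "__main__":
    # Print every checkpoint branch with its URL; nothing is downloaded here.
    for rev in spike_revisions():
        print(rev, tree_url(rev))
```

Calling `load_checkpoint("spike_ckpt_192")` would download that branch's weights, so it is left out of the main block on purpose.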
|
25 |
+
|
26 |
+
## Loss Spikes on the LLM360 Evaluation Suite
|
27 |
+
|
28 |
+
something here
|
29 |
+
|
30 |
+
## About the LLM360 Research Suite
|
31 |
+
The LLM360 Research Suite is a comprehensive set of large language model (LLM) artifacts from Amber, CrystalCoder, and K2 for academic and industry researchers to explore LLM training dynamics. Additional resources can be found at llm360.ai.
|
32 |
+
|
33 |
+
|