Update README.md
README.md
CHANGED
@@ -13,6 +13,26 @@ tags:
|
|
13 |
- KoreanGPT
|
14 |
---
|
15 |
|
16 |
# KoLLaMA Model Card
|
17 |
|
18 |
KoLLaMA (7B) trained on Korean/English/Code dataset with LLaMA Architecture via JAX.
|
|
|
13 |
- KoreanGPT
|
14 |
---
|
15 |
|
16 |
+
> 🚧 Note: this repo is under construction 🚧
|
17 |
+
|
18 |
+
## Todo
|
19 |
+
|
20 |
+
✅ - finished
|
21 |
+
|
22 |
+
π - working on it
|
23 |
+
|
24 |
+
- ✅ Train new BBPE Tokenizer
|
25 |
+
- ✅ Test training code on TPUv4 Pods (with model parallelism)
|
26 |
+
- ✅ Conversion test (JAX to PyTorch)
|
27 |
+
- π LM training validation on a minimal dataset (1 sentence, 1000 steps)
|
28 |
+
- Build Data Shuffler (curriculum learning)
|
29 |
+
- Train 7B Model
|
30 |
+
- Train 13B Model
|
31 |
+
- Train 33B Model
|
32 |
+
- Train 65B Model
|
33 |
+
|
34 |
+
|
35 |
+
|
36 |
# KoLLaMA Model Card
|
37 |
|
38 |
KoLLaMA (7B) trained on Korean/English/Code dataset with LLaMA Architecture via JAX.
|