Update README.md
README.md CHANGED
@@ -18,6 +18,7 @@ datasets:
This is a JAX/Flax-based transformer language model trained on a Japanese dataset. It is based on the official Flax example code ([lm1b](https://github.com/google/flax/tree/main/examples/lm1b)).

## Update Log
+* 2024/05/20 Added JGLUE 4-task benchmark scores.
* 2024/05/13 FlaxAutoModelForCausalLM is now supported with custom model code added.

## Source Code
@@ -34,11 +35,22 @@ We've modified Flax's 'lm1b' example to train on Japanese dataset. You can find

| Model | Params | Layers | Dim | Heads | PPL | Dataset | Training time |
|-|-|-|-|-|-|-|-|
-| lm1b-default | 0.05B | 6 | 512 | 8 | 22.67 | lm1b | 0.5 days |
-| transformer-lm-japanese-default | 0.05B | 6 | 512 | 8 | 66.38 | cc100/ja | 0.5 days |
| transformer-lm-japanese-0.1b | 0.1B | 12 | 768 | 12 | 35.22 | wiki40b/ja | 1.5 days |

+## Benchmarking
+
+* **JGLUE 4-task (2024/05/20)**
+
+  - *We used the [Stability-AI/lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluation-harness) library for evaluation.*
+  - *We modified the harness to work with FlaxAutoModel so that it can evaluate JAX/Flax models. See the code [here](https://github.com/FookieMonster/lm-evaluation-harness).*
+  - *We evaluated four tasks: JCommonsenseQA-1.1, JNLI-1.3, MARC-ja-1.1, and JSQuAD-1.1.*
+  - *All evaluations used version 0.3 of the prompt template and were zero-shot.*
+  - *The number of few-shot examples was 0, 0, 0, 0.*
+
+| Model | Average | JCommonsenseQA | JNLI | MARC-ja | JSQuAD |
+| :-- | :-- | :-- | :-- | :-- | :-- |
+| transformer-lm-japanese-0.1b | 41.19 | 25.47 | 45.60 | 85.46 | 8.24 |
+| Reference: rinna/japanese-gpt-neox-small | 40.75 | 40.39 | 29.13 | 85.48 | 8.02 |

## Usage: FlaxAutoModel
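The body of the usage section is not included in this diff. For orientation, a minimal sketch of loading the model through FlaxAutoModelForCausalLM (the support added in the 2024/05/13 update) might look like the following; the repo id is a placeholder assumption, not a value taken from this commit:

```python
# Minimal sketch, not taken from this commit. The repo id below is a placeholder
# (assumption); substitute the model's actual Hugging Face Hub id.
from transformers import AutoTokenizer, FlaxAutoModelForCausalLM

repo_id = "<user>/transformer-lm-japanese-0.1b"  # placeholder repo id (assumption)

# trust_remote_code=True pulls in the custom model code mentioned in the update log.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = FlaxAutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

# Tokenize a short Japanese prompt and sample a continuation.
inputs = tokenizer("日本の首都は", return_tensors="np")
outputs = model.generate(inputs["input_ids"], max_length=32, do_sample=True, top_k=50)
print(tokenizer.decode(outputs.sequences[0], skip_special_tokens=True))
```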
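For the benchmark runs listed above, the lm-evaluation-harness is normally driven from its main.py entry point. The sketch below only illustrates that kind of invocation: the --model value for the Flax-enabled fork, the exact task identifiers, and the model path are assumptions, not values recorded in this commit.

```bash
# Hypothetical invocation sketch (not from this commit). Flag names follow the
# lm-evaluation-harness main.py CLI; the --model value, task identifiers, and
# pretrained path are assumptions.
python main.py \
  --model flax-causal \
  --model_args pretrained=<path-or-repo-id> \
  --tasks "jcommonsenseqa-1.1-0.3,jnli-1.3-0.3,marc_ja-1.1-0.3,jsquad-1.1-0.3" \
  --num_fewshot "0,0,0,0" \
  --output_path ./jglue_results.json
```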