Update README.md
README.md
CHANGED
@@ -60,7 +60,7 @@ The data distribution by language (estimated) is as follows:
 - Italian: ~20%
 
 The training data was prepared using [lm-datasets](https://github.com/malteos/lm-datasets).
-The exact data
+The exact data configuration is [here](https://huggingface.co/occiglot/occiglot-7b-eu5/blob/main/lm-datasets-config.yml).
 
 ## Training settings
 