Commit 51fce0d, committed by Locutusque
Parent(s): 7fad8c9
Update README.md
README.md CHANGED
@@ -3,8 +3,25 @@ license: cc-by-nc-4.0
 language:
 - en
 pipeline_tag: text-generation
+datasets:
+- Skylion007/openwebtext
+- Locutusque/TM-DATA
+inference:
+  parameters:
+    do_sample: True
+    temperature: 0.5
+    top_p: 0.5
+    top_k: 39
+    max_new_tokens: 250
+    repetition_penalty: 1.15
 ---
-#
-This model
-
-
+# Training
+This model was trained on two datasets, shown in this model page.
+- Skylion007/openwebtext: 1,000,000 examples at a batch size of 32-4096 (1 epoch)
+- Locutusque/TM-DATA: All examples at a batch size of 12288 (3 epochs)
+Training took approximately 500 GPU hours on a single Titan V.
+# Metrics
+You can look at the training metrics here:
+https://wandb.ai/locutusque/TinyMistral-V2/runs/g0rvw6wc
+# License
+This model is released under the cc-by-nc-4.0 license. This is because the data used to train this model is under this same license.
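The `inference` parameters added in this commit configure the hosted inference widget. As a rough illustration of what the same settings could look like when running the model locally, here is a minimal sketch using the `transformers` library. The `generate` helper and its `model_id` argument are hypothetical: the diff does not name the model repository, so the identifier must be supplied by the caller.

```python
# Sampling settings copied from the `inference.parameters` block in this commit.
GENERATION_KWARGS = {
    "do_sample": True,
    "temperature": 0.5,
    "top_p": 0.5,
    "top_k": 39,
    "max_new_tokens": 250,
    "repetition_penalty": 1.15,
}


def generate(prompt: str, model_id: str) -> str:
    """Generate a continuation of `prompt` using the settings above.

    `model_id` is a placeholder (assumption): the commit does not state
    which Hub repository this README belongs to.
    """
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and sample a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)

    # Decode the first (and only) returned sequence, prompt included.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would look like `generate("Once upon a time", model_id="<owner>/<model>")`. Because `do_sample` is enabled, the output varies between runs unless a seed is fixed.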