Locutusque committed
Commit de5fb0a
1 Parent(s): 879e2ee
Update README.md
README.md
CHANGED
@@ -4,18 +4,6 @@ language:
 - en
 pipeline_tag: text-generation
 ---
-Work in progress...
-
-Like version 1, this model will be trained on a single GPU, with hopes of getting better performance.
-# Roadmap
-
-- Train on 1,000,000 examples of Skylion007/openwebtext at a learning rate of 3e-4 and batch size of 32
-- Once perplexity reaches an average of ~100, a cosine scheduler will be applied and the batch size will be increased to 4096
-- Once perplexity reaches an average of 50, the model will be trained on graelo/wikipedia and mattymchen/refinedweb-3m, and the batch size will be increased to 12,288.
-
-- I'm open to any suggestions to modify this roadmap if you feel it isn't sufficient!
-# Disclaimer
-This model may be cancelled if no performance improvement is seen over its predecessor. The roadmap may also be changed during training.
 # Release date
 This model is set to be released by January 7, 2024. This date may be extended.
 Watch the training live here:
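For reference, the roadmap removed in this commit amounts to a perplexity-gated training schedule. A minimal sketch of that logic is below; only the thresholds, batch sizes, learning rate, and dataset names come from the README text, while the function name, return fields, and exact stage boundaries are illustrative assumptions.

```python
def training_stage(avg_perplexity: float) -> dict:
    """Map a running-average perplexity to the roadmap's training stage.

    Values (LR 3e-4, batch sizes 32 / 4096 / 12288, the ~100 and 50
    perplexity gates, and the dataset names) are taken from the removed
    roadmap; the structure of this helper is purely illustrative.
    """
    if avg_perplexity > 100:
        # Stage 1: 1M examples of openwebtext, batch size 32, LR 3e-4
        return {"batch_size": 32, "lr": 3e-4, "scheduler": "constant",
                "datasets": ["Skylion007/openwebtext"]}
    if avg_perplexity > 50:
        # Stage 2: cosine LR schedule applied, batch size raised to 4096
        return {"batch_size": 4096, "lr": 3e-4, "scheduler": "cosine",
                "datasets": ["Skylion007/openwebtext"]}
    # Stage 3: add wikipedia and refinedweb-3m, batch size raised to 12,288
    return {"batch_size": 12288, "lr": 3e-4, "scheduler": "cosine",
            "datasets": ["graelo/wikipedia", "mattymchen/refinedweb-3m"]}
```

A training loop would re-evaluate `training_stage` on the running average perplexity between epochs and rebuild its dataloader and scheduler whenever the stage changes.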