internetoftim committed · Commit 31d8e00 · 1 Parent(s): 3d2a3c3
Update README.md

README.md CHANGED
@@ -23,13 +23,16 @@ GPT2 is a large transformer-based language model. It is built using transformer
 
 Paper link : [Language Models are Unsupervised Multitask Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf)
 
+## Intended uses & limitations
 
-## Training and evaluation data
 This model is a fine-tuned version of [gpt2-medium](https://huggingface.co/gpt2-medium) on the [wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 2.6973
 
-
+## Training and evaluation data
+
+Trained on wikipedia dataset:
+- [HuggingFace/wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext) dataset
 
 ## Training procedure
 
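For context on the figures in this diff: for a causal LM, the reported evaluation loss of 2.6973 corresponds to a perplexity of exp(2.6973) ≈ 14.8. Below is a minimal sketch that checks this conversion and loads the dataset named in the new "Training and evaluation data" section; the `datasets`/`transformers` calls are standard, but the commit does not give the fine-tuned checkpoint's repo id, so the base `gpt2-medium` is used as a placeholder.

```python
import math

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Evaluation loss reported in the card; perplexity is exp(mean loss).
eval_loss = 2.6973
print(f"perplexity ≈ {math.exp(eval_loss):.2f}")  # ≈ 14.84

# Dataset and config named in the card's "Training and evaluation data" section.
val = load_dataset("wikitext", "wikitext-103-raw-v1", split="validation")
print(val[0]["text"][:80])

# Placeholder: the commit does not state the fine-tuned repo id, so the base
# checkpoint stands in; swap in the actual model id to reproduce the eval loss.
tokenizer = AutoTokenizer.from_pretrained("gpt2-medium")
model = AutoModelForCausalLM.from_pretrained("gpt2-medium")
```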