Update README.md
README.md
CHANGED
@@ -23,4 +23,12 @@ model-index:
 
 ---
 
-#
+# Model description
+
+paper: [Characterizing Verbatim Short-Term Memory in Neural Language Models](https://doi.org/10.48550/arXiv.2210.13569)
+
+This is a gpt2-small-like decoder-only transformer model trained on a 40M token subset of the [wikitext-103 dataset](https://paperswithcode.com/dataset/wikitext-103).
+
+# Intended uses
+
+This checkpoint is intended for research purposes, for example those interested in studying the behavior of transformer language models trained on smaller datasets.