adamelliotfields committed
Commit 9c4b3dc
1 Parent(s): 0dc08e6

Update README

Files changed (1)
  1. README.md +48 -24
README.md CHANGED
@@ -1,44 +1,68 @@
 ---
 library_name: keras
 ---

 ## Model description

- More information needed

 ## Intended uses & limitations

- More information needed

 ## Training and evaluation data

- More information needed

 ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
-
- | Hyperparameters | Value |
- | :-- | :-- |
- | name | AdamW |
- | weight_decay | 0.001 |
- | clipnorm | None |
- | global_clipnorm | None |
- | clipvalue | None |
- | use_ema | False |
- | ema_momentum | 0.99 |
- | ema_overwrite_frequency | None |
- | jit_compile | False |
- | is_legacy_optimizer | False |
- | learning_rate | 0.0002500000118743628 |
- | beta_1 | 0.9 |
- | beta_2 | 0.999 |
- | epsilon | 1e-07 |
- | amsgrad | False |
- | training_precision | float32 |

 ## Model Plot
 
 
 ---
 library_name: keras
+ license: mit
+ datasets:
+ - karpathy/tiny_shakespeare
+ metrics:
+ - accuracy
+ pipeline_tag: text-generation
+ tags:
+ - lstm
 ---

 ## Model description

+ An LSTM trained on Andrej Karpathy's [`tiny_shakespeare`](https://huggingface.co/datasets/karpathy/tiny_shakespeare) dataset, from his blog post [The Unreasonable Effectiveness of Recurrent Neural Networks](https://karpathy.github.io/2015/05/21/rnn-effectiveness/).

 ## Intended uses & limitations

+ The model predicts the next character from a variable-length input sequence. After `18` epochs of training, it generates text that is somewhat coherent.
+
+ ```py
+ import tensorflow as tf
+
+ def generate_text(model, encoder, text, n):
+     # Greedily extend the prompt one character at a time.
+     vocab = encoder.get_vocabulary()
+     generated_text = text
+     for _ in range(n):
+         encoded = encoder([generated_text])
+         pred = model.predict(encoded, verbose=0)
+         # Index of the most likely next character.
+         pred = tf.squeeze(tf.argmax(pred, axis=-1)).numpy()
+         generated_text += vocab[pred]
+     return generated_text
+
+ sample = "M"
+ print(generate_text(model, encoder, sample, 100))
+ ```
+
+ ```
+ MQLUS:
+ I will be so that the street of the state,
+ And then the street of the street of the state,
+ And
+ ```

 ## Training and evaluation data

+ [![Weights & Biases](https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg)](https://wandb.ai/adamelliotfields/shakespeare)

 ## Training procedure

+ The dataset consists of various works of William Shakespeare concatenated into a single file, with individual speeches separated by `\n\n`.
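+
+ For illustration, a minimal sketch of loading and splitting the corpus (assuming the raw text has been saved locally as `input.txt`; the filename is hypothetical):
+
+ ```py
+ # Read the whole corpus as one string and split it into speeches.
+ with open("input.txt", encoding="utf-8") as f:
+     text = f.read()
+
+ speeches = text.split("\n\n")
+ print(len(text), "characters,", len(speeches), "speeches")
+ ```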
 
+ The tokenizer is a Keras `TextVectorization` preprocessor that uses a simple character-based vocabulary.
+
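+ A minimal sketch of such an encoder (the `standardize=None` and `split="character"` arguments are assumptions about the configuration, and `text` is the corpus string from the sketch above):
+
+ ```py
+ import tensorflow as tf
+
+ # Character-level vectorizer: keep case and punctuation, split strings into single characters.
+ encoder = tf.keras.layers.TextVectorization(standardize=None, split="character")
+ encoder.adapt([text])
+ print(encoder.get_vocabulary()[:10])  # first two entries are the padding and OOV tokens
+ ```
+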
+ To construct the training set, a window of `100` characters is taken, with the next character used as the target. This is repeated for each position in the text and results in **1,115,294** shuffled training examples.
+
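+ A minimal sketch of that windowing with `tf.data`, assuming the corpus has already been encoded into a 1-D tensor of character ids named `ids` (the variable names and the shuffle buffer size are illustrative):
+
+ ```py
+ import tensorflow as tf
+
+ seq_len = 100
+ ds = tf.data.Dataset.from_tensor_slices(ids)
+ # Slide a 101-character window one step at a time: 100 inputs plus 1 target.
+ ds = ds.window(seq_len + 1, shift=1, drop_remainder=True)
+ ds = ds.flat_map(lambda w: w.batch(seq_len + 1))
+ ds = ds.map(lambda w: (w[:-1], w[-1]))  # (input sequence, next character)
+ ds = ds.shuffle(10_000).batch(1024)
+ ```
+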
+ *TODO: upload encoder*
+
+ ### Training hyperparameters
 
+ | Hyperparameters   | Value     |
+ | :---------------- | :-------- |
+ | `epochs`          | `18`      |
+ | `batch_size`      | `1024`    |
+ | `optimizer`       | `AdamW`   |
+ | `weight_decay`    | `0.001`   |
+ | `learning_rate`   | `0.00025` |
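+
+ As a rough sketch, these correspond to a compile/fit call along these lines (assuming TensorFlow 2.11+, where `tf.keras.optimizers.AdamW` is available, a model that outputs logits over the character vocabulary, and a `train_ds` built as above):
+
+ ```py
+ import tensorflow as tf
+
+ model.compile(
+     optimizer=tf.keras.optimizers.AdamW(learning_rate=0.00025, weight_decay=0.001),
+     loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
+     metrics=["accuracy"],
+ )
+ model.fit(train_ds, epochs=18)  # batch_size=1024 is applied when batching train_ds
+ ```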
 
 ## Model Plot