adamelliotfields/shakespeare

	---
	datasets:
	- karpathy/tiny_shakespeare
	library_name: tf-keras
	license: mit
	metrics:
	- accuracy
	pipeline_tag: text-generation
	tags:
	- lstm
	---

	## Model description

	LSTM trained on Andrej Karpathy's [`tiny_shakespeare`](https://huggingface.co/datasets/karpathy/tiny_shakespeare) dataset, from his blog post, [The Unreasonable Effectiveness of Recurrent Neural Networks](https://karpathy.github.io/2015/05/21/rnn-effectiveness/).

	Made to experiment with Hugging Face and W&B.

	## Intended uses & limitations

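	The model predicts the next character from a variable-length input sequence. After `18` epochs of training, it generates somewhat coherent text. The example below decodes greedily (it always takes the most likely character), so the output tends to fall into repeated phrases.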

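	```py
	import tensorflow as tf

	def generate_text(model, encoder, text, n):
	    """Greedily append `n` predicted characters to `text`."""
	    vocab = encoder.get_vocabulary()  # index -> character
	    generated_text = text
	    for _ in range(n):
	        # re-encode everything generated so far; the input grows each step
	        encoded = encoder([generated_text])
	        pred = model.predict(encoded, verbose=0)
	        # take the most likely next character (greedy decoding)
	        pred = int(tf.squeeze(tf.argmax(pred, axis=-1)).numpy())
	        generated_text += vocab[pred]
	    return generated_text

	# `model` is the trained LSTM, `encoder` the fitted `TextVectorization` layer
	sample = "M"
	print(generate_text(model, encoder, sample, 100))
	```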

	```
	MQLUS:
	I will be so that the street of the state,
	And then the street of the street of the state,
	And
	```
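
The repetition in the sample is typical of greedy decoding. One common alternative, not part of this model card's code, is to sample the next character from the predicted distribution with a temperature. The sketch below is a NumPy-only illustration; `sample_next` is a hypothetical helper, and it assumes the model's output for one step is a vector of probabilities (a softmax output), which the card does not confirm.

```python
import numpy as np

def sample_next(probs, temperature=1.0, rng=None):
    """Sample an index from `probs` after rescaling by `temperature`.

    Assumes `probs` is a 1-D probability vector (hypothetical; adapt if
    the model emits logits instead).
    """
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=np.float64)
    # lower temperature sharpens the distribution, higher flattens it
    logits = np.log(probs + 1e-9) / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    scaled = np.exp(logits)
    scaled /= scaled.sum()
    return int(rng.choice(len(scaled), p=scaled))

probs = [0.1, 0.7, 0.2]
print(sample_next(probs, temperature=0.8, rng=np.random.default_rng(0)))
```

Under that assumption, the `tf.argmax` line in `generate_text` could be swapped for something like `pred = sample_next(pred[0], temperature=0.8)`.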

	## Training and evaluation data

	[![Weights & Biases](https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg)](https://wandb.ai/adamelliotfields/shakespeare)

	## Training procedure

	The dataset consists of various works of William Shakespeare concatenated into a single file. The resulting file consists of individual speeches separated by `\n\n`.

	The tokenizer is a Keras `TextVectorization` preprocessor that uses a simple character-based vocabulary.
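
Since the encoder itself has not been uploaded, here is a minimal sketch of what a character-level `TextVectorization` layer can look like. The `standardize=None` and `split="character"` settings are assumptions, not the confirmed configuration, and the short `text` stands in for the full dataset. It uses `tf.keras`; with the `tf-keras` package the import path may differ.

```python
import tensorflow as tf

# Stand-in for the full tiny_shakespeare text (assumption for illustration)
text = "First Citizen:\nBefore we proceed any further, hear me speak."

encoder = tf.keras.layers.TextVectorization(
    standardize=None,   # assumed: keep case, punctuation and newlines
    split="character",  # one token per character
)
encoder.adapt([text])   # build the vocabulary from the text

vocab = encoder.get_vocabulary()  # index 0 is padding, index 1 is "[UNK]"
ids = encoder(["First"])          # shape (1, 5): one id per character
decoded = "".join(vocab[i] for i in ids[0].numpy())
print(ids.numpy(), decoded)
```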

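	To construct the training set, a window of `100` characters is taken as the input, with the character that follows it as the target. The window advances one character at a time over the whole text, which yields 1,115,294 training examples (the text length minus the window size). The examples are then shuffled.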
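
The windowing described above can be sketched in plain Python. `make_examples` is a hypothetical helper, not the author's code, and a short string with a window of 5 stands in for the real text and its window of 100. A text of `N` characters yields `N - window` examples, which matches the count above if the file has the commonly cited 1,115,394 characters.

```python
import random

def make_examples(text, window=100):
    """Return (input, target) pairs: `window` characters and the one after them."""
    return [
        (text[i:i + window], text[i + window])
        for i in range(len(text) - window)
    ]

examples = make_examples("To be, or not to be", window=5)
print(examples[0])    # ('To be', ',')
print(len(examples))  # 14

# Shuffle so consecutive batches do not come from adjacent positions
random.Random(0).shuffle(examples)
```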

	TODO: upload encoder

	### Training hyperparameters

	| Hyperparameters   | Value     |
	| :---------------- | :-------- |
	| `epochs`          | `18`      |
	| `batch_size`      | `1024`    |
	| `optimizer`       | `AdamW`   |
	| `weight_decay`    | `0.001`   |
	| `learning_rate`   | `0.00025` |
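
For a sense of scale, the number of optimizer steps implied by these values can be worked out as below. This is a rough calculation that assumes all 1,115,294 examples are used for training in every epoch and that the final partial batch is kept; neither is stated in this card.

```python
import math

num_examples = 1_115_294  # from the training procedure section
batch_size = 1024
epochs = 18

# assumption: the last, smaller batch is not dropped
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 1090 19620
```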

	## Model Plot

	<details>
	<summary>View Model Plot</summary>

	![Model Image](./model.png)

	</details>