---
language:
- ru
- en
license: apache-2.0
base_model: gpt2
tags:
- not-for-all-audiences
- art
- humour
- jokes
- generated_from_keras_callback
model-index:
- name: zeio/fool
  results: []
datasets:
- zeio/baneks
metrics:
- loss
widget:
- text: 'Купил мужик шляпу'
  example_title: hat
- text: 'Пришла бабка к врачу'
  example_title: doctor
- text: 'Нашел мужик подкову'
  example_title: horseshoe
---

# fool

This model is a fine-tuned version of [gpt2][gpt2] on the [baneks][baneks] dataset. It was trained for 1 epoch and reached a training loss of `1.9752`.

Model evaluation has not been performed.

## Model description

The model is a fine-tuned variant of the base [gpt2][gpt2] architecture with a causal language modeling head.

## Intended uses & limitations

The model is intended for studying the ability of language models to generate jokes.
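A minimal generation sketch using the TensorFlow classes from `transformers` (the checkpoint id `zeio/fool` follows the model-index name above; the sampling parameters are illustrative, not the settings used in training):

```python
from transformers import AutoTokenizer, TFAutoModelForCausalLM

# Load the fine-tuned checkpoint (repo id taken from the model-index above).
tokenizer = AutoTokenizer.from_pretrained("zeio/fool")
model = TFAutoModelForCausalLM.from_pretrained("zeio/fool")

# Prompt with the opening line of a joke ("A man bought a hat")
# and let the model continue it.
inputs = tokenizer("Купил мужик шляпу", return_tensors="tf")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    top_p=0.95,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```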

## Training and evaluation data

The model is trained on a collection of anecdotes pulled from several VK communities (see the [baneks][baneks] dataset for details).
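The corpus can be inspected with the `datasets` library; a short sketch (the `train` split name is an assumption):

```python
from datasets import load_dataset

# Load the anecdote corpus; the "train" split name is assumed here.
anecdotes = load_dataset("zeio/baneks", split="train")
print(anecdotes[0])
```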

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of rebuilding this configuration in code follows the list):
- optimizer:

```json
{
  "name": "AdamWeightDecay",
  "learning_rate": {
    "module": "transformers.optimization_tf",
    "class_name": "WarmUp",
    "config": {
      "initial_learning_rate": 5e-05,
      "decay_schedule_fn": {
        "module": "keras.optimizers.schedules",
        "class_name": "PolynomialDecay",
        "config": {
          "initial_learning_rate": 5e-05,
          "decay_steps": 28462,
          "end_learning_rate": 0.0,
          "power": 1.0,
          "cycle": false,
          "name": null
        },
        "registered_name": null
      },
      "warmup_steps": 1000,
      "power": 1.0,
      "name": null
    },
    "registered_name": "WarmUp"
  },
  "decay": 0.0,
  "beta_1": 0.9,
  "beta_2": 0.999,
  "epsilon": 1e-08,
  "amsgrad": false,
  "weight_decay_rate": 0.01
}
```

- training_precision: `mixed_float16`
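The module names in the config above (`transformers.optimization_tf.WarmUp` wrapping a Keras `PolynomialDecay`) match what the `create_optimizer` helper from `transformers` produces, so the setup can plausibly be rebuilt like this (a sketch, not the authors' training script):

```python
import tensorflow as tf
from transformers import create_optimizer

# Match training_precision above.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# AdamWeightDecay with a linear schedule (PolynomialDecay, power=1.0)
# wrapped in WarmUp, using the values from the optimizer config above.
optimizer, lr_schedule = create_optimizer(
    init_lr=5e-05,
    num_train_steps=29462,  # 1000 warmup steps + 28462 decay steps
    num_warmup_steps=1000,
    weight_decay_rate=0.01,
)
```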

### Training results

| Train Loss | Epoch |
|:----------:|:-----:|
| 1.9752     | 0     |

### Framework versions

- Transformers 4.35.0.dev0
- TensorFlow 2.14.0
- Datasets 2.12.0
- Tokenizers 0.14.1

[baneks]: https://huggingface.co/datasets/zeio/baneks
[gpt2]: https://huggingface.co/gpt2