This model is a fine-tuned version of gpt2 on the baneks dataset, trained for 1 epoch. It achieved a training loss of 1.9752.
Model evaluation has not been performed.
The model is a fine-tuned variant of the base gpt2 architecture with a causal language modeling head.
The model is intended for studying the ability of natural language models to generate jokes.
The model is trained on a collection of anecdotes pulled from a few VK communities (see the baneks dataset for more details).
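As a quick illustration of the intended use, the sketch below loads the checkpoint through the 🤗 Transformers `pipeline` API and samples a continuation. The Hub identifier `<user>/gpt2-baneks` is a placeholder (the actual model ID is not stated here), and the Russian prompt simply reflects the language of the training data.

```python
# Minimal generation sketch. "<user>/gpt2-baneks" is a placeholder,
# not the real Hub ID of this checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="<user>/gpt2-baneks")

# The training data are Russian-language anecdotes, so a Russian prompt fits best.
prompt = "Заходит как-то программист в бар"
outputs = generator(
    prompt,
    max_new_tokens=64,
    do_sample=True,
    top_p=0.95,
    temperature=0.9,
)
print(outputs[0]["generated_text"])
```

Because the training run used TensorFlow (see the optimizer configuration below), loading the checkpoint may require TensorFlow to be installed or the weights to be converted for PyTorch.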
The following hyperparameters were used during training:
    {
        'name': 'AdamWeightDecay',
        'learning_rate': {
            'module': 'transformers.optimization_tf',
            'class_name': 'WarmUp',
            'config': {
                'initial_learning_rate': 5e-05,
                'decay_schedule_fn': {
                    'module': 'keras.optimizers.schedules',
                    'class_name': 'PolynomialDecay',
                    'config': {
                        'initial_learning_rate': 5e-05,
                        'decay_steps': 28462,
                        'end_learning_rate': 0.0,
                        'power': 1.0,
                        'cycle': False,
                        'name': None
                    },
                    'registered_name': None
                },
                'warmup_steps': 1000,
                'power': 1.0,
                'name': None
            },
            'registered_name': 'WarmUp'
        },
        'decay': 0.0,
        'beta_1': 0.9,
        'beta_2': 0.999,
        'epsilon': 1e-08,
        'amsgrad': False,
        'weight_decay_rate': 0.01
    }
Training was performed in mixed_float16 precision.
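For readers who want to reproduce this setup, the optimizer above matches what `transformers.create_optimizer` builds for TensorFlow: AdamWeightDecay with a WarmUp wrapper around a linear PolynomialDecay schedule. The sketch below is an assumption about how it could be reconstructed, not the original training script.

```python
# Sketch of rebuilding the optimizer from the configuration above
# (assumes a TensorFlow/Keras setup; not the original training script).
import tensorflow as tf
from transformers import create_optimizer

# Match the reported mixed_float16 training precision.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# create_optimizer subtracts the warmup steps when building the decay schedule,
# so 28462 decay steps + 1000 warmup steps correspond to 29462 total steps.
optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,            # initial_learning_rate
    num_train_steps=28462 + 1000,
    num_warmup_steps=1000,   # warmup_steps
    weight_decay_rate=0.01,  # weight_decay_rate
    power=1.0,               # linear decay
)
```

The remaining values (beta_1=0.9, beta_2=0.999, epsilon=1e-08) are the defaults of this helper, so they do not need to be passed explicitly.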
| Train Loss | Epoch |
|---|---|
| 1.9752 | 0 |
Base model: openai-community/gpt2