zeio committed
Commit ae3f482
1 Parent(s): 764d6fb

Updated readme

Files changed (1)
  1. README.md +65 -15
README.md CHANGED
@@ -1,42 +1,90 @@
  ---
- license: mit
  base_model: gpt2
  tags:
  - generated_from_keras_callback
  model-index:
  - name: zeio/fool
    results: []
  ---

- <!-- This model card has been generated automatically according to the information Keras had access to. You should
- probably proofread and complete it, then remove this comment. -->

- # zeio/fool
-
- This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Train Loss: 1.9752
- - Epoch: 0

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
- - optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'transformers.optimization_tf', 'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-05, 'decay_schedule_fn': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': 28462, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}, 'registered_name': 'WarmUp'}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
- - training_precision: mixed_float16

  ### Training results

@@ -44,10 +92,12 @@ The following hyperparameters were used during training:
  |:----------:|:-----:|
  | 1.9752 | 0 |

-
  ### Framework versions

  - Transformers 4.35.0.dev0
  - TensorFlow 2.14.0
  - Datasets 2.12.0
  - Tokenizers 0.14.1

README.md after this commit:
---
language:
- ru
- en
license: apache-2.0
base_model: gpt2
tags:
- not-for-all-audiences
- art
- humour
- jokes
- generated_from_keras_callback
model-index:
- name: zeio/fool
  results: []
datasets:
- zeio/baneks
metrics:
- loss
widget:
- text: 'Купил мужик шляпу'
  example_title: hat
- text: 'Пришла бабка к врачу'
  example_title: doctor
- text: 'Нашел мужик подкову'
  example_title: horseshoe
---

# fool

This model is a fine-tuned version of [gpt2][gpt2], trained for 1 epoch on the [baneks][baneks] dataset; the final training loss was `1.9752`.
Model evaluation has not been performed.

## Model description

The model is a fine-tuned variant of the base [gpt2][gpt2] architecture with a causal language modeling head.

## Intended uses & limitations

The model is intended for studying the ability of natural language models to generate jokes.
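
As an illustration of this use case, here is a minimal generation sketch; the sampling settings are illustrative rather than taken from this card:

```python
# Sample a continuation from the checkpoint; the TF classes match the Keras
# training setup reported below, and the sampling settings are arbitrary.
from transformers import AutoTokenizer, TFAutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("zeio/fool")
model = TFAutoModelForCausalLM.from_pretrained("zeio/fool")

# Continue one of the widget prompts from the header ("A man bought a hat")
inputs = tokenizer("Купил мужик шляпу", return_tensors="tf")
outputs = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=64,
    do_sample=True,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```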

## Training and evaluation data

The model was trained on anecdotes pulled from a few VK communities (see the [baneks][baneks] dataset for more details).
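
The corpus can be inspected directly from the hub; a short sketch (split and column names are not documented in this card, so the snippet only prints the schema):

```python
# Fetch the training corpus from the Hugging Face hub and print its structure;
# split and column names are assumptions to be checked against the dataset card.
from datasets import load_dataset

baneks = load_dataset("zeio/baneks")
print(baneks)
```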

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch that rebuilds an equivalent optimizer follows the list):
- optimizer:

  ```json
  {
    "name": "AdamWeightDecay",
    "learning_rate": {
      "module": "transformers.optimization_tf",
      "class_name": "WarmUp",
      "config": {
        "initial_learning_rate": 5e-05,
        "decay_schedule_fn": {
          "module": "keras.optimizers.schedules",
          "class_name": "PolynomialDecay",
          "config": {
            "initial_learning_rate": 5e-05,
            "decay_steps": 28462,
            "end_learning_rate": 0.0,
            "power": 1.0,
            "cycle": false,
            "name": null
          },
          "registered_name": null
        },
        "warmup_steps": 1000,
        "power": 1.0,
        "name": null
      },
      "registered_name": "WarmUp"
    },
    "decay": 0.0,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-08,
    "amsgrad": false,
    "weight_decay_rate": 0.01
  }
  ```

- training_precision: `mixed_float16`
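
The serialised config above describes an `AdamWeightDecay` optimizer with a linear warm-up followed by polynomial (linear) decay. Below is a sketch of rebuilding an equivalent optimizer with the `transformers` TF helper; `create_optimizer`'s defaults for the betas, epsilon and decay power already match the values above:

```python
# Rebuild an optimizer equivalent to the serialised config above; create_optimizer
# wraps AdamWeightDecay in the same WarmUp + PolynomialDecay schedule.
import tensorflow as tf
from transformers import create_optimizer

# Matches `training_precision: mixed_float16`
tf.keras.mixed_precision.set_global_policy("mixed_float16")

optimizer, lr_schedule = create_optimizer(
    init_lr=5e-05,           # initial_learning_rate
    num_train_steps=28462,   # decay_steps of the PolynomialDecay schedule
    num_warmup_steps=1000,   # warmup_steps of the WarmUp wrapper
    weight_decay_rate=0.01,  # weight_decay_rate of AdamWeightDecay
)
```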

### Training results

| Train Loss | Epoch |
|:----------:|:-----:|
| 1.9752 | 0 |

### Framework versions

- Transformers 4.35.0.dev0
- TensorFlow 2.14.0
- Datasets 2.12.0
- Tokenizers 0.14.1

[baneks]: https://huggingface.co/datasets/zeio/baneks
[gpt2]: https://huggingface.co/gpt2