Update config.json for flan-t5-small
I believe the num_heads and num_layers values are swapped for google/flan-t5-small. See the comparison with t5-small (link below), which flan-t5-small is based on. With the current values, the model's hidden size isn't divisible by the number of attention heads (512 % 6 = 2).
https://huggingface.co/t5-small/blob/df1b051c49625cf57a3d0d8d3863ed4d13564fe4/config.json#L16
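The divisibility argument above can be sanity-checked in a few lines. This is a minimal sketch, assuming a hidden size (d_model) of 512 as stated above; the config values are inlined rather than fetched from the Hub:

```python
# Sketch: check whether d_model splits evenly across attention heads.
# Values are assumed from the description above, not fetched from the Hub.
d_model = 512

current_num_heads = 6   # value currently in the config (swapped)
proposed_num_heads = 8  # value proposed by this change

print(d_model % current_num_heads)   # prints 2 -> not divisible
print(d_model % proposed_num_heads)  # prints 0 -> divides evenly
```

With the swapped values, 512 / 6 leaves a remainder of 2, so the hidden size cannot be partitioned evenly across heads; with num_heads = 8 each head gets 512 / 8 = 64 dimensions.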
config.json (+2 −2)

```diff
@@ -15,8 +15,8 @@
   "model_type": "t5",
   "n_positions": 512,
   "num_decoder_layers": 8,
-  "num_heads": 6,
-  "num_layers": 8,
+  "num_heads": 8,
+  "num_layers": 6,
   "output_past": true,
   "pad_token_id": 0,
   "relative_attention_max_distance": 128,
```