hidden_act is missing config.json

#9
by s-natsu - opened

Thank you for the great work!

diff /gemma-2-2b-jpn-it/config.json /gemma-2-2b-it/config.json 
10,11c10,13
<   "dtype": "bfloat16",
<   "eos_token_id": 1,
---
>   "eos_token_id": [
>     1,
>     107
>   ],
13a16
>   "hidden_act": "gelu_pytorch_tanh",
24c27
<   "query_pre_attn_scalar": 224,
---
>   "query_pre_attn_scalar": 256,
29c32
<   "transformers_version": "4.44.2",
---
>   "transformers_version": "4.42.4",

"hidden_act": "gelu_pytorch_tanh", cause problem.
I try to serve gemma-2-2b-jpn-it with vLLM, it raise Error.

   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/gemma2.py", line 421, in __init__
     self.model = Gemma2Model(config, cache_config, quant_config)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/gemma2.py", line 265, in __init__
     self.start_layer, self.end_layer, self.layers = make_layers(
                                                     ^^^^^^^^^^^^
   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 408, in make_layers
     maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/gemma2.py", line 267, in <lambda>
     lambda prefix: Gemma2DecoderLayer(int(prefix.split(".")[
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/gemma2.py", line 200, in __init__
     hidden_act=config.hidden_act,
                ^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.12/dist-packages/transformers/configuration_utils.py", line 202, in __getattribute__
     return super().__getattribute__(key)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 AttributeError: 'Gemma2Config' object has no attribute 'hidden_act'. Did you mean: 'hidden_size'?

When I copy and paste the hidden_act line from /gemma-2-2b-it/config.json, it works.
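As a stopgap, the same fix can be applied programmatically. A minimal sketch (add_hidden_act is my own helper name, and the local path to the downloaded model is assumed):

```python
import json

def add_hidden_act(config_path: str) -> None:
    """Insert the hidden_act entry that gemma-2-2b-it's config carries."""
    with open(config_path) as f:
        cfg = json.load(f)
    # setdefault leaves the file unchanged if the key is already present
    cfg.setdefault("hidden_act", "gelu_pytorch_tanh")
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)

# e.g. add_hidden_act("gemma-2-2b-jpn-it/config.json")
```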

s-natsu changed discussion title from config.json diff from gemma-2-2b-it to hidden_act is missing config.json
Google org

Hi @s-natsu ,

hidden_act is a legacy (deprecated) parameter in some configurations; it exists for backward compatibility with older versions of models or configurations, and it is overwritten by hidden_activation.
hidden_act and hidden_size are different parameters: hidden_act defines the non-linear activation function used in the model, while hidden_size defines the dimensionality of the hidden layers.
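For context on what the value means: gelu_pytorch_tanh is the tanh approximation of GELU (what PyTorch computes for nn.GELU(approximate="tanh")). A minimal scalar sketch; the real model applies this elementwise to tensors:

```python
import math

def gelu_pytorch_tanh(x: float) -> float:
    # tanh approximation of GELU, the function named by
    # "hidden_act": "gelu_pytorch_tanh" in config.json
    return 0.5 * x * (
        1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3))
    )
```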

For further information, please refer to this reference.

Thank you.

Hi, @GopiUppari

As shown in the code below, vLLM passes config.hidden_act to the constructor of Gemma2MLP, but hidden_act is not defined in the config.json of google/gemma-2-2b-jpn-it.
https://github.com/vllm-project/vllm/blob/ad23318928d40ef7ac969451afa0dc198428c04b/vllm/model_executor/models/gemma2.py#L202

In other Gemma2 models, such as google/gemma-2-2b-it, hidden_act is defined in config.json, so no error occurs in vLLM. In this case, should we correct vLLM, or is it more appropriate to modify the model’s config.json?
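If the fix lands on the vLLM side, a defensive lookup could tolerate either key. A hypothetical sketch (resolve_hidden_act is my name, not vLLM's, and the fallback order is an assumption):

```python
from types import SimpleNamespace

def resolve_hidden_act(config) -> str:
    # Prefer the newer hidden_activation name, fall back to hidden_act,
    # then to Gemma 2's default activation.
    return (
        getattr(config, "hidden_activation", None)
        or getattr(config, "hidden_act", None)
        or "gelu_pytorch_tanh"
    )

# A jpn-it-style config that lacks hidden_act entirely still resolves:
cfg = SimpleNamespace(hidden_activation="gelu_pytorch_tanh")
print(resolve_hidden_act(cfg))  # -> gelu_pytorch_tanh
```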
