Why does decoder_start_token_id have to be defined for "google/switch-base-32" or "google/switch-base-64"?
#5 by Karim-Gamal - opened
I was trying the official notebook ( https://colab.research.google.com/drive/1aGGVHZmtKmcNBbAwa9hbu58DDpIuB5O4?usp=sharing ) for the MoE Switch Transformer, but with "base-32" instead, and I got this error:
ValueError: self.model.config.decoder_start_token_id has to be defined. In SwitchTransformers it is usually set to the pad_token_id.
However, I don't get this error with "base-8" or "base-16".
Thanks for the report. This is a bug: the decoder_start_token_id was left out of this model's config. It should be fixed now: https://huggingface.co/google/switch-base-32/blob/main/config.json#L19
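If anyone hits the same error with a cached or older copy of the config, a possible workaround is to fill in the missing field yourself before generating. This is only a rough sketch: the helper name ensure_decoder_start_token_id is made up for illustration, and the SimpleNamespace object with pad_token_id=0 is a stand-in for the real config so the snippet runs without downloading the model.

```python
from types import SimpleNamespace


def ensure_decoder_start_token_id(config):
    """Fall back to pad_token_id when decoder_start_token_id is missing.

    Hypothetical helper, not part of transformers. It follows the hint in
    the error message that the start token is usually the pad token.
    """
    if getattr(config, "decoder_start_token_id", None) is None:
        if getattr(config, "pad_token_id", None) is None:
            raise ValueError("pad_token_id is not set either, cannot infer a start token")
        config.decoder_start_token_id = config.pad_token_id
    return config


# Stand-in for a config that lacks the field (pad_token_id=0 is an assumed value).
cfg = SimpleNamespace(pad_token_id=0, decoder_start_token_id=None)
ensure_decoder_start_token_id(cfg)
print(cfg.decoder_start_token_id)
```

With a real model you would presumably call ensure_decoder_start_token_id(model.config) right after loading it with from_pretrained and before calling model.generate. Re-downloading the updated config.json should make this unnecessary.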
Thanks for your quick response.