Suggestion: NaN logits when padding used

#20
by WillBankes - opened

Hi, this is a really interesting model. Had fun playing around with it.

I've come across the following issue which I thought would be good to raise here. The NaN issue described in this thread, https://github.com/huggingface/transformers/issues/32390, also affects this model when the inputs are padded.

I found that updating the code example from the model card to use torch.float16 instead of torch.bfloat16 fixed this issue for me.
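For reference, here is a minimal sketch of that workaround, assuming the standard transformers API; it is not the model card's exact example, and the prompts are just placeholders chosen so that padding is actually applied:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "google/shieldgemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load in float16 instead of the bfloat16 used in the model card example.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Prompts of different lengths, so the shorter one gets padded.
prompts = [
    "Short prompt.",
    "A noticeably longer prompt that forces padding of the shorter one.",
]
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)

with torch.no_grad():
    logits = model(**inputs).logits

# With bfloat16 the padded sequence produced NaN logits for me; float16 did not.
print(torch.isnan(logits).any())
```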

Google org

Hi @WillBankes ,

I executed the code that was reported to cause NaN issues when padding is used, as described in the GitHub thread https://github.com/huggingface/transformers/issues/32390. However, I didn't encounter the NaN logits issue with the google/shieldgemma-2b model. You can refer to the detailed execution in this Colab notebook: https://colab.research.google.com/gist/Gopi-Uppari/ffe907c215f0ebfdfb16e1f173c54942/nan-logits-when-padding-used.ipynb

Thank you.
