Suggestion: NaN logits when padding used
Hi, this is a really interesting model. Had fun playing around with it.
I've come across the following issue, which I thought would be good to raise here. The NaN issues described in this transformers thread, https://github.com/huggingface/transformers/issues/32390, also affect this model when the inputs are padded.
I found that updating the code example from the model card, changing `torch.bfloat16` to `torch.float16`, fixed this issue for me.
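For reference, here's a minimal sketch of what I mean. The prompts and batching below are just illustrative and not the exact model card snippet; the only relevant change is the dtype.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "google/shieldgemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,  # changed from torch.bfloat16 in the model card example
)

# Two prompts of different lengths so the batch actually contains padding tokens.
prompts = [
    "Is this user message harmful?",
    "A deliberately longer prompt so that the shorter one gets padded in the batch.",
]
inputs = tokenizer(prompts, padding=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    logits = model(**inputs).logits

# With bfloat16 this printed True (NaNs present) for me; with float16 it prints False.
print(torch.isnan(logits).any())
```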
Hi @WillBankes,
I executed the code reported to cause NaN issues when padding is used, as described in the GitHub thread https://github.com/huggingface/transformers/issues/32390. However, I didn't encounter the NaN logits issue with the model google/shieldgemma-2b. You can refer to the detailed execution in this Colab notebook: https://colab.research.google.com/gist/Gopi-Uppari/ffe907c215f0ebfdfb16e1f173c54942/nan-logits-when-padding-used.ipynb.
Thank you.