Wrong Tokenizer? #2
by anas-awadalla - opened
It seems like the tokenizer for this model is not correct: I get very bad generation results when I use it. In my case I am running in-context learning evaluations, and I get the expected performance when I use the tokenizer from the OPT-30B model instead. I am wondering if the tokenizer linked to this model is incorrect, since I also see that the 30B model had a fix merged (https://huggingface.co/facebook/opt-30b/discussions/1) that I don't see for this model.
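For reference, this is roughly the kind of check I'm doing, just as a minimal sketch (the prompt here is illustrative, not my actual eval data):

```python
from transformers import AutoTokenizer

# Load the tokenizer attached to this repo and the one from the 30B repo
tok_1b3 = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
tok_30b = AutoTokenizer.from_pretrained("facebook/opt-30b")

prompt = "Hello, I am conscious and"

# Compare the token IDs each tokenizer produces for the same text
ids_1b3 = tok_1b3(prompt).input_ids
ids_30b = tok_30b(prompt).input_ids

print("opt-1.3b tokenizer:", ids_1b3)
print("opt-30b tokenizer: ", ids_30b)
print("identical:", ids_1b3 == ids_30b)
```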
Hey @anas-awadalla,
Thanks a lot for your message!
Could you provide me with a code snippet that shows why you think the tokenizer is incorrect?
The following code from the README seems to work well for me: https://huggingface.co/facebook/opt-1.3b#how-to-use
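Roughly the kind of generation snippet I mean, as a minimal sketch along the lines of that README section (exact settings may differ from what you are running):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer directly from this repo
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b", torch_dtype=torch.float32)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

prompt = "Hello, I am conscious and"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Greedy generation just to sanity-check that outputs look reasonable
generated_ids = model.generate(input_ids, max_length=30)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))
```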
anas-awadalla changed discussion status to closed