AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'
base_model = "THUDM/chatglm2-6b"
---> tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
~/.cache/huggingface/modules/transformers_modules/THUDM/chatglm2-6b/8fd7fba285f7171d3ae7ea3b35c53b6340501ed1/tokenization_chatglm.py in vocab_size(self)
106 @property
107 def vocab_size(self):
--> 108 return self.tokenizer.n_words
109
110 def get_vocab(self):
AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'
This problem seems to have appeared only today; it was working fine before.
I'd appreciate it if someone could verify the issue.
I'm on transformers 4.34.0.
Reverting to transformers==4.33.0 resolved the problem.
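For anyone else hitting this, pinning the older version (assuming a pip-managed environment) is a quick workaround until the repo's tokenizer code is updated:

pip install "transformers==4.33.0"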
Loading checkpoint shards has become very slow; is anyone else encountering this issue?
I've only been using this model for a week or so, so I'm not sure whether it has gotten slower than before, but I did find it relatively slow when loading from Google Drive into a Colab notebook.
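Not specific to this model, but what has helped me in Colab is copying the checkpoint from the Drive mount to the VM's local disk first, so the shards are read from local storage instead of over the Drive filesystem. A rough sketch, with hypothetical paths you would need to adjust:

import shutil
from transformers import AutoModel, AutoTokenizer

# Hypothetical paths: adjust to wherever the checkpoint lives in your Drive.
drive_dir = "/content/drive/MyDrive/chatglm2-6b"
local_dir = "/content/chatglm2-6b"

# Copy the shards onto the local disk once, then load from there.
shutil.copytree(drive_dir, local_dir, dirs_exist_ok=True)

tokenizer = AutoTokenizer.from_pretrained(local_dir, trust_remote_code=True)
model = AutoModel.from_pretrained(local_dir, trust_remote_code=True)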
This might be because the model's custom tokenizer code is not compatible with the current Hugging Face class and method definitions.
moving line "self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens) " before "super().init(" at "init" of class ChatGLMTokenizer can solve this issue.
moving line "self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens) " before "super().init(" at "init" of class ChatGLMTokenizer can solve this issue.
Emm, I can't find this code in tokenization_chatglm.py