eos_token clarification

#1
by Starlento - opened

I found that tokenizer_config.json contains a standard ChatML chat template, but when I checked the model, it seems to use <|endoftext|> as the eos_token in some cases.

Inference code:

# tokenizer and text_model are assumed to be loaded from this repo beforehand
messages = [
    {"role": "user", "content": "你好"}
]

input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, return_tensors='pt')
# eos_token_id from tokenizer_config.json is passed explicitly
output_ids = text_model.generate(input_ids.to('cuda'), eos_token_id=tokenizer.eos_token_id, max_length=256)
response = tokenizer.decode(output_ids[0], skip_special_tokens=False)
# response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)

Result:

<|im_start|>user
你好<|im_end|> 
<|im_start|>assistant
你好!有什么我可以帮助你的吗?<|endoftext|>你好!有什么我可以帮助你的吗?

如果你有任何问题或需要信息,请随时告诉我!我在这里帮助你。<|im_end|>

But for "hi", it seems normal.

Could you kindly check this problem?

same problem

Hello, any update on this?

Hi 👋 We tried to reproduce the issue, but we didn't encounter any errors during inference. Could you please try again using this inference code: https://github.com/01-ai/Yi/blob/main/Cookbook/en/opensource/Inference/Inference_using_transformers.ipynb

Without running the model, you can also see the issue is a misconfiguration:

There's a mismatch between the model's config.json and tokenizer_config.json in the chat template and eos_token.
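
A quick way to see this without generating anything (a minimal sketch; the model_id below is a stand-in for this repo's id, not something from the thread):

from transformers import AutoConfig, AutoTokenizer

model_id = "01-ai/Yi-34B-Chat"  # stand-in: replace with this repo's model id
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# eos token id declared in config.json, mapped back to its token string
print(config.eos_token_id, tokenizer.convert_ids_to_tokens(config.eos_token_id))
# eos token declared in tokenizer_config.json
print(tokenizer.eos_token, tokenizer.eos_token_id)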

My guess is that @Starlento's example returns inconsistent responses even though eos_token_id=tokenizer.eos_token_id is passed explicitly. That should work, so the behaviour points to the misconfiguration between config.json and tokenizer_config.json:

output_ids = text_model.generate(input_ids.to('cuda'), eos_token_id=tokenizer.eos_token_id, max_length=256)
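
If it helps, recent transformers versions let generate() take a list of eos token ids, so a possible workaround (just a sketch, not something confirmed by the maintainers) is to stop on either token regardless of which file is picked up:

# stop on either <|im_end|> or <|endoftext|>
stop_ids = tokenizer.convert_tokens_to_ids(["<|im_end|>", "<|endoftext|>"])
output_ids = text_model.generate(input_ids.to('cuda'), eos_token_id=stop_ids, max_length=256)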

Okay, @nazimali, I understand what you're saying. However, if you try replacing <|im_end|> with <|endoftext|>, you'll encounter an error. It might be better to keep it as it is for now.
