Do bloom models use <s> and </s> tokens?

#274
by abuelnasr - opened

The BLOOM tokenizer has bos_token = <s> and eos_token = </s>, but they are not actually used by the tokenizer to wrap the input (a small reproduction follows below).
https://huggingface.co/docs/transformers/model_doc/bloom#transformers.BloomTokenizerFast
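For context, here is a minimal sketch of what I mean (assuming the bigscience/bloom-560m checkpoint; any BLOOM variant should behave the same way):

```python
from transformers import AutoTokenizer

# Checkpoint chosen only for illustration; behavior is the same across BLOOM sizes.
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")

# The special tokens are declared on the tokenizer...
print(tokenizer.bos_token, tokenizer.eos_token)  # <s> </s>

# ...but encoding (even with the default add_special_tokens=True) does not add them.
ids = tokenizer("Hello world")["input_ids"]
print(tokenizer.bos_token_id in ids)  # False
print(tokenizer.eos_token_id in ids)  # False
print(tokenizer.decode(ids))          # "Hello world" -- no <s> or </s>
```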

Is this a bug, or does the BLOOM model simply not use these special tokens and not have used them during training? And if BLOOM doesn't use them, what is the purpose of having them in the tokenizer?
