tokenizer / tokenizer_config.json
thomasw21
Add tokenizer
8943cc8
479 Bytes
{"unk_token": "<unk>", "eos_token": "</s>", "bos_token": "<s>", "pad_token": "<pad>", "special_tokens_map_file": "/Users/thomas/.cache/huggingface/transformers/9b8b2f4cb97dda0753c9b7213ca10bae9674703a4c64f786341b96a260d44985.9d6cd81ef646692fb1c169a880161ea1cb95f49694f220aced9b704b457e51dd", "name_or_path": "bigscience-catalogue-data-dev/byte-level-bpe-tokenizer-no-norm-250k-whitespace-and-eos-regex-alpha-v3-dedup-lines-articles", "tokenizer_class": "PreTrainedTokenizerFast"}
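The config above can be inspected with nothing more than a JSON parser. The following is a minimal sketch, assuming only Python's standard `json` module; the names `CONFIG_TEXT`, `config`, and `special_tokens` are illustrative and do not come from the repository. It embeds the file contents verbatim instead of reading from disk so that it runs on its own.

```python
import json

# Contents of tokenizer_config.json, copied verbatim from the file above.
# CONFIG_TEXT is a hypothetical name used only for this sketch.
CONFIG_TEXT = r'''{"unk_token": "<unk>", "eos_token": "</s>", "bos_token": "<s>", "pad_token": "<pad>", "special_tokens_map_file": "/Users/thomas/.cache/huggingface/transformers/9b8b2f4cb97dda0753c9b7213ca10bae9674703a4c64f786341b96a260d44985.9d6cd81ef646692fb1c169a880161ea1cb95f49694f220aced9b704b457e51dd", "name_or_path": "bigscience-catalogue-data-dev/byte-level-bpe-tokenizer-no-norm-250k-whitespace-and-eos-regex-alpha-v3-dedup-lines-articles", "tokenizer_class": "PreTrainedTokenizerFast"}'''

config = json.loads(CONFIG_TEXT)

# Keep only the entries whose key ends in "_token": the four special tokens.
special_tokens = {key: value for key, value in config.items() if key.endswith("_token")}

for name, token in special_tokens.items():
    print(f"{name}: {token}")

print("tokenizer_class:", config["tokenizer_class"])
```

Note that `special_tokens_map_file` records an absolute path on the uploader's machine, so it is unlikely to resolve anywhere else; the sketch only parses it as a string and does not try to open it.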