[AUTO] CVST Tokenizer Badger
#48 by pandora-s - opened
A scripted PR to update the status of the transformer tokenizer.
> [!IMPORTANT]
> We recommend using `mistral_common` with the tokenizer v3 for most of the tokenization process, as it closely follows our internal tokenization.
Can we maybe rephrase to: We recommend using `mistral_common` for tokenization, as the transformers tokenizer has not been tested by the Mistral team and might give incorrect results.
pandora-s changed pull request status to closed