Update tokenizer
#11, opened by RaymondLi
Add the <|endoftext|> and the FIM special tokens to the tokenizer.
Users who were adding these tokens themselves after loading the tokenizer, in order to do infilling, will need to remove that step from their code.
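For anyone who wants their code to work both before and after this change, here is a minimal sketch of a guard that only adds tokens the tokenizer does not already have. The FIM token strings below are an assumption (this PR does not spell them out), and `tokens_still_missing` is a hypothetical helper name, so check the tokenizer's own special-token list before relying on it.

```python
# Assumed token strings; the PR does not list them, so verify against
# the tokenizer's special tokens before use.
EOS_TOKEN = "<|endoftext|>"
FIM_TOKENS = ["<fim-prefix>", "<fim-middle>", "<fim-suffix>", "<fim-pad>"]


def tokens_still_missing(vocab, wanted=(EOS_TOKEN, *FIM_TOKENS)):
    """Return the wanted tokens that are not already in `vocab`, in order.

    `vocab` is any mapping or set of token strings, for example the dict
    returned by a Hugging Face tokenizer's `get_vocab()`.
    """
    return [token for token in wanted if token not in vocab]


# Possible use with a transformers tokenizer (not imported here):
#
#   missing = tokens_still_missing(tokenizer.get_vocab())
#   if missing:
#       tokenizer.add_special_tokens({"additional_special_tokens": missing})
#
# With the updated tokenizer, `missing` should come back empty, so the
# add step is skipped.
```

Once everyone is on the updated tokenizer, the simpler fix is to delete the token-adding step entirely, as described above.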
RaymondLi changed pull request status to open
Thanks for adding this! We should update the SantaCoder demo once this is merged, and maybe announce it in Slack.
loubnabnl changed pull request status to merged