
Update tokenizer

#11
by RaymondLi - opened

Add the <|endoftext|> and the FIM special tokens to the tokenizer.

Users who were adding these tokens themselves after loading the tokenizer, in order to do infilling, will need to remove that step from their code.
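As a rough sketch of how one might check whether the manual step is still needed, the helper below reports which special tokens are absent from a vocabulary and builds an infilling prompt. The FIM token strings (`<fim-prefix>`, `<fim-middle>`, `<fim-suffix>`, `<fim-pad>`) are an assumption based on the SantaCoder model card and are not stated in this PR; the function names are hypothetical.

```python
# Assumed token strings (from the SantaCoder model card, not from this PR).
EOS = "<|endoftext|>"
FIM_PREFIX = "<fim-prefix>"
FIM_MIDDLE = "<fim-middle>"
FIM_SUFFIX = "<fim-suffix>"
FIM_PAD = "<fim-pad>"
SPECIAL_TOKENS = [EOS, FIM_PREFIX, FIM_MIDDLE, FIM_SUFFIX, FIM_PAD]


def missing_special_tokens(vocab):
    """Return the special tokens that are not keys of `vocab` (token -> id)."""
    return [tok for tok in SPECIAL_TOKENS if tok not in vocab]


def build_fim_prompt(prefix, suffix):
    """Assemble a prefix-suffix-middle infilling prompt from two code strings."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
```

With a loaded Transformers tokenizer, `missing_special_tokens(tokenizer.get_vocab())` returning an empty list would suggest that any earlier `tokenizer.add_special_tokens(...)` call for these tokens can be dropped.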

RaymondLi changed pull request status to open
BigCode org

Thanks for adding this! We should update the SantaCoder demo once this is merged, and maybe announce it in Slack.

loubnabnl changed pull request status to merged
