What does the tokenization for fill-in-the-middle requests look like?
#5 · by XeIaso · opened
I'm looking at messing around with the fill-in-the-middle support for Codestral, but I can't figure out how to use it. I see that there's a FIMRequest class, but I want to know which tokens I should use with llama.cpp.
Thanks for making these models! They're a lot of fun to use personally and professionally.
Based on mistralai/mistral-common/tokens/tokenizers/sentencepiece.py#L335 and mistralai/mistral-common/tokens/tokenizers/base.py#L10, the prompt should look like

`<s>[SUFFIX]suffix_code[PREFIX]prefix_code`

with the EOS token `</s>` as the stopping condition. However, I also see a [MIDDLE] token which isn't used; maybe I'm forgetting something?
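For reference, here's a minimal sketch of a FIM request against llama.cpp's built-in server, assuming it's running locally on the default port and parses `[SUFFIX]`/`[PREFIX]` in the prompt string as special tokens (recent builds do this for `/completion`). The prefix/suffix values are made up, and `<s>` is left out because llama.cpp typically prepends BOS itself:

```python
import requests

# Hypothetical prefix/suffix; substitute your own code context.
prefix_code = "def fib(n):\n    "
suffix_code = "\n    return fib(n - 1) + fib(n - 2)"

# Codestral FIM layout: suffix first, then prefix; the model generates
# the middle directly after the prefix. llama.cpp usually adds the BOS
# token <s> on its own, so it is omitted from the string here.
prompt = f"[SUFFIX]{suffix_code}[PREFIX]{prefix_code}"

resp = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": prompt,
        "n_predict": 128,
        # llama.cpp stops at the model's EOS token on its own; the
        # explicit stop string is just a belt-and-braces guard.
        "stop": ["</s>"],
    },
)
print(resp.json()["content"])
```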
My expectation is that [MIDDLE] is used as the last token of the prompt, right before the generated response.
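One way to check is to encode a FIMRequest with mistral-common and inspect the rendered prompt. A sketch, assuming Codestral uses the v3 tokenizer and that the import paths below match your mistral-common version; the prefix/suffix are made-up examples:

```python
from mistral_common.protocol.fim.request import FIMRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

# Load the v3 tokenizer shipped with mistral-common.
tokenizer = MistralTokenizer.v3()

request = FIMRequest(
    prompt="def fib(n):\n    ",                     # the prefix
    suffix="\n    return fib(n - 1) + fib(n - 2)",  # the suffix
)

tokenized = tokenizer.encode_fim(request)

# .text shows the rendered prompt, e.g. "<s>[SUFFIX]...[PREFIX]...";
# check whether [MIDDLE] appears anywhere in it.
print(tokenized.text)
print("[MIDDLE]" in tokenized.text)
```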