TheBloke/deepseek-coder-6.7B-base-GGUF · infill example crashes llama.cpp

I can't seem to get infill to work. The prompt is from the infill example in the original readme.

./main -m models/deepseek-coder-6.7b-base.Q8_0.gguf --verbose-prompt -p '<｜fim▁begin｜>def quick_sort(arr):        
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = []
    right = []
<｜fim▁hole｜>
        if arr[i] < pivot:
            left.append(arr[i])
        else:
            right.append(arr[i])
    return quick_sort(left) + [pivot] + quick_sort(right)<｜fim▁end｜>'

This gives:

[omitted]
system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | 
libc++abi: terminating due to uncaught exception of type std::out_of_range: unordered_map::at: key not found
[1]    4796 abort      ./main -m models/deepseek-coder-6.7b-base.Q8_0.gguf --verbose-prompt -p

It aborts before the verbose prompt is printed, so I guess something goes wrong during tokenization. The same command works without the three fim_begin/hole/end tokens.
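
For anyone trying to reproduce this, here is a minimal Python sketch of how the prompt above is put together, assuming the layout is simply begin token, code before the gap, hole token, code after the gap, end token, as in the command I ran. The helper names `build_fim_prompt` and `non_ascii_chars` are made up for illustration and are not part of llama.cpp or the model's tooling. The second helper only lists the non-ASCII characters in a string, which is one way to check that the shell or a copy/paste step has not replaced the special characters inside the tokens with ASCII look-alikes. It does not explain the crash itself.

```python
import unicodedata

# Token strings copied from the prompt in this post.
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in the three FIM tokens."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"


def non_ascii_chars(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per distinct non-ASCII character."""
    found = sorted({ch for ch in text if ord(ch) > 127})
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in found]


if __name__ == "__main__":
    prefix = "def quick_sort(arr):\n    if len(arr) <= 1:\n        return arr\n"
    suffix = "\n    return quick_sort(left) + [pivot] + quick_sort(right)"
    prompt = build_fim_prompt(prefix, suffix)
    print(prompt)
    # Show which non-ASCII characters the tokens actually contain.
    for entry in non_ascii_chars(FIM_BEGIN + FIM_HOLE + FIM_END):
        print(entry)
```

If the list printed for a prompt taken from the terminal differs from the list for the constants above, the tokens were altered somewhere before reaching `./main`.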