Hello. Thanks for your model. I tried to make a ggml version but encountered this error:
Loading model file f3\pytorch_model.bin
vocabtype: spm
Loading vocab file f3\tokenizer.model
params: n_vocab:64256 n_embd:3200 n_mult:4320 n_head:32 n_layer:26
Traceback (most recent call last):
File "C:\kcp\convert.py", line 1325, in
main()
File "C:\kcp\convert.py", line 1320, in main
OutputFile.write_all(outfile, params, output_type, model, vocab)
File "C:\kcp\convert.py", line 1103, in write_all
check_vocab_size(params, vocab)
File "C:\kcp\convert.py", line 1058, in check_vocab_size
raise Exception(msg)
Exception: Vocab size mismatch (model has 64256, but f3\tokenizer.model has 32000)
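For context, the check that fails appears to compare the vocab size stored in the model weights against the token count in tokenizer.model. A simplified sketch of the logic (names approximate, not the exact llama.cpp source):

    # Simplified sketch of the failing check in convert.py (approximate,
    # not the exact llama.cpp source). params.n_vocab comes from the model
    # weights (64256 here); vocab.vocab_size comes from tokenizer.model
    # (32000 in this case).
    def check_vocab_size(params, vocab):
        if params.n_vocab != vocab.vocab_size:
            msg = (f"Vocab size mismatch (model has {params.n_vocab}, "
                   f"but {vocab.fname_tokenizer} has {vocab.vocab_size})")
            raise Exception(msg)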
I'd like to use this model too. Too bad the README file is empty, so there are no instructions.
Sorry for the delayed answer. The repository previously contained an old, incorrect tokenizer with a vocab size mismatch. Can you try the ggml conversion again?
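If it helps, you can sanity-check the updated tokenizer before reconverting. A minimal sketch, assuming the sentencepiece Python package is installed, with paths taken from your log:

    # Quick sanity check that the updated tokenizer matches the model's
    # vocab size. Paths are taken from the conversion log above.
    import sentencepiece as spm

    sp = spm.SentencePieceProcessor(model_file=r"f3\tokenizer.model")
    print(sp.vocab_size())  # should now print 64256, matching n_vocab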
The README file is still empty because these new llama-finnish models have not yet been officially released. An official release with proper README files is coming in a couple of weeks.
Thanks for the cool model! Can you give a hint about the prompt template before the official documentation is added? Can't wait to test this one.
The README has now been added. As for the prompt template: there is none, because this is a pretrained base model without any fine-tuning. As a result, it is not good at following instructions the way ChatGPT does, and it should be fine-tuned to get better results.
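If you want to try it before fine-tuning, treat it as a plain completion model. A minimal sketch using the transformers library; the local path "f3" is taken from the log above and stands in for the actual repository id:

    # Minimal sketch of using the base model for plain text completion
    # (no prompt template). "f3" is a placeholder local path from the log
    # above; substitute the real repository id once it is released.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("f3")
    model = AutoModelForCausalLM.from_pretrained("f3")

    # A base model simply continues text, so prompt it with the start
    # of a passage rather than an instruction.
    inputs = tokenizer("Suomen pääkaupunki on", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))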