How to convert that to ggml?
Hello, this model looks great, thanks. I was wondering how you converted the original model to ggml. I am trying to use QLoRA to fine-tune a model based on it and run it on my Mac. Is this possible? Thanks
You have to merge the base model with the fine-tuned LoRA version; then, based on that merged model, you can convert it to ggml format.
Thank you for the useful description of merging the base model with the fine-tuned LoRA version and converting it to ggml format. I have solved the problem.
Sorry, I was just in a rush, so the answer wasn't really detailed at all.
To merge and unload:
from peft import PeftModel

model = PeftModel.from_pretrained(base_model, args.peft_model_path, **device_arg)
model = model.merge_and_unload()
where base_model is the original model loaded via transformers and args.peft_model_path is the directory containing your LoRA-tuned weights.
Then, you have to save it (together with the tokenizer):
model.save_pretrained(f"{args.output_dir}")
tokenizer.save_pretrained(f"{args.output_dir}")
and you are ready to convert it to ggml.
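For context, here is a minimal end-to-end sketch of the merge step; the model path, LoRA path, and output directory are placeholders I made up, so adjust them to your setup:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# load the original (base) model and its tokenizer
base_model = AutoModelForCausalLM.from_pretrained("path/to/base-model", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("path/to/base-model")

# apply the LoRA adapter, then fold its weights into the base model
model = PeftModel.from_pretrained(base_model, "path/to/lora-weights")
model = model.merge_and_unload()

# save the merged model together with the tokenizer
model.save_pretrained("merged-model")
tokenizer.save_pretrained("merged-model")

After this, the merged-model directory holds a plain (non-PEFT) checkpoint that the llama.cpp conversion scripts can read.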
To convert it, I am using llama.cpp:
https://github.com/ggerganov/llama.cpp
In the simplest case, you can use lora_to_ggml.py with properly defined arguments:
python lora_to_ggml.py -m cached_model_path -l llama.cpp_path -i output_path
Thank you. I basically use the same method: I use convert-pth-to-ggml.py to convert the model to ggml-model-f32.bin and then quantize it, basically following the instructions provided in llama.cpp. Thanks.
Using Colab
Convert the model to ggml-model-f32.bin:
!python3 convert-pth-to-ggml.py models/model-path/ 0
Quantize the model:
!./quantize ./models/model-path/ggml-model-f32.bin ./models/model-path/ggml-model-q4_0.bin q4_0
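To sanity-check the quantized file in the same session, you can run inference on it with llama.cpp's main binary; the prompt and token count below are arbitrary examples:

!./main -m ./models/model-path/ggml-model-q4_0.bin -p "Hello, my name is" -n 64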
That's great to hear! Cool!