How to split a model #6
by nib12345 · opened
Hi guys,
Does anyone have an idea how to split
mpt-30b-chat.ggmlv0.q4_1.bin
into
mpt-30b-chat.ggmlv0.q4_1_00001_of_00004.bin
mpt-30b-chat.ggmlv0.q4_1_00002_of_00004.bin
mpt-30b-chat.ggmlv0.q4_1_00003_of_00004.bin
mpt-30b-chat.ggmlv0.q4_1_00004_of_00004.bin
so that it can be loaded on Kaggle (Kaggle has a RAM limit)?
If anyone has an idea, please share.
I don't know much about model development; I'm just a full-stack developer.
Thanks.
That's not possible: GGML does not support loading multi-part model files.
Using KoboldCpp you can offload some of the model layers to a GPU (if you have one), which reduces RAM usage accordingly.
However, there is currently no GPU support for MPT GGML models from Python code — only through the KoboldCpp UI.
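For what it's worth, if the goal is only to get the file onto Kaggle in smaller pieces (e.g. around upload size limits), a plain byte-level split works — but you must reassemble the file on disk before loading it, and this does not reduce RAM at load time, since GGML still has to load the whole file. A minimal sketch (the function names and the `_NNNNN_of_NNNNN` naming are mine, borrowed from the pattern in the question):

```python
import os

def split_file(path, n_parts):
    """Split a binary file into n_parts chunks named like
    <path>_00001_of_00004.bin (hypothetical naming scheme)."""
    size = os.path.getsize(path)
    chunk = (size + n_parts - 1) // n_parts  # ceiling division
    parts = []
    with open(path, "rb") as f:
        for i in range(1, n_parts + 1):
            part = f"{path}_{i:05d}_of_{n_parts:05d}.bin"
            with open(part, "wb") as out:
                out.write(f.read(chunk))  # last chunk may be shorter
            parts.append(part)
    return parts

def join_files(parts, out_path):
    """Concatenate the chunks back into a single file, in order."""
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                out.write(f.read())
```

On a Kaggle notebook you would run `join_files(...)` (or simply `cat part_* > model.bin` in a shell cell) before pointing the loader at the reassembled file.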
nib12345 changed discussion status to closed