[merged experimental] Potential Linux support.
OSError: [Errno 8] Exec format error: './bin/quantize.exe'
When running on Linux.
Simple Python script (gguf-imat.py - I recommend using the specific "for-FP16" or "for-BF16" scripts) to generate various GGUF-IQ-Imatrix quantizations from a Hugging Face author/model input (...)
(...) for Windows and NVIDIA hardware.
This script currently only supports Windows 10/11; the .exe is the Windows executable (binary), which is fetched from the latest llama.cpp release, specifically from the llama-b3145-bin-win-cuda-cu12.2.0-x64.zip file.
Considering llama.cpp now provides Linux binaries under the name llama-b3145-bin-ubuntu-x64.zip in their new releases, this could be adapted/ported by someone who uses Linux and is interested in doing so.
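
As a rough illustration of that porting idea (not the script's actual code), the download step could pick the release asset by platform via platform.system(); the asset names are the ones mentioned in this thread, and fetch_llama_release is a hypothetical download helper:

```python
import platform

# llama.cpp release asset per platform (names taken from this thread)
ASSETS = {
    "Windows": "llama-b3145-bin-win-cuda-cu12.2.0-x64.zip",
    "Linux": "llama-b3145-bin-ubuntu-x64.zip",
}

system = platform.system()  # "Windows", "Linux", or "Darwin"
if system not in ASSETS:
    raise RuntimeError(f"No known llama.cpp release asset for {system}")

# fetch_llama_release(ASSETS[system])  # hypothetical: download and unpack into ./bin
```

Note the Ubuntu asset may not include CUDA support, which ties into the concern below.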
I'm not sure how/if GPU/NVIDIA CUDA would work out of the box with this; that's what I use for the imatrix generation.
I have added support for Linux; however, you will need to compile locally. Should I add a duplicate .py or merge the changes into the existing file?
I didn't do much, just added a check with platform.system() and some if and elif statements with updated commands for Linux.
I just made it a separate file, since I don't know if I broke anything on Windows.
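
For context, a minimal sketch of the kind of platform.system() check described here; the binary paths and quantize arguments are illustrative, not taken from the actual commit:

```python
import platform
import subprocess

system = platform.system()
if system == "Windows":
    quantize_bin = "./bin/quantize.exe"  # prebuilt Windows binary from the release zip
elif system == "Linux":
    quantize_bin = "./bin/quantize"      # locally compiled binary, no .exe suffix
else:
    raise RuntimeError(f"Unsupported platform: {system}")

# llama.cpp quantize usage: <input GGUF> <output GGUF> <quant type>
subprocess.run(
    [quantize_bin, "model-F16.gguf", "model-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)
```

Keeping this in a separate file avoids touching the working Windows path, at the cost of maintaining two copies of the rest of the script.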
@Virt-io - Thanks for the commit!