Request: meta-llama/Meta-Llama-3-8B-Instruct
Model name: meta-llama/Meta-Llama-3-8B-Instruct
[Required] Model link: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
[Required] Brief description: Solely for testing purposes, I recently ran MMLU-Pro on mradermacher's Llama3-8B-Instruct imatrix and non-imatrix quants and found that the imatrix quants scored 5.7% lower on the test. I'd like another quant to compare against to make sure it isn't just an anomaly.
Discussion & Results: https://huggingface.co/mradermacher/Meta-Llama-3-8B-Instruct-i1-GGUF/discussions/1
[Required] An image/direct image link to represent the model (square shaped):
If I must 😶‍🌫️
[Optional] Additional quants (if you want any):
I only plan to use a Q5_K_M
Sure thing.
I'll upload the Q5_K_M as soon as it's ready for your testing.
Conversion:
- HF Model in BF16 =(convert_hf_to_gguf.py)=> BF16-GGUF
- HF Model in BF16 =(convert_hf_to_gguf.py)=> FP16-GGUF
- Generate imatrix.dat with llama-imatrix.exe from the FP16-GGUF (the source data is imatrix-with-rp-ex.txt)
- Quantize the BF16-GGUF with llama-quantize.exe and the imatrix.dat down to the other, smaller sizes
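As commands, the pipeline above would look roughly like this (file and directory names are placeholders, and flags follow llama.cpp's tools as I know them; a sketch rather than the exact invocations used):

```
# 1) Convert the HF model to BF16 and FP16 GGUF (paths are hypothetical)
python convert_hf_to_gguf.py Meta-Llama-3-8B-Instruct --outtype bf16 --outfile Meta-Llama-3-8B-Instruct-BF16.gguf
python convert_hf_to_gguf.py Meta-Llama-3-8B-Instruct --outtype f16 --outfile Meta-Llama-3-8B-Instruct-FP16.gguf

# 2) Generate the importance matrix from the FP16 GGUF
llama-imatrix.exe -m Meta-Llama-3-8B-Instruct-FP16.gguf -f imatrix-with-rp-ex.txt -o imatrix.dat

# 3) Quantize the BF16 GGUF with the imatrix (Q5_K_M shown; repeat for other sizes)
llama-quantize.exe --imatrix imatrix.dat Meta-Llama-3-8B-Instruct-BF16.gguf Meta-Llama-3-8B-Instruct-Q5_K_M.gguf Q5_K_M
```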
Update:
File is up.
Will add the Q4 quants as well in case you need to test those; the differences between non-imatrix and imatrix quants should be more pronounced there.