---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- granite-3.0
- llama-cpp
- gguf-my-repo
base_model: ibm-granite/granite-3.0-8b-instruct
---
# eformat/granite-3.0-8b-instruct-Q4_K_M-GGUF
As of 25/10/2024, not all tools (vllm, llama.cpp) support the new Granite params in `config.json`:
```json
{
  "model_type": "granite",
  "architectures": [
    "GraniteForCausalLM"
  ]
}
```
This GGUF conversion was done using the old llama identifiers in `config.json` instead:
```json
{
  "model_type": "llama",
  "architectures": [
    "LlamaForCausalLM"
  ]
}
```
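A minimal sketch of that swap and the conversion, assuming `jq` and a local llama.cpp checkout (local paths are placeholders; script and binary names are as in recent llama.cpp builds):
```bash
# patch config.json to the older llama identifiers before converting
jq '.model_type = "llama" | .architectures = ["LlamaForCausalLM"]' \
  granite-3.0-8b-instruct/config.json > /tmp/config.json \
  && mv /tmp/config.json granite-3.0-8b-instruct/config.json

# convert to an f16 GGUF, then quantize to Q4_K_M
python convert_hf_to_gguf.py granite-3.0-8b-instruct \
  --outtype f16 --outfile granite-3.0-8b-instruct-f16.gguf
./llama-quantize granite-3.0-8b-instruct-f16.gguf \
  granite-3.0-8b-instruct-Q4_K_M.gguf Q4_K_M
```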
This GGUF loads OK; tested using:
```bash
# llama.cpp
./llama-server --verbose --gpu-layers 99999 --parallel 2 --ctx-size 4096 -m ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
```
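A quick smoke test against llama-server's OpenAI-compatible endpoint (default port 8080; adjust if you pass `--port`):
```bash
# send a single chat completion request to the running llama-server
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Say hello."}], "max_tokens": 32}'
```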
```bash
# vllm
vllm serve ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
```
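If vllm has trouble with the tokenizer embedded in the GGUF, passing the base model's tokenizer explicitly may help (a hedged suggestion; vLLM's GGUF support is experimental):
```bash
# assumption: vLLM docs recommend the base model's HF tokenizer for GGUF checkpoints
vllm serve ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf \
  --tokenizer ibm-granite/granite-3.0-8b-instruct
```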