eformat committed
Commit: b87ae9a
1 Parent(s): 54ea565

Update README.md

Files changed (1)
  1. README.md +33 -1
README.md CHANGED
@@ -12,4 +12,36 @@ base_model: ibm-granite/granite-3.0-8b-instruct

---

# eformat/granite-3.0-8b-instruct-Q4_K_M-GGUF

Not all tools (vllm, llama.cpp) seem to support the new model config params yet (25/10/2024).

```json
# config.json
"model_type": "granite",
"architectures": [
  "GraniteForCausalLM"
]
```

This GGUF conversion was done using the old config values:

```json
# config.json
"model_type": "llama",
"architectures": [
  "LlamaForCausalLM"
]
```
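
For reference, a typical llama.cpp conversion flow looks roughly like the sketch below. The exact commands and local paths here are assumptions, not a record of this upload; the key step is editing config.json to the old llama values above before converting.

```bash
# Sketch only (assumed flow, not the exact commands used for this upload).
# 1. Edit config.json in the local HF checkout to "model_type": "llama" / "LlamaForCausalLM".
# 2. Convert the HF checkpoint to an f16 GGUF with llama.cpp's converter.
python convert_hf_to_gguf.py ./granite-3.0-8b-instruct \
  --outfile granite-3.0-8b-instruct-f16.gguf --outtype f16

# 3. Quantize the f16 GGUF down to Q4_K_M.
./llama-quantize granite-3.0-8b-instruct-f16.gguf granite-3.0-8b-instruct-Q4_K_M.gguf Q4_K_M
```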

This GGUF loads OK, tested using:

```bash
# llama.cpp
./llama-server --verbose --gpu-layers 99999 --parallel 2 --ctx-size 4096 -m ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
```
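
Once llama-server is up, a quick smoke test against its OpenAI-compatible chat endpoint (assuming the default 127.0.0.1:8080) looks like:

```bash
# Smoke test for llama-server (assumes default host/port 127.0.0.1:8080).
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64
      }'
```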

```bash
# vllm
vllm serve ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
```