---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- granite-3.0
- llama-cpp
- gguf-my-repo
base_model: ibm-granite/granite-3.0-3b-a800m-instruct
model-index:
- name: granite-3.0-3b-a800m-instruct
  results:
  - task:
      type: text-generation
    dataset:
      name: IFEval
      type: instruction-following
    metrics:
    - type: pass@1
      value: 42.49
      name: pass@1
    - type: pass@1
      value: 7.02
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: AGI-Eval
      type: human-exams
    metrics:
    - type: pass@1
      value: 25.7
      name: pass@1
    - type: pass@1
      value: 50.16
      name: pass@1
    - type: pass@1
      value: 20.51
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: OBQA
      type: commonsense
    metrics:
    - type: pass@1
      value: 40.8
      name: pass@1
    - type: pass@1
      value: 59.95
      name: pass@1
    - type: pass@1
      value: 71.86
      name: pass@1
    - type: pass@1
      value: 67.01
      name: pass@1
    - type: pass@1
      value: 48.0
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: BoolQ
      type: reading-comprehension
    metrics:
    - type: pass@1
      value: 78.65
      name: pass@1
    - type: pass@1
      value: 6.71
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: ARC-C
      type: reasoning
    metrics:
    - type: pass@1
      value: 50.94
      name: pass@1
    - type: pass@1
      value: 26.85
      name: pass@1
    - type: pass@1
      value: 37.7
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: HumanEvalSynthesis
      type: code
    metrics:
    - type: pass@1
      value: 39.63
      name: pass@1
    - type: pass@1
      value: 40.85
      name: pass@1
    - type: pass@1
      value: 35.98
      name: pass@1
    - type: pass@1
      value: 27.4
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: GSM8K
      type: math
    metrics:
    - type: pass@1
      value: 47.54
      name: pass@1
    - type: pass@1
      value: 19.86
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: PAWS-X (7 langs)
      type: multilingual
    metrics:
    - type: pass@1
      value: 50.23
      name: pass@1
    - type: pass@1
      value: 28.87
      name: pass@1
---

# davelsphere/granite-3.0-3b-a800m-instruct-Q4_K_M-GGUF
This model was converted to GGUF format from [`ibm-granite/granite-3.0-3b-a800m-instruct`](https://huggingface.co/ibm-granite/granite-3.0-3b-a800m-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/ibm-granite/granite-3.0-3b-a800m-instruct) for more details on the model.
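
If you only need the quantized weights file, you can also pull it directly with the Hugging Face CLI. A minimal sketch; it assumes `huggingface_hub` is installed and downloads into the current directory:

```bash
# Download only the Q4_K_M GGUF file from this repo into the current directory
pip install -U "huggingface_hub[cli]"
huggingface-cli download davelsphere/granite-3.0-3b-a800m-instruct-Q4_K_M-GGUF \
  granite-3.0-3b-a800m-instruct-q4_k_m.gguf --local-dir .
```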

## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux).

```bash
brew install llama.cpp
```
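To confirm the binaries are on your PATH before going further, you can print the build info (a quick smoke test; assumes your llama.cpp build supports the `--version` flag, as recent releases do):

```bash
# Print llama.cpp build/version info to verify the install
llama-cli --version
```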
Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo davelsphere/granite-3.0-3b-a800m-instruct-Q4_K_M-GGUF --hf-file granite-3.0-3b-a800m-instruct-q4_k_m.gguf -p "The meaning of life and the universe is"
```
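For an interactive chat instead of a one-shot completion, the same binary can run in conversation mode. A sketch, assuming your llama.cpp build includes the standard `-cnv` and `-n` flags:

```bash
# Start an interactive chat session, capping each response at 256 tokens
llama-cli --hf-repo davelsphere/granite-3.0-3b-a800m-instruct-Q4_K_M-GGUF \
  --hf-file granite-3.0-3b-a800m-instruct-q4_k_m.gguf -cnv -n 256
```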

### Server:
```bash
llama-server --hf-repo davelsphere/granite-3.0-3b-a800m-instruct-Q4_K_M-GGUF --hf-file granite-3.0-3b-a800m-instruct-q4_k_m.gguf -c 2048
```
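Once the server is up (it listens on port 8080 by default), you can query its OpenAI-compatible chat endpoint. A minimal sketch, assuming the default host and port:

```bash
# Send a chat completion request to the local llama.cpp server
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
        "max_tokens": 64
      }'
```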

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with any other hardware-specific flags (for example, `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
```bash
cd llama.cpp && LLAMA_CURL=1 make
```
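For instance, a CUDA-enabled build on a Linux machine with an Nvidia GPU would combine the flags like this (a sketch; pick the flags that match your hardware):

```bash
# Build with CURL support and CUDA offload enabled
cd llama.cpp && LLAMA_CURL=1 LLAMA_CUDA=1 make
```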

Step 3: Run inference through the main binary.
```bash
./llama-cli --hf-repo davelsphere/granite-3.0-3b-a800m-instruct-Q4_K_M-GGUF --hf-file granite-3.0-3b-a800m-instruct-q4_k_m.gguf -p "The meaning of life and the universe is"
```
or
```bash
./llama-server --hf-repo davelsphere/granite-3.0-3b-a800m-instruct-Q4_K_M-GGUF --hf-file granite-3.0-3b-a800m-instruct-q4_k_m.gguf -c 2048
```