|
--- |
|
pipeline_tag: text-generation |
|
inference: false |
|
license: apache-2.0 |
|
library_name: transformers |
|
tags: |
|
- language |
|
- granite-3.0 |
|
- llama-cpp |
|
- gguf-my-repo |
|
base_model: ibm-granite/granite-3.0-8b-instruct |
|
|
|
--- |
|
|
|
# eformat/granite-3.0-8b-instruct-Q4_K_M-GGUF |
|
|
|
As of 25/10/2024, not all tools (vLLM, llama.cpp) support the new Granite model config params:
|
|
|
```json
# config.json
"model_type": "granite",
"architectures": [
  "GraniteForCausalLM"
]
```
|
|
|
This GGUF conversion was done using the older `llama` config values instead:
|
|
|
```json
# config.json
"model_type": "llama",
"architectures": [
  "LlamaForCausalLM"
]
```
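For reference, here is a minimal sketch of how such a conversion could be reproduced with llama.cpp's tooling. It assumes a local checkout of llama.cpp and a locally downloaded copy of the model; the paths, and the idea of patching `config.json` in place with `jq`, are assumptions, not the exact commands used for this repo:

```bash
# Illustrative only: patch a local copy of the model config to the
# older llama values (paths are assumptions).
jq '.model_type = "llama" | .architectures = ["LlamaForCausalLM"]' \
  granite-3.0-8b-instruct/config.json > /tmp/config.json \
  && mv /tmp/config.json granite-3.0-8b-instruct/config.json

# Convert the HF weights to an f16 GGUF, then quantize to Q4_K_M.
python convert_hf_to_gguf.py granite-3.0-8b-instruct \
  --outtype f16 --outfile granite-3.0-8b-instruct-f16.gguf
./llama-quantize granite-3.0-8b-instruct-f16.gguf \
  granite-3.0-8b-instruct-Q4_K_M.gguf Q4_K_M
```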
|
|
|
This GGUF loads OK; tested using:
|
|
|
```bash
# llama.cpp
./llama-server --verbose --gpu-layers 99999 --parallel 2 --ctx-size 4096 \
  -m ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
```
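`llama-server` exposes an OpenAI-compatible HTTP API (port 8080 by default), so a quick smoke test against the running server might look like this (the prompt is arbitrary):

```bash
# Query the running llama-server via its OpenAI-compatible endpoint
# (default port 8080).
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Say hello."}]}'
```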
|
|
|
```bash
# vllm
vllm serve ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
```
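vLLM likewise serves an OpenAI-compatible API (port 8000 by default). When serving a GGUF file directly, the model id reported by the server should be the path passed to `vllm serve`; `/v1/models` will confirm it. A possible smoke test, assuming the paths above:

```bash
# List the served model id (vLLM defaults to port 8000).
curl -s http://localhost:8000/v1/models

# Request a completion using that id (the GGUF path, shown expanded).
curl -s http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d "{\"model\": \"$HOME/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf\", \"prompt\": \"Say hello.\", \"max_tokens\": 32}"
```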
|
|