---
language:
- en
library_name: transformers
license: cc-by-4.0
tags:
- auto-gptq
- AutoRound
extra_gated_prompt: This model is exclusively available to paid subscribers of [The
  Kaitchup](https://kaitchup.substack.com/). To gain access, [subscribe to The Kaitchup](https://kaitchup.substack.com/)
  for either a monthly or yearly paid plan. Once subscribed, you will receive an access
  token by email and will have access to all the models listed on [this page](https://kaitchup.substack.com/p/models).
---
## Model Details

This is [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) quantized to 4-bit with AutoRound and serialized in the GPTQ format. The model has been created, tested, and evaluated by The Kaitchup.
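For reference, a quantization of this kind can be sketched with Intel's `auto-round` library. This is a minimal illustration, not the exact script used to produce this checkpoint; the output directory name is a placeholder, and the `bits`/`group_size` values shown are common defaults rather than confirmed settings:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

base_model = "meta-llama/Meta-Llama-3-8B"

# Load the full-precision base model and its tokenizer.
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Configure AutoRound for 4-bit weight-only quantization.
# bits=4 and group_size=128 are typical settings, assumed here.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128)
autoround.quantize()

# Serialize the quantized weights in the GPTQ format.
# "./llama3-8b-autoround-gptq-4bit" is a placeholder output path.
autoround.save_quantized("./llama3-8b-autoround-gptq-4bit", format="auto_gptq")
```

The resulting directory can then be loaded like any other GPTQ checkpoint, as described in the article linked below.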
Details on the AutoRound quantization process and how to use the model are available here:

[Intel AutoRound: Accurate Low-bit Quantization for LLMs](https://kaitchup.substack.com/p/intel-autoround-accurate-low-bit)
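As a quick-start sketch, the 4-bit GPTQ checkpoint can be loaded with `transformers` like any other causal LM (GPTQ inference additionally requires a GPTQ backend such as `auto-gptq` plus `optimum` to be installed). The repository ID below is a placeholder — substitute this repository's actual ID and the access token you received by email:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository ID -- replace with this repository's actual ID.
model_id = "kaitchup/Meta-Llama-3-8B-autoround-gptq-4bit"
# Access token received by email after subscribing (placeholder).
hf_token = "YOUR_HF_TOKEN"

tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_token)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # dispatch the quantized layers to available GPUs
    token=hf_token,
)

prompt = "The best way to learn a new language is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```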
- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **Language(s) (NLP):** English
- **License:** cc-by-4.0