kaitchup/Llama-2-7b-gptq-2bit

	---
	license: apache-2.0
	language:
	- en
	---
	# Model Card for Llama-2-7b-gptq-2bit

	This is Meta's Llama 2 7B quantized to 2-bit precision with GPTQ, using the AutoGPTQ integration in Hugging Face Transformers.

	## Model Details

	### Model Description

	- Developed by: [The Kaitchup](https://kaitchup.substack.com/)
	- Model type: Causal language model (Llama 2)
	- Language(s) (NLP): English
	- License: [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0), [Llama 2 license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)

	### Model Sources

	The method and code used to quantize the model are explained here:
	[Quantize and Fine-tune LLMs with GPTQ Using Transformers and TRL](https://kaitchup.substack.com/p/quantize-and-fine-tune-llms-with)
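
The linked article has the exact procedure. As a rough sketch of what 2-bit GPTQ quantization through Transformers can look like, assuming the `GPTQConfig` route with the `optimum` and `auto-gptq` packages installed: the base checkpoint name `meta-llama/Llama-2-7b-hf`, the `c4` calibration dataset, the group size of 128, and the helper names `gptq_settings` and `quantize` are illustrative assumptions, not settings confirmed by this card.

```python
# Bit widths that GPTQConfig documents as supported.
SUPPORTED_BITS = (2, 3, 4, 8)


def gptq_settings(bits=2, group_size=128, dataset="c4"):
    """Collect keyword arguments for transformers.GPTQConfig.

    group_size and dataset defaults are assumptions chosen for illustration.
    """
    if bits not in SUPPORTED_BITS:
        raise ValueError(f"bits must be one of {SUPPORTED_BITS}, got {bits}")
    return {"bits": bits, "group_size": group_size, "dataset": dataset}


def quantize(base_model="meta-llama/Llama-2-7b-hf",
             output_dir="Llama-2-7b-gptq-2bit"):
    """Quantize the base model to 2-bit with GPTQ and save it (needs a GPU)."""
    # Imported lazily so gptq_settings() works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    config = GPTQConfig(tokenizer=tokenizer, **gptq_settings(bits=2))
    # Passing a GPTQConfig makes from_pretrained calibrate and quantize.
    model = AutoModelForCausalLM.from_pretrained(
        base_model, quantization_config=config, device_map="auto"
    )
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
    return model
```

Calling `quantize()` downloads the base checkpoint, which requires accepting Meta's license for the gated repository, and runs calibration on the GPU.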

	## Uses

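	This is the pre-trained base model; it has not been fine-tuned (for instance, for chat or instruction following). You can fine-tune it with PEFT by training adapters on top of the frozen quantized weights.
	Note that 2-bit quantization significantly degrades the accuracy of Llama 2 compared with the unquantized model.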
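
As a minimal sketch of the adapter route, assuming the `peft`, `optimum`, and `auto-gptq` packages are installed and a GPU is available: the LoRA hyperparameters, the target modules, and the helper names `lora_settings` and `load_with_adapter` are illustrative assumptions, not values recommended by this card.

```python
def lora_settings(r=16, alpha=16, dropout=0.05):
    """Keyword arguments for peft.LoraConfig (illustrative values only)."""
    return {
        "r": r,
        "lora_alpha": alpha,
        "lora_dropout": dropout,
        # Attention projections in the Llama architecture.
        "target_modules": ["q_proj", "v_proj"],
        "bias": "none",
        "task_type": "CAUSAL_LM",
    }


def load_with_adapter(model_id="kaitchup/Llama-2-7b-gptq-2bit"):
    """Load the quantized model and wrap it with a trainable LoRA adapter."""
    # Imported lazily so lora_settings() works without these packages.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    # Freeze the quantized base weights and prepare the model for training.
    model = prepare_model_for_kbit_training(model)
    # Only the adapter weights added here are trainable.
    model = get_peft_model(model, LoraConfig(**lora_settings()))
    model.print_trainable_parameters()
    return model
```

The returned model can then be passed to a trainer of your choice; only the adapter parameters are updated, while the 2-bit weights stay fixed.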


	## Other versions

	- [kaitchup/Llama-2-7b-gptq-4bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-4bit)
	- [kaitchup/Llama-2-7b-gptq-3bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-3bit)




	## Model Card Contact

	[The Kaitchup](https://kaitchup.substack.com/)