---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- AutoGPTQ
- 4bit
- GPTQ
---
Model created using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) on a [GPT-2](https://huggingface.co/gpt2) model with 4-bit quantization.
You can load this model with the AutoGPTQ library, which can be installed with the following command:
```
pip install auto-gptq
```
You can then download the model from the Hub using the following code:
```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name = "mlabonne/gpt2-GPTQ-4bit"

# Load the tokenizer and the quantization config stored in the repo
tokenizer = AutoTokenizer.from_pretrained(model_name)
quantize_config = BaseQuantizeConfig.from_pretrained(model_name)

# Load the 4-bit quantized weights onto the GPU
model = AutoGPTQForCausalLM.from_quantized(model_name,
                                           model_basename="gptq_model-4bit-128g",
                                           device="cuda:0",
                                           use_triton=True,
                                           use_safetensors=True,
                                           quantize_config=quantize_config)
```
This model works with the standard [Text Generation pipeline](https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.TextGenerationPipeline).
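As a minimal sketch of that pipeline, the snippet below generates text with the base `gpt2` checkpoint so it runs on CPU without the quantization stack; the quantized `model` and `tokenizer` loaded above plug into the same API via `pipeline("text-generation", model=model, tokenizer=tokenizer)`.

```python
from transformers import pipeline

# Sketch with the base gpt2 checkpoint (CPU-friendly); swap in the
# quantized model/tokenizer from the loading snippet for GPU inference.
generator = pipeline("text-generation", model="gpt2")

# Greedy decoding for a deterministic continuation of the prompt
result = generator("GPT-2 is a language model that",
                   max_new_tokens=20,
                   do_sample=False)
print(result[0]["generated_text"])
```

The pipeline returns a list of dictionaries, one per generated sequence, each with a `generated_text` key containing the prompt followed by the continuation.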