
Llama 2 7B quantized to 4-bit with AutoGPTQ v0.3.0.

  • Group size: 32
  • Data type: INT4

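A minimal loading sketch, assuming the `transformers` library (with `optimum` and `auto-gptq` installed) and a CUDA GPU; the repository id is the one this card belongs to, and the GPTQ parameters (4-bit, group size 32) are read from the checkpoint's quantization config automatically:

```python
# Sketch: load this GPTQ-quantized checkpoint for inference.
# Assumes: transformers + optimum + auto-gptq installed, CUDA GPU available.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kaitchup/Llama-2-7b-4bit-32g-autogptq"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the quantized weights on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading the checkpoint this way is for inference; for QA-LoRA fine-tuning, see the tutorial linked below.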
This model is compatible with the first version of QA-LoRA.

To fine-tune it with QA-LoRA, follow this tutorial: *Fine-tune Quantized Llama 2 on Your GPU with QA-LoRA*.
