Quantized version of https://huggingface.co/ausboss/llama-30b-supercot

Quantized with GPTQ using https://github.com/0cc4m/GPTQ-for-LLaMa, for compatibility with 0cc4m's fork of KoboldAI.

Command used to quantize:
python llama.py c:\llama-30b-supercot c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors 4bit-128g.safetensors
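
As a rough illustration of what `--wbits 4` and `--groupsize 128` mean, the sketch below quantizes a row of weights to 4-bit codes, with every 128 consecutive weights sharing one scale and offset. It uses plain round-to-nearest as a stand-in and is not the GPTQ algorithm, which also adjusts the remaining weights to compensate for quantization error; `--true-sequential` is not modeled. All function names are hypothetical and do not come from GPTQ-for-LLaMa.

```python
def quantize_group(weights, wbits=4):
    """Round-to-nearest asymmetric quantization of one group of weights.

    Hypothetical helper for illustration; not part of GPTQ-for-LLaMa.
    """
    levels = (1 << wbits) - 1              # 15 distinct steps for 4 bits
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Map each weight to an integer code in [0, levels].
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def quantize_row(row, wbits=4, groupsize=128):
    """Split a row into groups; each group gets its own scale and offset."""
    return [
        quantize_group(row[start:start + groupsize], wbits)
        for start in range(0, len(row), groupsize)
    ]


def dequantize_row(groups):
    """Rebuild approximate weights from (codes, scale, offset) triples."""
    restored = []
    for codes, scale, lo in groups:
        restored.extend(c * scale + lo for c in codes)
    return restored
```

A smaller group size stores more scale and offset values but tracks local weight ranges more closely, which is the usual reason a grouped model scores better than an ungrouped one at the same bit width.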

Perplexity evaluation (lower is better):

  • WikiText2: 4.51
  • PTB: 17.46
  • C4: 6.37
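
Assuming the scores above are perplexities, as reported by the GPTQ-for-LLaMa evaluation step, the following minimal sketch shows how such a number is derived from per-token log-probabilities. The `perplexity` function and its input list are hypothetical; the actual evaluation runs the model over fixed-length windows of each dataset.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token).

    token_log_probs: natural-log probabilities the model assigned to each
    reference token (hypothetical input for illustration).
    """
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)
```

Under this definition, a model that assigns probability 0.25 to every reference token has a perplexity of 4, so a score of 4.51 corresponds roughly to choosing among four or five equally likely tokens at each step.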

A version quantized without --groupsize is available here: https://huggingface.co/tsumeone/llama-30b-supercot-4bit-cuda
