
This is a 4-bit quant of https://huggingface.co/MetaIX/GPT4-X-Alpasta-30b

My secret sauce:

  • Using commit 3c16fd9 of 0cc4m's GPTQ fork
  • Using C4 as the calibration dataset
  • Act-order, True-sequential, percdamp 0.1 (the default percdamp is 0.01)
  • No groupsize
  • Runs with CUDA; does not need Triton (see the loading sketch below).
  • Quant completed on a 'Premium GPU' and 'High Memory' Google Colab.
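Below is a minimal loading sketch, assuming the weights can be read by the AutoGPTQ library and are stored as safetensors; the repo id and the Alpaca-style prompt are placeholders, not part of this card, so adjust them for your setup.

```python
# Minimal sketch: load a 4-bit GPTQ checkpoint on CUDA without Triton.
# Assumptions: AutoGPTQ can read this quant, the weights are .safetensors,
# and the repo id below is a placeholder for this repository.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

repo_id = "your-username/GPT4-X-Alpasta-30b-4bit"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    repo_id,
    device="cuda:0",
    use_triton=False,      # plain CUDA kernels, as noted above
    use_safetensors=True,  # assumed file format
)

# The Alpaca-style prompt format is an assumption about the base model.
prompt = "### Instruction:\nSummarize what GPTQ quantization does.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```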

Benchmark results

| Model         | C4 (ppl)    | WikiText2 (ppl) | PTB (ppl)   |
|---------------|-------------|-----------------|-------------|
| MetaIX's FP16 | 6.98400259  | 4.607768536     | 9.414786339 |
| This Quant    | 7.292364597 | 4.954069614     | 9.754593849 |
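The figures above are dataset perplexities (lower is better). For reference, here is a sketch of the usual sliding-window perplexity evaluation with plain transformers, shown against the FP16 base repo; the 2048-token window, non-overlapping stride, and dataset split are assumptions, so exact numbers will not reproduce the table.

```python
# Sketch of a stride-based perplexity evaluation (WikiText2 shown).
# Assumptions: FP16 base model, 2048-token context, non-overlapping windows.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "MetaIX/GPT4-X-Alpasta-30b"  # FP16 base; needs a lot of VRAM
tokenizer = AutoTokenizer.from_pretrained(base_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)

test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

max_length = 2048
seq_len = encodings.input_ids.size(1)
nlls = []
for begin in range(0, seq_len, max_length):
    end = min(begin + max_length, seq_len)
    input_ids = encodings.input_ids[:, begin:end].to(model.device)
    with torch.no_grad():
        loss = model(input_ids, labels=input_ids).loss
    nlls.append(loss * (end - begin))

ppl = torch.exp(torch.stack(nlls).sum() / seq_len)
print(f"WikiText2 perplexity: {ppl.item():.3f}")
```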