This is a 4-bit GPTQ quant of https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b

My secret sauce:

  • Using commit 3c16fd9 of 0cc4m's GPTQ fork
  • Using PTB as the calibration dataset
  • Act-order, True-sequential, percdamp 0.1 (the default percdamp is 0.01)
  • No groupsize
  • Runs with CUDA; does not need Triton.
  • Quant completed on a 'Premium GPU' and 'High Memory' Google Colab.
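
For anyone unfamiliar with two of the settings above, here is a minimal pure-Python sketch of what percdamp and act-order do in the reference GPTQ algorithm, as far as I understand it. Percdamp adds a fraction of the mean Hessian diagonal to every diagonal entry before the Hessian is inverted, and act-order processes weight columns in order of decreasing Hessian diagonal. The function names `dampen` and `act_order` and the toy 3x3 matrix are illustrative assumptions, not code taken from 0cc4m's fork.

```python
def dampen(hessian, percdamp=0.01):
    """Return a copy of `hessian` with percdamp * mean(diagonal) added to the diagonal."""
    n = len(hessian)
    mean_diag = sum(hessian[i][i] for i in range(n)) / n
    damp = percdamp * mean_diag
    return [
        [hessian[i][j] + (damp if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def act_order(hessian):
    """Return column indices sorted by decreasing Hessian diagonal."""
    return sorted(range(len(hessian)), key=lambda i: hessian[i][i], reverse=True)


# Toy Hessian with made-up values, for illustration only.
H = [
    [2.0, 0.1, 0.0],
    [0.1, 8.0, 0.2],
    [0.0, 0.2, 5.0],
]

H_default = dampen(H, percdamp=0.01)  # the default percdamp
H_card = dampen(H, percdamp=0.1)      # the percdamp used for this quant

print("diagonal with percdamp 0.01:", [H_default[i][i] for i in range(3)])
print("diagonal with percdamp 0.1: ", [H_card[i][i] for i in range(3)])
print("act-order column order:", act_order(H))
```

A larger percdamp pushes the Hessian further from singular, which makes the inversion more stable at the cost of a slightly less exact quantization objective.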

Benchmark results

| Model | C4 | WikiText2 | PTB |
|---|---|---|---|
| Aeala's FP16 | 7.05504846572876 | 4.662261962890625 | 24.547462463378906 |
| This Quant | 7.326207160949707 | 4.957101345062256 | 24.941526412963867 |
| Aeala's Quant | 7.332120418548584 | 5.016242980957031 | 25.576189041137695 |
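
To make the comparison easier to read, the sketch below computes how much each quant degrades relative to the FP16 baseline. It assumes the benchmark figures are perplexities (lower is better), which is the usual metric for GPTQ evaluations on these datasets. The `results` dictionary and the `relative_increase` helper are illustrative names, and the numbers are copied from the results above.

```python
# Benchmark figures copied from the results above (assumed to be perplexity).
results = {
    "Aeala's FP16": {
        "C4": 7.05504846572876,
        "WikiText2": 4.662261962890625,
        "PTB": 24.547462463378906,
    },
    "This Quant": {
        "C4": 7.326207160949707,
        "WikiText2": 4.957101345062256,
        "PTB": 24.941526412963867,
    },
    "Aeala's Quant": {
        "C4": 7.332120418548584,
        "WikiText2": 5.016242980957031,
        "PTB": 25.576189041137695,
    },
}


def relative_increase(model, baseline="Aeala's FP16"):
    """Percent increase of `model` over `baseline` for each dataset."""
    base = results[baseline]
    return {
        dataset: 100.0 * (results[model][dataset] - base[dataset]) / base[dataset]
        for dataset in base
    }


for name in ("This Quant", "Aeala's Quant"):
    deltas = relative_increase(name)
    print(name, {dataset: round(value, 2) for dataset, value in deltas.items()})
```

By this measure, this quant stays within roughly 4% of FP16 on C4, 6% on WikiText2, and 2% on PTB, and it comes out ahead of Aeala's quant on all three datasets.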