Please make the fp16 version of this model

#2
by rombodawg - opened

Panchovix's version doesn't work properly. It won't output anything in text-generation-webui. With "Truncate the prompt up to this length" set above 2k it will only output 1 token, and at 2k it will repeat tokens. So his model is basically broken.

You might want to improve the title of the thread so that it more closely matches your message. The title simply requests an fp16 version, but your message indicates a problem with this version of the model.

Panchovix's versions don't include the automatic monkey patch that I include with my SuperHOT fp16's. You could apply it manually fairly easily:

  1. Grab the two .py files from any of my fp16 SuperHOT models and save them with the rest of the model files
  2. Edit config.json for Panchovix's fp16, and add this:
  "auto_map": {
    "AutoModel": "modelling_llama.LlamaModel",
    "AutoModelForCausalLM": "modelling_llama.LlamaForCausalLM",
    "AutoModelForSequenceClassification": "modelling_llama.LlamaForSequenceClassification"
  },
  3. Now load the model with trust_remote_code=True

Test again with that; it should work.
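Step 2 can be sketched as a small Python script if you'd rather not hand-edit the JSON. The helper name is my own, and it assumes the two .py files from step 1 (including modelling_llama.py) are already saved alongside the model files:

```python
import json

# The class mapping that points trust_remote_code loading at the
# monkey-patched modelling_llama.py shipped with the SuperHOT fp16 repos.
AUTO_MAP = {
    "AutoModel": "modelling_llama.LlamaModel",
    "AutoModelForCausalLM": "modelling_llama.LlamaForCausalLM",
    "AutoModelForSequenceClassification": "modelling_llama.LlamaForSequenceClassification",
}

def add_auto_map(config_path):
    """Add the auto_map entry to an existing config.json in place."""
    with open(config_path) as f:
        config = json.load(f)
    config["auto_map"] = AUTO_MAP
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
```

After running `add_auto_map("config.json")` in the model directory, loading with `AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)` should pick up the patched classes.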

Thank you! I will test this out and post the results here

Ok your fix worked! Thank you
