Please make the fp16 version of this model

#2
by rombodawg - opened

Panchovix's version doesn't work properly. It won't output anything in text-generation-webui. With "Truncate the prompt up to this length" set above 2k it will only output 1 token, and at 2k it will repeat tokens. So his model is basically broken.

You might want to improve the title of the thread so that it more closely matches your message. The title simply requests an fp16 version, but your message indicates a problem with this version of the model.

Panchovix's versions don't include the automatic monkey patch that I include with my SuperHOT fp16's. You could apply it manually fairly easily:

  1. Grab the two .py files from any of my fp16 SuperHOT models and save them with the rest of the model files
  2. Edit config.json for Panchovix's fp16, and add this:
  "auto_map": {
    "AutoModel": "modelling_llama.LlamaModel",
    "AutoModelForCausalLM": "modelling_llama.LlamaForCausalLM",
    "AutoModelForSequenceClassification": "modelling_llama.LlamaForSequenceClassification"
  },
  3. Now load the model with trust_remote_code=True

Test again with that; it should work.
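Step 2 can be sketched as a small Python script if you'd rather not hand-edit the JSON. The helper name is my own, and it assumes the two .py files from step 1 (including modelling_llama.py) are already saved alongside the model files:

```python
import json

# The class mapping that points trust_remote_code loading at the
# monkey-patched modelling_llama.py shipped with the SuperHOT fp16 repos.
AUTO_MAP = {
    "AutoModel": "modelling_llama.LlamaModel",
    "AutoModelForCausalLM": "modelling_llama.LlamaForCausalLM",
    "AutoModelForSequenceClassification": "modelling_llama.LlamaForSequenceClassification",
}

def add_auto_map(config_path):
    """Add the auto_map entry to an existing config.json in place."""
    with open(config_path) as f:
        config = json.load(f)
    config["auto_map"] = AUTO_MAP
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
```

After running `add_auto_map("config.json")` in the model directory, loading with `AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)` should pick up the patched classes.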

Thank you! I will test this out and post the results here

Ok your fix worked! Thank you
