GPTQ, AWQ, GGUF version request
I'm having a hard time finding the hardware to do multiple quants for 70B models. I'll try to get it done later today.
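For anyone who wants to try in the meantime, the usual GGUF flow is roughly the sketch below. This assumes a local llama.cpp checkout with its `convert.py` script and compiled `quantize` binary; the file names and paths are illustrative:

```python
import subprocess

# Illustrative paths -- adjust to your local clone of the HF repo
# and your llama.cpp checkout.
MODEL_DIR = "FashionGPT-70B-V1.2"
F16_GGUF = "fashiongpt-70b-v1.2.f16.gguf"
Q4_GGUF = "fashiongpt-70b-v1.2.Q4_K_M.gguf"

# 1. Convert the HF weights to an f16 GGUF (~140 GB of disk for a 70B model).
subprocess.run(
    ["python3", "convert.py", MODEL_DIR, "--outtype", "f16", "--outfile", F16_GGUF],
    check=True,
)

# 2. Quantize the f16 GGUF down to Q4_K_M.
subprocess.run(["./quantize", F16_GGUF, Q4_GGUF, "Q4_K_M"], check=True)
```

The f16 intermediate is the painful part for 70B; the quantize step itself is mostly CPU and disk.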
Shouldn't it have a model card first?
> Shouldn't it have a model card first?
It is top 1 on the leaderboard now: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
https://huggingface.co/datasets/open-llm-leaderboard/details_ICBU-NPU__FashionGPT-70B-V1.2
That doesn't mean it shouldn't have a model card
> I'm having a hard time finding the hardware to do multiple quants for 70B models. I'll try to get it done later today.
So maybe start with the most popular quants? As for me, I mostly need ONLY GGUF Q4_K_M for 70B, as it's the right size to fit on one 48 GB card or two 24 GB cards.
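Back-of-the-envelope for why Q4_K_M is the sweet spot (the ~4.85 bits/weight figure is approximate, and this ignores KV cache and context overhead):

```python
# Rough VRAM estimate for a 70B model at Q4_K_M.
# ~4.85 bits/weight is an approximate effective rate for Q4_K_M;
# real usage adds KV cache, scratch buffers, and context overhead.
params = 70e9
bits_per_weight = 4.85
weights_gib = params * bits_per_weight / 8 / 1024**3
print(f"weights alone: ~{weights_gib:.1f} GiB")  # ~39.5 GiB
```

That leaves a few GB of headroom on a single 48 GB card, or you can split layers across two 24 GB cards.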
> That doesn't mean it shouldn't have a model card
How is the quantization process related to the model card?
You can see the card from version 1.1: https://huggingface.co/ICBU-NPU/FashionGPT-70B-V1.1
There are only small differences.
Quants for this model are starting now
Hmm, looks like the prompt format from v1.1 does not work properly with v1.2 :(
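If anyone else wants to reproduce this, here is a minimal way to poke at it with transformers. The template string below is only a placeholder; substitute the exact format from the v1.1 card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ICBU-NPU/FashionGPT-70B-V1.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Placeholder template -- replace with the exact v1.1 prompt format.
prompt = "### User:\nWhat is the capital of France?\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```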
Just found another 70B model that just stole the #1 spot on the Open LLM leaderboard: https://huggingface.co/ValiantLabs/ShiningValiant/tree/main. Shame it does not have a discussion section, so I have to place the GPTQ request here.
> Just found another 70B model that just stole the #1 spot on the Open LLM leaderboard: https://huggingface.co/ValiantLabs/ShiningValiant/tree/main. Shame it does not have a discussion section, so I have to place the GPTQ request here.

at least it has a model card
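Until someone with the hardware picks it up, this is roughly what a GPTQ run would look like with AutoGPTQ. A sketch only: the 4-bit/128-group settings are common defaults, not anything the model authors specify, and a real run would use a few hundred calibration samples (e.g. from C4 or wikitext) instead of one:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "ValiantLabs/ShiningValiant"  # the model requested above
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# GPTQ needs tokenized calibration examples; one sample is enough for a sketch.
examples = [tokenizer("Quantization calibration sample.")]

model.quantize(examples)
model.save_quantized("ShiningValiant-GPTQ", use_safetensors=True)
```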