---
license: cc-by-nc-4.0
tags:
- not-for-all-audiences
- nsfw
---

## Exl2 version of [Undi95/Nethena-MLewd-Xwin-23B](https://huggingface.co/Undi95/Nethena-MLewd-Xwin-23B)

## branch

[main](https://huggingface.co/IHaBiS/Nethena-MLewd-Xwin-23B-3.8bpw-h8-exl2-pippa/tree/main) : 3.8bpw h8

I verified that the main branch runs on a **24 GB GPU** (tested on a Runpod 3090 server).

This time I quantized the model with the PIPPA parquet made by Undi95, to test for differences between this dataset and wikitext. I hope this version gives better results than the wikitext version I made before.

Quantization settings:

```
python convert.py -i models/Undi95_Nethena-MLewd-Xwin-23B -o Nethena-MLewd-Xwin-23B-temp -cf Nethena-MLewd-Xwin-23B-3.8bpw-h8-exl2 -c pippa.parquet -l 4096 -b 3.8 -hb 8 -ml 4096
```

### below this line is original readme

Undi doing chemistry again.

The Xwin-MLewd layers were added in a different way than before. The result seems good, but I'm a VRAMlet, so I can only run the Q2 at 2k context for now. I need to see whether it really works well or I just got lucky with my prompt.

OG model: [NeverSleep/Nethena-13B](https://huggingface.co/NeverSleep/Nethena-13B)

## Prompt template: Alpaca

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:

```

LimaRP is always kicking in, so this can be used to have more control over the length of the output.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/2ftpUX9khVcddk-VGEH3p.png)

Thanks Ikari.
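As a minimal sketch, the Alpaca template above can be filled in with a small helper before sending text to the model. The function name and example instruction below are illustrative, not part of this model card:

```python
# Sketch: build a prompt in the Alpaca format shown above.
# The helper name and sample instruction are assumptions for illustration.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)

def build_alpaca_prompt(instruction: str) -> str:
    """Return the full Alpaca-formatted prompt for one instruction."""
    return ALPACA_TEMPLATE.format(prompt=instruction)

print(build_alpaca_prompt("Summarize the plot of Hamlet in one sentence."))
```

Whatever backend you use to run the exl2 weights, the generated continuation should be read starting after the `### Response:` marker.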