---
license: cc-by-nc-4.0
tags:
- not-for-all-audiences
- nsfw
---

## Exl2 version of [Undi95/Nethena-MLewd-Xwin-23B](https://huggingface.co/Undi95/Nethena-MLewd-Xwin-23B)

## branch

[main](https://huggingface.co/IHaBiS/Nethena-MLewd-Xwin-23B-3.8bpw-h8-exl2-pippa/tree/main) : 3.8bpw h8

I verified that the main branch runs on a **24 GB GPU** (tested on a Runpod 3090 server).

This time I quantized the model with the PIPPA parquet made by Undi95, to test for differences between this dataset and wikitext. I hope this version gives better results than the wikitext version I made before.

Quantization settings:

```
python convert.py -i models/Undi95_Nethena-MLewd-Xwin-23B -o Nethena-MLewd-Xwin-23B-temp -cf Nethena-MLewd-Xwin-23B-3.8bpw-h8-exl2 -c pippa.parquet -l 4096 -b 3.8 -hb 8 -ml 4096
```

### below this line is original readme

Undi doing chemistry again.

The Xwin-MLewd layers were added in a different way than before. The result seems good, but I'm a VRAMlet, so I can only run the Q2 at 2k context for now. I need to see whether it really works well or I just got lucky with my prompt.

OG model: [NeverSleep/Nethena-13B](https://huggingface.co/NeverSleep/Nethena-13B)

## Prompt template: Alpaca

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:

```

LimaRP is always kicking in, so this can be used to have more control over the length of the output.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/2ftpUX9khVcddk-VGEH3p.png)

Thanks Ikari.
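As a minimal sketch, the Alpaca template above can be filled in with a small helper before sending text to the model. The function name and example instruction below are illustrative, not part of this model card:

```python
# Sketch: build a prompt in the Alpaca format shown above.
# The helper name and sample instruction are assumptions for illustration.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)

def build_alpaca_prompt(instruction: str) -> str:
    """Return the full Alpaca-formatted prompt for one instruction."""
    return ALPACA_TEMPLATE.format(prompt=instruction)

print(build_alpaca_prompt("Summarize the plot of Hamlet in one sentence."))
```

Whatever backend you use to run the exl2 weights, the generated continuation should be read starting after the `### Response:` marker.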