cgus committed
Commit ff63363, parent: 7f3adc3

Update README.md

Files changed (1): README.md (+11 -7)
README.md CHANGED
@@ -16,15 +16,19 @@ Based on original model: [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7
 Created by: [Qwen](https://huggingface.co/Qwen)

 ## Quants
-[4bpw h6 (main)](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/main)
-[4.25bpw h6](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/4.25bpw-h6)
-[4.65bpw h6](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/4.65bpw-h6)
-[5bpw h6](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/5bpw-h6)
-[6bpw h6](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/6bpw-h6)
-[8bpw h8](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/8bpw-h8)
+|Quant|VRAM/4k|VRAM/8k|VRAM/16k|VRAM/32k|
+|:---|:---|:---|:---|:---|
+|[4bpw h6 (main)](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/main) | 5.3GB | 5.6GB | 5.9GB | 6.8GB |
+|[4.25bpw h6](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/4.25bpw-h6) | 5.5GB | 5.8GB | 6.2GB | 7.1GB |
+|[4.65bpw h6](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/4.65bpw-h6) | 5.8GB | 6.1GB | 6.5GB | 7.3GB |
+|[5bpw h6](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/5bpw-h6) | 6GB | 6.4GB | 6.7GB | 7.7GB |
+|[6bpw h6](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/6bpw-h6) | 6.8GB | 7.2GB | 7.5GB | 8.4GB |
+|[8bpw h8](https://huggingface.co/cgus/Qwen2-7B-Instruct-abliterated-exl2/tree/8bpw-h8) | 8.2GB | 8.6GB | 8.9GB | 9.8GB |

 ## Quantization notes
-Made with Exllamav2 0.1.5 and the default dataset.
+Made with Exllamav2 0.1.5 and the default dataset.
+The model doesn't seem to work with the 4-bit or 8-bit cache in Exllamav2 0.1.5; this may change in future releases.
+I'm quite impressed that it can process non-English text at 32k context with usable results on my 12GB GPU, even at 8bpw precision.

 ## How to run