royleibov/granite-7b-instruct-ZipNN-Compressed

royleibov committed on Sep 15

Commit 7033f8d • 1 Parent(s): ddd9df0

Add ZipNN text

Files changed (1)

README.md +14 -0

@@ -46,6 +46,20 @@ zipnn_hf()
 tokenizer = AutoTokenizer.from_pretrained("royleibov/granite-7b-instruct-ZipNN-Compressed")
 model = AutoModelForCausalLM.from_pretrained("royleibov/granite-7b-instruct-ZipNN-Compressed")
 ```
+### ZipNN
+ZipNN also lets you seamlessly save local disk space in your cache after the model is downloaded.
+To compress the cached model, run:
+```bash
+python zipnn_compress_path.py safetensors --model royleibov/granite-7b-instruct-ZipNN-Compressed --hf_cache
+```
+The model is decompressed automatically and safely as long as `zipnn_hf()` is called at the top of the file, as in the [example above](#use-this-model).
+To decompress manually, run:
+```bash
+python zipnn_decompress_path.py --model royleibov/granite-7b-instruct-ZipNN-Compressed --hf_cache
+```
 # Model Card for Granite-7b-lab [Paper](https://arxiv.org/abs/2403.01081)
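
Reviewer note: the added text says compressing the cache saves local disk space, so a quick way to check the saving may be useful. The sketch below is not part of ZipNN. It uses only the Python standard library and assumes the usual Hugging Face hub cache layout (`models--<owner>--<name>` under `~/.cache/huggingface/hub`, or wherever `HF_HUB_CACHE` or `HF_HOME` points). The helper names `hf_cache_model_dir` and `dir_size_bytes` are hypothetical and chosen for illustration. Run it once before and once after `zipnn_compress_path.py` and compare the two numbers.

```python
import os
from pathlib import Path
from typing import Optional


def hf_cache_model_dir(repo_id: str, cache_root: Optional[Path] = None) -> Path:
    """Return the folder where the hub cache is assumed to store `repo_id`."""
    if cache_root is None:
        # Assumption: standard cache resolution order used by huggingface_hub.
        if "HF_HUB_CACHE" in os.environ:
            cache_root = Path(os.environ["HF_HUB_CACHE"])
        elif "HF_HOME" in os.environ:
            cache_root = Path(os.environ["HF_HOME"]) / "hub"
        else:
            cache_root = Path.home() / ".cache" / "huggingface" / "hub"
    # "owner/name" is stored as "models--owner--name".
    return cache_root / ("models--" + repo_id.replace("/", "--"))


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of regular files under `root`, skipping symlinks.

    Snapshot entries in the cache are usually symlinks into `blobs/`,
    so skipping them avoids counting the same file twice.
    """
    total = 0
    for path in root.rglob("*"):
        if path.is_file() and not path.is_symlink():
            total += path.stat().st_size
    return total


if __name__ == "__main__":
    model_dir = hf_cache_model_dir("royleibov/granite-7b-instruct-ZipNN-Compressed")
    if model_dir.exists():
        print(f"{model_dir}: {dir_size_bytes(model_dir) / 1e9:.2f} GB")
    else:
        print(f"{model_dir} not found; has the model been downloaded?")
```

If the cache lives somewhere non-standard, pass the location explicitly as `cache_root` instead of relying on the environment variables.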