PrunaAI/huggyllama-llama-7b-smashed-384795 · Upload folder using huggingface_hub

Files changed (11)

README.md +59 -0
config.json +1 -0
model/config.json +6 -0
model/model.bin +3 -0
model/smasher_config.json +1 -0
model/special_tokens_map.json +23 -0
model/tokenizer.json +0 -0
model/tokenizer.model +3 -0
model/tokenizer_config.json +38 -0
model/vocabulary.json +0 -0
plots.png +0 -0

README.md ADDED

	@@ -0,0 +1,59 @@

+---
+license: apache-2.0
+library_name: pruna-engine
+thumbnail: "https://assets-global.website-files.com/646b351987a8d8ce158d1940/64ec9e96b4334c0e1ac41504_Logo%20with%20white%20text.svg"
+metrics:
+- memory_disk
+- memory_inference
+- inference_latency
+- inference_throughput
+- inference_CO2_emissions
+- inference_energy_consumption
+---
+<!-- header start -->
+<!-- 200823 -->
+<div style="width: auto; margin-left: auto; margin-right: auto">
+    <a href="https://www.pruna.ai/" target="_blank" rel="noopener noreferrer">
+        <img src="https://i.imgur.com/eDAlcgk.png" alt="PrunaAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
+    </a>
+</div>
+<!-- header end -->
+# Simply make AI models cheaper, smaller, faster, and greener!
+## Results
+![image info](./plots.png)
+## Setup
+You can run the smashed model by:
+1. Installing and importing the `pruna-engine` (version 0.2.6) package. Use `pip install pruna --extra-index-url https://pypi.nvidia.com --extra-index-url https://pypi.ngc.nvidia.com` for installation. See [PyPI](https://pypi.org/project/pruna-engine/) for details on the package.
+2. Downloading the model files to `model_path`. This can be done through the Hugging Face Hub using this repository's name, or by downloading the files manually.
+3. Loading the model.
+4. Running the model.
+You can achieve this by running the following code:
+```python
+from transformers.utils.hub import cached_file
+from pruna_engine.PrunaModel import PrunaModel  # Step (1): install and import `pruna-engine` package.
+...
+model_path = cached_file("PrunaAI/REPO", "model")  # Step (2): download the model files to `model_path`.
+smashed_model = PrunaModel.load_model(model_path)  # Step (3): load the model.
+y = smashed_model(x)  # Step (4): run the model.
+```
+## Configurations
+The configuration is stored in `config.json`.
+## License
+We follow the same license as the original model. Please check the license of the original model before using this model.
+## Want to compress other models?
+- Contact us and tell us which model to compress next [here](https://www.pruna.ai/contact).
+- Request access to easily compress your own AI models [here](https://z0halsaff74.typeform.com/pruna-access?typeform-source=www.pruna.ai).
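
Step (2) of the README's Setup section mentions downloading the model files through Hugging Face. One possible way to do that is sketched below, assuming the `huggingface_hub` package is installed. `download_model_dir` and `model_dir_in` are hypothetical helper names and are not part of `pruna-engine`. The sketch relies on `snapshot_download` with `allow_patterns` to fetch only the `model/` folder; whether `PrunaModel.load_model` accepts the resulting path is taken from the README and has not been verified here.

```python
import os


def model_dir_in(snapshot_root: str) -> str:
    """Return the path of the `model` folder inside a downloaded snapshot."""
    return os.path.join(snapshot_root, "model")


def download_model_dir(repo_id: str) -> str:
    """Download only the `model/` folder of a Hub repository (needs network access)."""
    # Imported lazily so the helpers above can be used without the package installed.
    from huggingface_hub import snapshot_download

    # allow_patterns restricts the download to files under `model/`.
    snapshot_root = snapshot_download(repo_id=repo_id, allow_patterns=["model/*"])
    return model_dir_in(snapshot_root)
```

Calling `download_model_dir("PrunaAI/huggyllama-llama-7b-smashed-384795")` would return a local directory that could then be passed to `PrunaModel.load_model` as `model_path`.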

config.json ADDED

	@@ -0,0 +1 @@

+ {"pruner": "None", "pruning_ratio": 0.0, "factorizer": "None", "quantizer": "None", "n_quantization_bits": 8, "output_deviation": 0.005, "compiler": "ctranslate2_generation", "static_batch": true, "static_shape": true, "controlnet": "None", "unet_dim": 4, "device": "cuda", "cache_dir": "/ceph/hdd/staff/charpent/.cache/models", "max_batch_size": 1, "image_height": "None", "image_width": "None", "version": "None", "tokenizer_name": "placeholder"}
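
The config above stores unset options as the string `"None"` rather than JSON `null`. A minimal sketch of reading it follows, using only an excerpt of the keys. It assumes that `"None"` means "step not applied", which the file itself does not state, and `normalize` and `active_steps` are hypothetical helpers.

```python
import json

# Excerpt of the config.json line above (only some keys are reproduced).
RAW_CONFIG = (
    '{"pruner": "None", "pruning_ratio": 0.0, "factorizer": "None", '
    '"quantizer": "None", "n_quantization_bits": 8, '
    '"compiler": "ctranslate2_generation", "device": "cuda", "max_batch_size": 1}'
)


def normalize(config: dict) -> dict:
    """Return a copy in which the string "None" is replaced by Python's None."""
    return {key: (None if value == "None" else value) for key, value in config.items()}


def active_steps(config: dict) -> list:
    """List the optimization steps that have a value set (assumed meaning)."""
    steps = ("pruner", "factorizer", "quantizer", "compiler")
    return [name for name in steps if config.get(name) is not None]


config = normalize(json.loads(RAW_CONFIG))
print(active_steps(config))
```

Under that assumption, the only step with a value in this configuration is the compiler, `ctranslate2_generation`.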

model/config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "bos_token": "<s>",
+  "eos_token": "</s>",
+  "layer_norm_epsilon": 1e-06,
+  "unk_token": "<unk>"
+}

model/model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:eeadbbd18b4f8fdd457e5a4b7f67a178b18f8c23fed5f3442b4b6e25d7d65d83
+size 6744405833
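
The three lines above are a Git LFS pointer, not the weights themselves: the real `model.bin` is stored separately and identified by its SHA-256 digest and byte size. The sketch below parses such a pointer to read off the size. `parse_lfs_pointer` is a hypothetical helper written for illustration.

```python
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:eeadbbd18b4f8fdd457e5a4b7f67a178b18f8c23fed5f3442b4b6e25d7d65d83
size 6744405833
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algorithm"], round(pointer["size_bytes"] / 1e9, 2), "GB")
```

The size field corresponds to roughly 6.74 GB for the stored `model.bin`.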

model/smasher_config.json ADDED

	@@ -0,0 +1 @@


+ {"load_function": "ctranslate2", "api_key": "pruna_c4c77860c62a2965f6bc281841ee1d7bd3", "verify_url": "http://johnrachwan.pythonanywhere.com", "model_specific": {}}

model/special_tokens_map.json ADDED

	@@ -0,0 +1,23 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

model/tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

model/tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+size 499723

model/tokenizer_config.json ADDED

	@@ -0,0 +1,38 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "</s>",
+  "legacy": false,
+  "model_max_length": 2048,
+  "pad_token": null,
+  "sp_model_kwargs": {},
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": "<unk>",
+  "use_default_system_prompt": false
+}
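
The tokenizer config above maps token ids 0, 1 and 2 to `<unk>`, `<s>` and `</s>`, and leaves `pad_token` as `null`. The sketch below resolves each special-token role to its id from an excerpt of that file. `special_token_ids` is a hypothetical helper, and the `lstrip`/`rstrip`/`normalized` fields are omitted for brevity.

```python
import json

# Excerpt of model/tokenizer_config.json (per-token formatting flags omitted).
TOKENIZER_CONFIG = json.loads("""
{
  "added_tokens_decoder": {
    "0": {"content": "<unk>", "special": true},
    "1": {"content": "<s>", "special": true},
    "2": {"content": "</s>", "special": true}
  },
  "bos_token": "<s>",
  "eos_token": "</s>",
  "unk_token": "<unk>",
  "pad_token": null,
  "model_max_length": 2048
}
""")


def special_token_ids(config: dict) -> dict:
    """Map each special-token role to its id, or None when the role is unset."""
    by_content = {
        entry["content"]: int(token_id)
        for token_id, entry in config["added_tokens_decoder"].items()
    }
    roles = ("bos_token", "eos_token", "unk_token", "pad_token")
    return {
        role: by_content.get(config[role]) if config.get(role) is not None else None
        for role in roles
    }


ids = special_token_ids(TOKENIZER_CONFIG)
print(ids)
```

Because `pad_token` is unset, code that pads batches of prompts would need to choose a padding token itself. This is unlikely to matter with the `max_batch_size` of 1 recorded in `config.json`.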

model/vocabulary.json ADDED

The diff for this file is too large to render. See raw diff

plots.png ADDED
