This int8 PyTorch model is generated by [intel-extension-for-transformers](https://github.com/intel/intel-extension-for-transformers).
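For background on what the int8 format means here: static quantization maps each float tensor onto 8-bit integers through a precomputed affine transform (a scale and a zero point calibrated ahead of time). A minimal, library-free sketch of that mapping — the `scale` and `zero_point` values below are illustrative, not the model's actual quantization parameters:

```python
def quantize(x, scale, zero_point):
    # Affine int8 quantization: q = round(x / scale) + zero_point, clamped to [-128, 127]
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    # Approximate reconstruction: x ~= (q - zero_point) * scale
    return (q - zero_point) * scale

# Illustrative parameters mapping roughly the range [0, 2.55] onto int8
scale, zero_point = 0.01, -128
q = quantize(1.0, scale, zero_point)   # -> -28
x = dequantize(q, scale, zero_point)   # -> approximately 1.0
```

Values outside the representable range saturate at the int8 bounds, which is why calibration of the scale matters for accuracy.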
| Package | Version |
|---|---|
| intel-extension-for-pytorch | 2.1.0 |
| transformers | 4.32.0 |

## Usage

Currently, we only support downloading the model and then loading it locally: the model files are downloaded from the server and stored on the user's machine.

- Clone this model repository:

```bash
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/Intel/gpt-j-6B-pytorch-int8-static
```

- Load the int8 model:

```python
from intel_extension_for_transformers.llm.evaluation.models import TSModelCausalLMForITREX

user_model = TSModelCausalLMForITREX.from_pretrained(
    args.output_dir,                           # your saved path
    file_name="best_model.pt",
    trust_remote_code=args.trust_remote_code,  # defaults to False
)
```

## Evaluation results

The accuracy of the optimized gpt-j-6b model is evaluated on the lambada_openai dataset using lm_eval.
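lambada_openai scores a model on predicting the final word of each passage, so the reported accuracy is an exact-match rate over passages. A toy sketch of that metric — the word lists below are made up for illustration, and the real evaluation runs through lm_eval:

```python
def lambada_accuracy(predictions, targets):
    # Exact-match rate: fraction of passages whose final word was predicted correctly
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)

# Illustrative data only (not actual model outputs)
preds = ["door", "night", "smile"]
golds = ["door", "night", "laugh"]
print(lambada_accuracy(preds, golds))  # prints 0.6666666666666666
```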