Update README.md
README.md
CHANGED
@@ -32,6 +32,22 @@ python3 main.py \
 
 
 ### Use the model
+### INT4 Inference with ITREX on CPU
+Install the latest [intel-extension-for-transformers](https://github.com/intel/intel-extension-for-transformers)
+```python
+from intel_extension_for_transformers.transformers import AutoModelForCausalLM
+from transformers import AutoTokenizer
+quantized_model_dir = "Intel/Mistral-7B-v0.1-int4-inc"
+model = AutoModelForCausalLM.from_pretrained(quantized_model_dir,
+                                             device_map="auto",
+                                             trust_remote_code=False,
+                                             use_neural_speed=False,
+                                             )
+tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)
+print(tokenizer.decode(model.generate(**tokenizer("There is a girl who likes adventure,", return_tensors="pt").to(model.device),max_new_tokens=50)[0]))
+## <s> There is a girl who likes adventure, and she is a little bit crazy. She is a little bit crazy because she likes to do things that are dangerous. She likes to climb mountains, and she likes to go on long hikes. She also likes to go on long bike rides
+```
+
 
 ### INT4 Inference with AutoGPTQ
 
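The single `print(...)` line in the README example chains tokenization, generation, and decoding. As a minimal sketch, the same steps can be unpacked into a small helper. `generate_text` is a hypothetical name used for illustration only and is not part of ITREX or `transformers`. It assumes `model` and `tokenizer` were loaded as in the example, and it passes `skip_special_tokens=True` to `decode` so that markers such as the leading `<s>` in the sample output are dropped.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=50):
    """Hypothetical helper: tokenize, generate, and decode one prompt."""
    # Tokenize to PyTorch tensors and move them to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Generate up to max_new_tokens new tokens; one output sequence per input row.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first sequence, dropping special tokens such as <s>.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the objects from the README example, a call would look like `print(generate_text(model, tokenizer, "There is a girl who likes adventure,"))`.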