Ranjanunicode committed • Commit 7bfe306 • Parent(s): 62efeb6
Update README.md

README.md CHANGED
@@ -8,7 +8,7 @@ base_model:
 - meta-llama/Llama-2-7b-chat-hf
 ---
 
-# Q-int4 unicode-llama-2-chat-Hf-q4-
+# Q-int4 unicode-llama-2-chat-Hf-q4-gguf
 - A condensed edition of Llama 2 chat hugging face, designed for deployment with minimal hardware specifications.
 
 

@@ -51,7 +51,7 @@ Output Models generate text only.
 from ctransformers import AutoModelForCausalLM
 
 #Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
-llm = AutoModelForCausalLM.from_pretrained("Ranjanunicode/unicode-llama-2-chat-Hf-q4-
+llm = AutoModelForCausalLM.from_pretrained("Ranjanunicode/unicode-llama-2-chat-Hf-q4-gguf", model_file="unicode-llama-2-chat-Hf-q4-2.gguf", model_type="llama", gpu_layers=40)
 
 print(llm("AI is going to"))
 