Commit e9c4cf6 by TheBloke
1 Parent(s): b101438

Upload README.md

Files changed (1):
  1. README.md +7 -7
README.md CHANGED
@@ -150,7 +150,7 @@ The following clients/libraries will automatically download models for you, prov
 
 ### In `text-generation-webui`
 
-Under Download Model, you can enter the model repo: TheBloke/hippogriff-30b-chat-GGUF and below it, a specific filename to download, such as: hippogriff-30b.q4_K_M.gguf.
+Under Download Model, you can enter the model repo: TheBloke/hippogriff-30b-chat-GGUF and below it, a specific filename to download, such as: hippogriff-30b.Q4_K_M.gguf.
 
 Then click Download.
 
@@ -159,13 +159,13 @@ Then click Download.
 I recommend using the `huggingface-hub` Python library:
 
 ```shell
-pip3 install huggingface-hub>=0.17.1
+pip3 install huggingface-hub
 ```
 
 Then you can download any individual model file to the current directory, at high speed, with a command like this:
 
 ```shell
-huggingface-cli download TheBloke/hippogriff-30b-chat-GGUF hippogriff-30b.q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
+huggingface-cli download TheBloke/hippogriff-30b-chat-GGUF hippogriff-30b.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
 ```
 
 <details>
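Editorial note on this hunk: the `huggingface-cli download` command has a programmatic equivalent in the same `huggingface-hub` library. A minimal sketch, assuming the post-commit filename from the `+` line:

```python
# Minimal sketch: the same download as the CLI command above, done via the
# huggingface_hub Python API (install with `pip3 install huggingface-hub`).
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="TheBloke/hippogriff-30b-chat-GGUF",
    filename="hippogriff-30b.Q4_K_M.gguf",  # post-commit capitalisation
    local_dir=".",
    local_dir_use_symlinks=False,
)
print(path)  # local path of the downloaded GGUF file
```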
@@ -188,10 +188,10 @@ pip3 install hf_transfer
 And set environment variable `HF_HUB_ENABLE_HF_TRANSFER` to `1`:
 
 ```shell
-HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download TheBloke/hippogriff-30b-chat-GGUF hippogriff-30b.q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
+HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download TheBloke/hippogriff-30b-chat-GGUF hippogriff-30b.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
 ```
 
-Windows CLI users: Use `set HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1` before running the download command.
+Windows Command Line users: You can set the environment variable by running `set HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1` before the download command.
 </details>
 <!-- README_GGUF.md-how-to-download end -->
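The same acceleration can be enabled from Python. A sketch, assuming `hf_transfer` is installed and using the `HF_HUB_ENABLE_HF_TRANSFER` name that the prose in this hunk cites (the shell lines spell it `HUGGINGFACE_HUB_ENABLE_HF_TRANSFER`):

```python
# Sketch: enabling hf_transfer from Python rather than the shell.
# The variable must be set before huggingface_hub is imported, because
# the library reads its environment configuration at import time.
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="TheBloke/hippogriff-30b-chat-GGUF",
    filename="hippogriff-30b.Q4_K_M.gguf",
    local_dir=".",
)
```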
 
@@ -201,7 +201,7 @@ Windows CLI users: Use `set HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1` before running
 Make sure you are using `llama.cpp` from commit [d0cee0d36d5be95a0d9088b674dbb27354107221](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
-./main -ngl 32 -m hippogriff-30b.q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:"
+./main -ngl 32 -m hippogriff-30b.Q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:"
 ```
 
 Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
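Note that `{prompt}` in the `-p` string is a placeholder for your own text; `./main` does not expand it itself. A hypothetical wrapper-script sketch showing how the template from the `+` line might be filled before invoking the binary:

```python
# Sketch: filling the USER/ASSISTANT prompt template before calling ./main.
import subprocess

TEMPLATE = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: {prompt} ASSISTANT:"
)

def run(prompt: str) -> None:
    # Substitute the user's text into the template, then hand it to llama.cpp.
    full_prompt = TEMPLATE.format(prompt=prompt)
    subprocess.run([
        "./main", "-ngl", "32", "-m", "hippogriff-30b.Q4_K_M.gguf",
        "-c", "4096", "--temp", "0.7", "--repeat_penalty", "1.1",
        "-n", "-1", "-p", full_prompt,
    ])

run("Write a short poem about hippogriffs.")  # hypothetical example prompt
```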
@@ -241,7 +241,7 @@ CT_METAL=1 pip install ctransformers>=0.2.24 --no-binary ctransformers
 from ctransformers import AutoModelForCausalLM
 
 # Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
-llm = AutoModelForCausalLM.from_pretrained("TheBloke/hippogriff-30b-chat-GGUF", model_file="hippogriff-30b.q4_K_M.gguf", model_type="llama", gpu_layers=50)
+llm = AutoModelForCausalLM.from_pretrained("TheBloke/hippogriff-30b-chat-GGUF", model_file="hippogriff-30b.Q4_K_M.gguf", model_type="llama", gpu_layers=50)
 
 print(llm("AI is going to"))
 ```
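As a usage note for this hunk: ctransformers can also stream tokens as they are generated, which is often preferable with a 30B model. A sketch reusing the model call from the `+` line, with the `stream=True` flag documented by ctransformers:

```python
# Sketch: token-by-token streaming with the same ctransformers model.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/hippogriff-30b-chat-GGUF",
    model_file="hippogriff-30b.Q4_K_M.gguf",
    model_type="llama",
    gpu_layers=50,  # set to 0 if no GPU acceleration is available
)

# stream=True yields text fragments instead of returning one string.
for text in llm("AI is going to", stream=True):
    print(text, end="", flush=True)
```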
 