RonanMcGovern
committed on
Commit
•
974168a
1
Parent(s):
9f9b784
Add 13B GPTQ link
README.md
CHANGED
@@ -22,7 +22,7 @@ tags:
 
 Available models:
 - fLlama-7B ([bitsandbytes NF4](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling)), ([GGML](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-GGML)), ([GPTQ](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-GPTQ)) - free
-- fLlama-13B ([bitsandbytes NF4](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling)) - paid
+- fLlama-13B ([bitsandbytes NF4](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling)), ([GPTQ](https://huggingface.co/Trelis/Llama-2-13b-chat-hf-function-calling-GPTQ)) - paid
 
 ## Inference with Google Colab and HuggingFace 🤗
 
@@ -41,7 +41,7 @@ To run this you'll need to install llamaccp from ggerganov on github.
 ```
 ./server -m fLlama-2-7b-chat.ggmlv3.q3_K_M.bin -ngl 32 -c 2048
 ```
-
+which will allow you to run a chatbot in your browser. The -ngl offloads layers to the Mac's GPU and gets very good token generation speed.
 
 ## Licensing and Usage
 
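The `./server` command in the second hunk starts the llama.cpp example server, which can be queried from a script as well as from the browser. The following is a minimal sketch, assuming the server listens on llama.cpp's default address `127.0.0.1:8080` and exposes the `/completion` endpoint with `prompt` and `n_predict` fields. The helper names `build_completion_request` and `complete` are hypothetical, and the function-calling prompt format is not part of this diff, so the prompt is passed through as plain text.

```python
import json
import urllib.request

# Assumption: the llama.cpp example server is running on its default host and port.
SERVER_URL = "http://127.0.0.1:8080"


def build_completion_request(prompt, n_predict=128, base_url=SERVER_URL):
    """Build (but do not send) a POST request for the server's /completion endpoint."""
    payload = {"prompt": prompt, "n_predict": n_predict}
    return urllib.request.Request(
        f"{base_url}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, n_predict=128, base_url=SERVER_URL):
    """Send the prompt to a running server and return the generated text."""
    request = build_completion_request(prompt, n_predict, base_url)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    # The server returns the generated text under the "content" key.
    return body["content"]
```

With the server from the hunk above running, `print(complete("Hello"))` would print the model's continuation. Building the request separately from sending it keeps the payload easy to inspect without a live server.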