TheBloke/upstage-llama-30b-instruct-2048-GGML


TheBloke committed on Jul 22, 2023

Commit 098a39d • 1 Parent(s): 4fa3808

Update README.md

Files changed (1)

README.md +8 -2

@@ -46,10 +46,16 @@ Many thanks to William Beauchamp from [Chai](https://chai-research.com/) for pro
 * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/upstage-llama-30b-instruct-2048-GGML)
 * [Original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/upstage/llama-30b-instruct-2048)
-## Prompt template: Unknown
+## Prompt template: Orca-Hashes
 ```
+### System:
+{System}
+### User:
 {prompt}
+### Assistant:
 ```
 <!-- compatibility_ggml start -->
@@ -106,7 +112,7 @@ Refer to the Provided Files table below to see what files use which methods, and
 I use the following command line; adjust for your tastes and needs:
 ```
-./main -t 10 -ngl 32 -m upstage-llama-30b-instruct-2048.ggmlv3.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: Write a story about llamas\n### Response:"
+./main -t 10 -ngl 32 -m upstage-llama-30b-instruct-2048.ggmlv3.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### System: You are a helpful assistant\n### User: write a story about llamas\n### Assistant:"
 ```
 Change `-t 10` to the number of physical CPU cores you have. For example if your system has 8 cores/16 threads, use `-t 8`.
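
For readers scripting against this model, the Orca-Hashes template introduced by this commit can be filled in programmatically. The following is a minimal Python sketch, not part of the README: the function name `build_orca_hashes_prompt` and its default system message are hypothetical, and it assumes the template lines are joined by single newlines exactly as they appear in the diff (any blank lines in the original template are not visible here).

```python
def build_orca_hashes_prompt(
    user_prompt: str,
    system: str = "You are a helpful assistant",  # assumed default, taken from the example command
) -> str:
    """Fill the Orca-Hashes template: System, User, then an open Assistant turn."""
    lines = [
        "### System:",
        system,
        "### User:",
        user_prompt,
        "### Assistant:",  # left open so the model writes the reply
    ]
    # Assumption: one newline between template lines, no trailing newline.
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_orca_hashes_prompt("write a story about llamas"))
```

The resulting string could then be supplied as the prompt text; how newlines are passed on a command line depends on the shell and the tool, so that part is left out of the sketch.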