Files changed (1)
  1. README.md +48 -8
README.md CHANGED
@@ -13,7 +13,6 @@ tags:
  - gguf
  - llama cpp
  ---
-
  # Octopus V4-GGUF: Graph of language models
 
@@ -32,22 +31,63 @@ tags:
  **Acknowledgement**:
  We sincerely thank our community members, [Mingyuan](https://huggingface.co/ThunderBeee) and [Zoey](https://huggingface.co/ZY6), for their extraordinary contributions to this quantization effort. Please explore [Octopus-v4](https://huggingface.co/NexaAIDev/Octopus-v4) for our original Hugging Face model.
 
-
- ## Run with [Ollama](https://github.com/ollama/ollama)
-
- ```bash
- ollama run NexaAIDev/octopus-v4-q4_k_m
- ```
-
- Input example:
-
- ```json
- Query: Tell me the result of derivative of x^3 when x is 2?
-
- Response: <nexa_4> ('Determine the derivative of the function f(x) = x^3 at the point where x equals 2, and interpret the result within the context of rate of change and tangent slope.')<nexa_end>
- ```
- Note that `<nexa_4>` represents the math gpt.
 
+ ## (Recommended) Run with [llama.cpp](https://github.com/ggerganov/llama.cpp)
+
+ 1. **Clone and compile:**
+
+ ```bash
+ git clone https://github.com/ggerganov/llama.cpp
+ cd llama.cpp
+ # Compile the source code:
+ make
+ ```
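+
+ If `make` does not work on your platform, recent llama.cpp revisions also provide a CMake build; a minimal sketch (binary names and output locations vary by version):
+
+ ```bash
+ # Alternative: configure and build with CMake; binaries typically land under build/bin
+ cmake -B build
+ cmake --build build --config Release
+ ```
+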
+ 2. **Prepare the Input Prompt File:**
+
+ Navigate to the `prompts` folder inside the `llama.cpp` directory and create a new file named `chat-with-octopus.txt`.
+
+ `chat-with-octopus.txt`:
+
+ ```bash
+ User:
+ ```
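+
+ Equivalently, create the file from the shell (run from inside the `llama.cpp` directory):
+
+ ```bash
+ # Seed the interactive chat with a single "User:" line
+ echo "User:" > prompts/chat-with-octopus.txt
+ ```
+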
+ 3. **Execute the Model:**
+
+ Run the following command in the terminal:
+
+ ```bash
+ ./main -m ./path/to/octopus-v4-Q4_K_M.gguf -c 512 -b 2048 -n 256 -t 1 --repeat_penalty 1.0 --top_k 0 --top_p 1.0 --color -i -r "User:" -f prompts/chat-with-octopus.txt
+ ```
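+
+ Here `-m` selects the GGUF file, `-c` sets the context length, `-n` caps the number of generated tokens, `-i` together with `-r "User:"` runs an interactive session that hands control back at the `User:` reverse prompt, and `-f` seeds the session with the prompt file created above.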
 
+ Example prompt to interact with the model:
+
+ ```bash
+ <|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>
+ ```
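+
+ The expected reply is a functional-token call; the original card gives this example response:
+
+ ```bash
+ <nexa_4> ('Determine the derivative of the function f(x) = x^3 at the point where x equals 2, and interpret the result within the context of rate of change and tangent slope.')<nexa_end>
+ ```
+
+ Note that `<nexa_4>` is the functional token for the math model.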
 
+ ## Run with [Ollama](https://github.com/ollama/ollama)
+
+ 1. Create a `Modelfile` in your directory, include a `FROM` statement with the path to your local model, and set the runtime parameters:
+
+ ```bash
+ FROM ./path/to/octopus-v4-Q4_K_M.gguf
+ PARAMETER temperature 0
+ PARAMETER num_ctx 1024
+ PARAMETER stop <nexa_end>
+ ```
+
+ 2. Use the following command to add the model to Ollama:
+
+ ```bash
+ ollama create octopus-v4-Q4_K_M -f Modelfile
+ ```
+
+ 3. Verify that the model has been successfully imported:
+
+ ```bash
+ ollama ls
+ ```
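+
+ A successful import shows the model in the list; the output below is illustrative only (the ID, size, and timestamp will differ):
+
+ ```bash
+ NAME                       ID              SIZE      MODIFIED
+ octopus-v4-Q4_K_M:latest   <model id>      2.4 GB    10 seconds ago
+ ```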
+
+ ### Run the model
+
+ ```bash
+ ollama run octopus-v4-Q4_K_M "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>"
+ ```
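+
+ The response follows the same `<nexa_4>` functional-token format as the llama.cpp example above; the `stop <nexa_end>` parameter in the `Modelfile` ends generation at the closing token.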
 
  ### Dataset and Benchmark