Commit 4c7cdda
Parent(s): 3aaece5

Explaining setup more

Browse files
- TGI-benchmark.ipynb +45 -7
- TGI-launcher.ipynb +0 -0
TGI-benchmark.ipynb
CHANGED
@@ -1,5 +1,33 @@
 {
  "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "28e3460d-59b1-4d4c-b62e-510987fb2f28",
+   "metadata": {},
+   "source": [
+    "# Introduction\n",
+    "This notebook shows how to launch the TGI Benchmark tool. \n",
+    "\n",
+    "## Warning\n",
+    "Please note that the TGI Benchmark tool is designed to work in a terminal, not a Jupyter notebook. You will need to copy/paste each command into a Jupyter terminal tab; they are included here for convenience."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c0de3cc9-c6cd-45b3-9dd0-84b3cb2fc8b2",
+   "metadata": {},
+   "source": [
+    "Here we can see the different settings for TGI Benchmark. \n",
+    "\n",
+    "Here are some of the most important ones:\n",
+    "\n",
+    "- `--tokenizer-name` Required so the tool knows which tokenizer to use\n",
+    "- `--batch-size` Important for load testing. Step through increasing values to see how throughput and latency respond\n",
+    "- `--sequence-length` AKA input tokens; match it to your use-case needs\n",
+    "- `--decode-length` AKA output tokens; match it to your use-case needs\n",
+    "- `--runs` Use fewer runs while exploring; the default of 10 is good for a final estimate"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 1,
@@ -63,13 +91,14 @@
    ]
   },
   {
-   "cell_type": "code",
-   "
-   "id": "c8afc9d5-f624-4d7f-a64f-08af02a4aaff",
+   "cell_type": "markdown",
+   "id": "42d9561b-1aea-4c8c-9fe8-e36af43482fe",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "
+    "Here is an example command. Notice that `--batch-size` is passed once per batch size of interest so the benchmark tool covers all of them.\n",
+    "```bash\n",
+    "\n",
+    "text-generation-benchmark \\\n",
     "--tokenizer-name astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit \\\n",
     "--sequence-length 3000 \\\n",
     "--decode-length 300 \\\n",
@@ -77,8 +106,17 @@
     "--batch-size 2 \\\n",
     "--batch-size 3 \\\n",
     "--batch-size 4 \\\n",
-    "--batch-size 5"
+    "--batch-size 5\n",
+    "```"
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c821081f-ff78-4046-88a6-3652f7322b85",
+   "metadata": {},
+   "outputs": [],
+   "source": []
   }
  ],
  "metadata": {
@@ -97,7 +135,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.
+   "version": "3.10.8"
   }
  },
  "nbformat": 4,
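As a side note (not part of the commit itself): since the benchmark command passes `--batch-size` once per value, a small POSIX shell sketch can assemble those repeated flags programmatically. The batch-size list below mirrors the values visible in the diff and is illustrative only; the command is printed rather than executed, since the tool expects a real terminal.

```shell
#!/bin/sh
# Sketch: build repeated --batch-size flags for text-generation-benchmark.
# The batch sizes listed here are illustrative, taken from the diff above.
BATCH_SIZES="2 3 4 5"

ARGS=""
for b in $BATCH_SIZES; do
  ARGS="$ARGS --batch-size $b"   # one flag per batch size, as in the notebook
done

# Assemble and print the full command instead of running it.
CMD="text-generation-benchmark --tokenizer-name astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit --sequence-length 3000 --decode-length 300$ARGS"
echo "$CMD"
```

Copying the printed command into a Jupyter terminal tab then runs the benchmark exactly as shown in the notebook cell.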
TGI-launcher.ipynb
CHANGED
The diff for this file is too large to render.
See raw diff