derek-thomas HF staff committed on
Commit
4c7cdda
1 Parent(s): 3aaece5

Explaining setup more

Files changed (2)
  1. TGI-benchmark.ipynb +45 -7
  2. TGI-launcher.ipynb +0 -0
TGI-benchmark.ipynb CHANGED
@@ -1,5 +1,33 @@
1
  {
2
  "cells": [
3
  {
4
  "cell_type": "code",
5
  "execution_count": 1,
@@ -63,13 +91,14 @@
63
  ]
64
  },
65
  {
66
- "cell_type": "code",
67
- "execution_count": null,
68
- "id": "c8afc9d5-f624-4d7f-a64f-08af02a4aaff",
69
  "metadata": {},
70
- "outputs": [],
71
  "source": [
72
- "!text-generation-benchmark \\\n",
73
  "--tokenizer-name astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit \\\n",
74
  "--sequence-length 3000 \\\n",
75
  "--decode-length 300 \\\n",
@@ -77,8 +106,17 @@
77
  "--batch-size 2 \\\n",
78
  "--batch-size 3 \\\n",
79
  "--batch-size 4 \\\n",
80
- "--batch-size 5"
81
  ]
82
  }
83
  ],
84
  "metadata": {
@@ -97,7 +135,7 @@
97
  "name": "python",
98
  "nbconvert_exporter": "python",
99
  "pygments_lexer": "ipython3",
100
- "version": "3.9.5"
101
  }
102
  },
103
  "nbformat": 4,
 
1
  {
2
  "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "id": "28e3460d-59b1-4d4c-b62e-510987fb2f28",
6
+ "metadata": {},
7
+ "source": [
8
+ "# Introduction\n",
9
+ "This notebook shows how to launch the TGI Benchmark tool.\n",
10
+ "\n",
11
+ "## Warning\n",
12
+ "Please note that the TGI Benchmark tool is designed to work in a terminal, not a Jupyter notebook. This means you will need to copy and paste the commands into a Jupyter terminal tab. I am putting them here for convenience."
13
+ ]
14
+ },
15
+ {
16
+ "cell_type": "markdown",
17
+ "id": "c0de3cc9-c6cd-45b3-9dd0-84b3cb2fc8b2",
18
+ "metadata": {},
19
+ "source": [
20
+ "Here we can see the different settings for TGI Benchmark. \n",
21
+ "\n",
22
+ "Here are some of the most important ones:\n",
23
+ "\n",
24
+ "- `--tokenizer-name` This is required so the tool knows what tokenizer to use\n",
25
+ "- `--batch-size` This is important for load testing. Use progressively larger values to see how throughput and latency change\n",
26
+ "- `--sequence-length` AKA input tokens; it is important to match your use-case needs\n",
27
+ "- `--decode-length` AKA output tokens; it is important to match your use-case needs\n",
28
+ "- `--runs` Use fewer when you are exploring, but the default 10 is good for your final estimate"
29
+ ]
30
+ },
31
  {
32
  "cell_type": "code",
33
  "execution_count": 1,
 
91
  ]
92
  },
93
  {
94
+ "cell_type": "markdown",
95
+ "id": "42d9561b-1aea-4c8c-9fe8-e36af43482fe",
96
  "metadata": {},
97
  "source": [
98
+ "Here is an example command. Notice that I repeat the `--batch-size` flag once for each batch size of interest so that the benchmark tool runs all of them.\n",
99
+ "```bash\n",
100
+ "\n",
101
+ "text-generation-benchmark \\\n",
102
  "--tokenizer-name astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit \\\n",
103
  "--sequence-length 3000 \\\n",
104
  "--decode-length 300 \\\n",
 
106
  "--batch-size 2 \\\n",
107
  "--batch-size 3 \\\n",
108
  "--batch-size 4 \\\n",
109
+ "--batch-size 5\n",
110
+ "```"
111
  ]
112
+ },
113
+ {
114
+ "cell_type": "code",
115
+ "execution_count": null,
116
+ "id": "c821081f-ff78-4046-88a6-3652f7322b85",
117
+ "metadata": {},
118
+ "outputs": [],
119
+ "source": []
120
  }
121
  ],
122
  "metadata": {
 
135
  "name": "python",
136
  "nbconvert_exporter": "python",
137
  "pygments_lexer": "ipython3",
138
+ "version": "3.10.8"
139
  }
140
  },
141
  "nbformat": 4,
TGI-launcher.ipynb CHANGED
The diff for this file is too large to render.
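
The example command in TGI-benchmark.ipynb repeats `--batch-size` once per value. As a rough sketch of how that command line could be assembled programmatically, the snippet below builds the argument list in Python and prints it so it can be pasted into a terminal. The helper name `build_benchmark_command` and the list of batch sizes are illustrative assumptions, not part of the commit; only the flags (`--tokenizer-name`, `--sequence-length`, `--decode-length`, `--batch-size`, `--runs`) come from the notebook text.

```python
import shlex


def build_benchmark_command(tokenizer_name, sequence_length, decode_length,
                            batch_sizes, runs=None):
    """Return a text-generation-benchmark invocation as a list of arguments.

    Hypothetical helper: it only assembles the flags described in the
    notebook and does not run the tool.
    """
    args = [
        "text-generation-benchmark",
        "--tokenizer-name", tokenizer_name,
        "--sequence-length", str(sequence_length),
        "--decode-length", str(decode_length),
    ]
    # Repeat the flag once per batch size, as the example command does.
    for size in batch_sizes:
        args += ["--batch-size", str(size)]
    # Leave --runs off unless requested, so the tool's default applies.
    if runs is not None:
        args += ["--runs", str(runs)]
    return args


# Illustrative values taken from the example command; adjust to your use case.
command = build_benchmark_command(
    "astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit",
    sequence_length=3000,
    decode_length=300,
    batch_sizes=[2, 3, 4, 5],
)
print(shlex.join(command))
```

Because the notebook warns that the tool is designed for a terminal, the sketch prints the command rather than executing it; the printed line can be copied into a Jupyter terminal tab.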