Commit 4c7cdda
Parent(s): 3aaece5

Explaining setup more

Browse files
- TGI-benchmark.ipynb +45 -7
- TGI-launcher.ipynb +0 -0
TGI-benchmark.ipynb
CHANGED
@@ -1,5 +1,33 @@
 {
  "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "28e3460d-59b1-4d4c-b62e-510987fb2f28",
+   "metadata": {},
+   "source": [
+    "# Introduction\n",
+    "This notebook shows how to launch the TGI Benchmark tool. \n",
+    "\n",
+    "## Warning\n",
+    "Please note that the TGI Benchmark tool is designed to work in a terminal, not a Jupyter notebook. You will need to copy/paste each command into a Jupyter terminal tab; they are included here for convenience."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c0de3cc9-c6cd-45b3-9dd0-84b3cb2fc8b2",
+   "metadata": {},
+   "source": [
+    "Here we can see the different settings for TGI Benchmark. \n",
+    "\n",
+    "Here are some of the most important ones:\n",
+    "\n",
+    "- `--tokenizer-name` Required so the tool knows which tokenizer to use\n",
+    "- `--batch-size` Important for load testing. Step through increasing values to see how throughput and latency respond\n",
+    "- `--sequence-length` AKA input tokens; match it to your use-case needs\n",
+    "- `--decode-length` AKA output tokens; match it to your use-case needs\n",
+    "- `--runs` Use fewer runs while exploring; the default of 10 is good for a final estimate"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 1,
@@ -63,13 +91,14 @@
    ]
   },
   {
-   "cell_type": "code",
-   "
-   "id": "c8afc9d5-f624-4d7f-a64f-08af02a4aaff",
+   "cell_type": "markdown",
+   "id": "42d9561b-1aea-4c8c-9fe8-e36af43482fe",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "
+    "Here is an example command. Notice that `--batch-size` is passed once per batch size of interest so the benchmark tool covers all of them.\n",
+    "```bash\n",
+    "\n",
+    "text-generation-benchmark \\\n",
     "--tokenizer-name astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit \\\n",
     "--sequence-length 3000 \\\n",
     "--decode-length 300 \\\n",
@@ -77,8 +106,17 @@
     "--batch-size 2 \\\n",
     "--batch-size 3 \\\n",
     "--batch-size 4 \\\n",
-    "--batch-size 5"
+    "--batch-size 5\n",
+    "```"
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c821081f-ff78-4046-88a6-3652f7322b85",
+   "metadata": {},
+   "outputs": [],
+   "source": []
   }
  ],
  "metadata": {
@@ -97,7 +135,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.
+   "version": "3.10.8"
   }
  },
  "nbformat": 4,
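As a side note (not part of the commit itself): since the benchmark command passes `--batch-size` once per value, a small POSIX shell sketch can assemble those repeated flags programmatically. The batch-size list below mirrors the values visible in the diff and is illustrative only; the command is printed rather than executed, since the tool expects a real terminal.

```shell
#!/bin/sh
# Sketch: build repeated --batch-size flags for text-generation-benchmark.
# The batch sizes listed here are illustrative, taken from the diff above.
BATCH_SIZES="2 3 4 5"

ARGS=""
for b in $BATCH_SIZES; do
  ARGS="$ARGS --batch-size $b"   # one flag per batch size, as in the notebook
done

# Assemble and print the full command instead of running it.
CMD="text-generation-benchmark --tokenizer-name astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit --sequence-length 3000 --decode-length 300$ARGS"
echo "$CMD"
```

Copying the printed command into a Jupyter terminal tab then runs the benchmark exactly as shown in the notebook cell.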
TGI-launcher.ipynb
CHANGED
The diff for this file is too large to render.
See raw diff