{ "cells": [ { "cell_type": "markdown", "id": "28e3460d-59b1-4d4c-b62e-510987fb2f28", "metadata": {}, "source": [ "# Introduction\n", "This notebook is to show how to launch the TGI Benchmark tool. \n" ] }, { "cell_type": "markdown", "id": "c0de3cc9-c6cd-45b3-9dd0-84b3cb2fc8b2", "metadata": {}, "source": [ "Here we can see the different settings for TGI Benchmark. \n", "\n", "Here are some of the more important ones:\n", "\n", "- `--tokenizer-name` This is required so the tool knows what tokenizer to use\n", "- `--batch-size` This is important for load testing. We should use more and more values to see what happens to throughput and latency\n", "- `--sequence-length` AKA input tokens, it is important to match your use-case needs\n", "- `--decode-length` AKA output tokens, it is important to match your use-case needs\n", "- `--runs` 10 is the default\n", "\n", "
\n",
" 💡 Tip: Use a low number for --runs
when you are exploring but a higher number as you finalize to get more precise statistics\n",
"
\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "694df6d6-a521-4dab-977b-2828d4250781",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Text Generation Benchmarking tool\n",
"\n",
"\u001B[1m\u001B[4mUsage:\u001B[0m \u001B[1mtext-generation-benchmark\u001B[0m [OPTIONS] \u001B[1m--tokenizer-name\u001B[0m \n", " ⚠️ Warning: Please note that the TGI Benchmark tool is designed to work in a terminal, not a jupyter notebook. This means you will need to copy/paste the command in a jupyter terminal tab. I am putting them here for convenience.\n", "\n", "\n", "```bash\n", "text-generation-benchmark \\\n", "--tokenizer-name astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit \\\n", "--sequence-length 70 \\\n", "--decode-length 50 \\\n", "--batch-size 1 \\\n", "--batch-size 2 \\\n", "--batch-size 4 \\\n", "--batch-size 8 \\\n", "--batch-size 16 \\\n", "--batch-size 32 \\\n", "--batch-size 64 \\\n", "--batch-size 128 \n", "```\n", "\n", "Hit `q` to stop the tool." ] }, { "cell_type": "code", "execution_count": null, "id": "13ac475b-44e1-47e4-85ce-def2db6879c9", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 5 }