From 1a1c6c3cf7f8d0ed20d22584e1133f39b28d0fd0 Mon Sep 17 00:00:00 2001 From: Frank <3429989+FrankD412@users.noreply.github.com> Date: Wed, 16 Jul 2025 18:15:06 -0700 Subject: [PATCH] [TRTLLM-6070] docs: Add initial documentation for trtllm-bench CLI. (#5734) Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com> Signed-off-by: Frank <3429989+FrankD412@users.noreply.github.com> Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com> Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com> --- docs/source/1.0/cli-reference/trtllm-bench.md | 1 - .../source/1.0/cli-reference/trtllm-bench.rst | 164 ++++++++++++++++++ docs/source/index.rst | 2 +- 3 files changed, 165 insertions(+), 2 deletions(-) delete mode 100644 docs/source/1.0/cli-reference/trtllm-bench.md create mode 100644 docs/source/1.0/cli-reference/trtllm-bench.rst diff --git a/docs/source/1.0/cli-reference/trtllm-bench.md b/docs/source/1.0/cli-reference/trtllm-bench.md deleted file mode 100644 index 359a8025242..00000000000 --- a/docs/source/1.0/cli-reference/trtllm-bench.md +++ /dev/null @@ -1 +0,0 @@ -# trtllm-bench (trtllm-6070) diff --git a/docs/source/1.0/cli-reference/trtllm-bench.rst b/docs/source/1.0/cli-reference/trtllm-bench.rst new file mode 100644 index 00000000000..7f03c8dfc66 --- /dev/null +++ b/docs/source/1.0/cli-reference/trtllm-bench.rst @@ -0,0 +1,164 @@ +trtllm-bench +=========================== + +trtllm-bench is a comprehensive benchmarking tool for TensorRT-LLM engines. It provides three main subcommands for different benchmarking scenarios: + +**Common Options for All Commands:** + +**Usage:** + +.. click:: tensorrt_llm.commands.bench:main + :prog: trtllm-bench + :nested: full + :commands: throughput, latency, build + + + +prepare_dataset.py +=========================== + +trtllm-bench is designed to work with the `prepare_dataset.py `_ script, which generates benchmark datasets in the required format. The prepare_dataset script supports: + +**Dataset Types:** + +- Real datasets from various sources +- Synthetic datasets with normal or uniform token distributions +- LoRA task-specific datasets + +**Key Features:** + +- Tokenizer integration for proper text preprocessing +- Configurable random seeds for reproducible results +- Support for LoRA adapters and task IDs +- Output in JSON format compatible with trtllm-bench + +.. important:: + The ``--stdout`` flag is **required** when using prepare_dataset.py with trtllm-bench to ensure proper data streaming format. + +**Usage:** + +prepare_dataset +------------------- + +.. code-block:: bash + + python prepare_dataset.py [OPTIONS] + +**Options** + +---- + +.. list-table:: + :widths: 20 80 + :header-rows: 1 + + * - Option + - Description + * - ``--tokenizer`` + - Tokenizer directory or HuggingFace model name (required) + * - ``--output`` + - Output JSON filename (default: preprocessed_dataset.json) + * - ``--stdout`` + - Print output to stdout with JSON dataset entry on each line (**required for trtllm-bench**) + * - ``--random-seed`` + - Random seed for token generation (default: 420) + * - ``--task-id`` + - LoRA task ID (default: -1) + * - ``--rand-task-id`` + - Random LoRA task range (two integers) + * - ``--lora-dir`` + - Directory containing LoRA adapters + * - ``--log-level`` + - Logging level: info or debug (default: info) + +dataset +------------------- + +Process real datasets from various sources. + +.. code-block:: bash + + python prepare_dataset.py dataset [OPTIONS] + +**Options** + +---- + +.. list-table:: + :widths: 20 80 + :header-rows: 1 + + * - Option + - Description + * - ``--input`` + - Input dataset file or directory (required) + * - ``--max-input-length`` + - Maximum input sequence length (default: 2048) + * - ``--max-output-length`` + - Maximum output sequence length (default: 512) + * - ``--num-samples`` + - Number of samples to process (default: all) + * - ``--format`` + - Input format: json, jsonl, csv, or txt (default: auto-detect) + + +token_norm_dist +------------------- + +Generate synthetic datasets with normal token distribution. + +.. code-block:: bash + + python prepare_dataset.py token_norm_dist [OPTIONS] + +**Options** + +---- + +.. list-table:: + :widths: 20 80 + :header-rows: 1 + + * - Option + - Description + * - ``--num-requests`` + - Number of requests to be generated (required) + * - ``--input-mean`` + - Normal distribution mean for input tokens (required) + * - ``--input-stdev`` + - Normal distribution standard deviation for input tokens (required) + * - ``--output-mean`` + - Normal distribution mean for output tokens (required) + * - ``--output-stdev`` + - Normal distribution standard deviation for output tokens (required) + + +token_unif_dist +------------------- + +Generate synthetic datasets with uniform token distribution + +.. code-block:: bash + + python prepare_dataset.py token_unif_dist [OPTIONS] + +**Options** + +---- + +.. list-table:: + :widths: 20 80 + :header-rows: 1 + + * - Option + - Description + * - ``--num-requests`` + - Number of requests to be generated (required) + * - ``--input-min`` + - Uniform distribution minimum for input tokens (required) + * - ``--input-max`` + - Uniform distribution maximum for input tokens (required) + * - ``--output-min`` + - Uniform distribution minimum for output tokens (required) + * - ``--output-max`` + - Uniform distribution maximum for output tokens (required) diff --git a/docs/source/index.rst b/docs/source/index.rst index 2d435553d17..7bafedbbd21 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -37,7 +37,7 @@ Welcome to TensorRT-LLM's Documentation! :caption: CLI Reference 1.0/cli-reference/trtllm-serve.md - 1.0/cli-reference/trtllm-bench.md + 1.0/cli-reference/trtllm-bench.rst 1.0/cli-reference/trtllm-eval.md