diff --git a/docs/source/commands/trtllm-bench.rst b/docs/source/commands/trtllm-bench.rst
new file mode 100644
index 00000000000..7f03c8dfc66
--- /dev/null
+++ b/docs/source/commands/trtllm-bench.rst
@@ -0,0 +1,164 @@
+trtllm-bench
+===========================
+
+trtllm-bench is a comprehensive benchmarking tool for TensorRT-LLM engines. It provides three main subcommands for different benchmarking scenarios: ``throughput``, ``latency``, and ``build``.
+
+**Common Options for All Commands:**
+
+**Usage:**
+
+.. click:: tensorrt_llm.commands.bench:main
+   :prog: trtllm-bench
+   :nested: full
+   :commands: throughput, latency, build
+
+
+
+prepare_dataset.py
+===========================
+
+trtllm-bench is designed to work with the `prepare_dataset.py `_ script, which generates benchmark datasets in the required format. The prepare_dataset script supports:
+
+**Dataset Types:**
+
+- Real datasets from various sources
+- Synthetic datasets with normal or uniform token distributions
+- LoRA task-specific datasets
+
+**Key Features:**
+
+- Tokenizer integration for proper text preprocessing
+- Configurable random seeds for reproducible results
+- Support for LoRA adapters and task IDs
+- Output in JSON format compatible with trtllm-bench
+
+.. important::
+   The ``--stdout`` flag is **required** when using prepare_dataset.py with trtllm-bench to ensure proper data streaming format.
+
+**Usage:**
+
+prepare_dataset
+-------------------
+
+.. code-block:: bash
+
+   python prepare_dataset.py [OPTIONS]
+
+**Options**
+
+----
+
+.. list-table::
+   :widths: 20 80
+   :header-rows: 1
+
+   * - Option
+     - Description
+   * - ``--tokenizer``
+     - Tokenizer directory or HuggingFace model name (required)
+   * - ``--output``
+     - Output JSON filename (default: preprocessed_dataset.json)
+   * - ``--stdout``
+     - Print output to stdout, one JSON dataset entry per line (**required for trtllm-bench**)
+   * - ``--random-seed``
+     - Random seed for token generation (default: 420)
+   * - ``--task-id``
+     - LoRA task ID (default: -1)
+   * - ``--rand-task-id``
+     - Random LoRA task range (two integers)
+   * - ``--lora-dir``
+     - Directory containing LoRA adapters
+   * - ``--log-level``
+     - Logging level: info or debug (default: info)
+
+dataset
+-------------------
+
+Process real datasets from various sources.
+
+.. code-block:: bash
+
+   python prepare_dataset.py dataset [OPTIONS]
+
+**Options**
+
+----
+
+.. list-table::
+   :widths: 20 80
+   :header-rows: 1
+
+   * - Option
+     - Description
+   * - ``--input``
+     - Input dataset file or directory (required)
+   * - ``--max-input-length``
+     - Maximum input sequence length (default: 2048)
+   * - ``--max-output-length``
+     - Maximum output sequence length (default: 512)
+   * - ``--num-samples``
+     - Number of samples to process (default: all)
+   * - ``--format``
+     - Input format: json, jsonl, csv, or txt (default: auto-detect)
+
+
+token_norm_dist
+-------------------
+
+Generate synthetic datasets with normal token distribution.
+
+.. code-block:: bash
+
+   python prepare_dataset.py token_norm_dist [OPTIONS]
+
+**Options**
+
+----
+
+.. list-table::
+   :widths: 20 80
+   :header-rows: 1
+
+   * - Option
+     - Description
+   * - ``--num-requests``
+     - Number of requests to be generated (required)
+   * - ``--input-mean``
+     - Normal distribution mean for input tokens (required)
+   * - ``--input-stdev``
+     - Normal distribution standard deviation for input tokens (required)
+   * - ``--output-mean``
+     - Normal distribution mean for output tokens (required)
+   * - ``--output-stdev``
+     - Normal distribution standard deviation for output tokens (required)
+
+
+token_unif_dist
+-------------------
+
+Generate synthetic datasets with uniform token distribution.
+
+.. code-block:: bash
+
+   python prepare_dataset.py token_unif_dist [OPTIONS]
+
+**Options**
+
+----
+
+.. list-table::
+   :widths: 20 80
+   :header-rows: 1
+
+   * - Option
+     - Description
+   * - ``--num-requests``
+     - Number of requests to be generated (required)
+   * - ``--input-min``
+     - Uniform distribution minimum for input tokens (required)
+   * - ``--input-max``
+     - Uniform distribution maximum for input tokens (required)
+   * - ``--output-min``
+     - Uniform distribution minimum for output tokens (required)
+   * - ``--output-max``
+     - Uniform distribution maximum for output tokens (required)
diff --git a/docs/source/index.rst b/docs/source/index.rst
index b63ec95a676..50b9c122678 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -77,6 +77,7 @@ Welcome to TensorRT-LLM's Documentation!
    :caption: Command-Line Reference
    :hidden:
 
+   commands/trtllm-bench
    commands/trtllm-build
    commands/trtllm-serve