Commit fff33e6

link to trtllm README
Signed-off-by: richardhuo-nv <[email protected]>
1 parent 742a0cb commit fff33e6

1 file changed: +6 -0 lines changed

components/backends/trtllm/README.md

Lines changed: 6 additions & 0 deletions
@@ -303,3 +303,9 @@ sampling_params.logits_processor = create_trtllm_adapters(processors)
 ## Performance Sweep
 
 For detailed instructions on running comprehensive performance sweeps across both aggregated and disaggregated serving configurations, see the [TensorRT-LLM Benchmark Scripts for DeepSeek R1 model](./performance_sweeps/README.md). This guide covers recommended benchmarking setups, usage of provided scripts, and best practices for evaluating system performance.
+
+## Dynamo KV Block Manager Integration
+
+Dynamo with TensorRT-LLM currently supports integration with the Dynamo KV Block Manager. This integration can significantly reduce time-to-first-token (TTFT) latency, particularly in usage patterns such as multi-turn conversations and repeated long-context requests.
+
+Here is the instruction: [Running KVBM in TensorRT-LLM](./../../../docs/guides/run_kvbm_in_trtllm.md) .
