[Metrics] Deprecate TPOT in favor of ITL #24110
Conversation
The only case where we don't want to assert the existence of a metric is when it is deprecated and we're not showing hidden deprecated metrics. Signed-off-by: Mark McLoughlin <[email protected]>
Code Review
This pull request correctly deprecates the vllm:time_per_output_token_seconds (TPOT) metric in favor of the more accurately named vllm:inter_token_latency_seconds (ITL). The changes are consistently applied across the codebase, including metrics definitions, logging, tests, and the Grafana dashboard example. The deprecation strategy of retaining the old metric for backward compatibility while introducing the new one is sound. I've found one minor issue with the documentation of the new metric, which appears to be a copy-paste error.
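The deprecation strategy described above (keep the old metric around, but only export it when hidden deprecated metrics are shown) can be sketched in plain Python. This is a hypothetical illustration, not vLLM's actual classes; the metric names come from the PR, but `Metric`, `export`, and the `show_hidden_deprecated` flag are assumptions for the sake of the example.

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    """Minimal stand-in for a Prometheus-style metric definition."""
    name: str
    documentation: str
    deprecated: bool = False
    samples: list = field(default_factory=list)

    def observe(self, value: float) -> None:
        self.samples.append(value)

def export(metrics, show_hidden_deprecated=False):
    """Return the metric names that would be exposed to scrapers.

    Deprecated metrics are hidden unless the operator opts back in,
    mirroring the 'not showing hidden deprecated metrics' case above.
    """
    return [
        m.name
        for m in metrics
        if not m.deprecated or show_hidden_deprecated
    ]

itl = Metric("vllm:inter_token_latency_seconds",
             "Histogram of inter-token latency in seconds.")
tpot = Metric("vllm:time_per_output_token_seconds",
              "DEPRECATED: use vllm:inter_token_latency_seconds instead.",
              deprecated=True)

print(export([itl, tpot]))
print(export([itl, tpot], show_hidden_deprecated=True))
```

Retaining the old series this way means existing dashboards and alerts keep working through the deprecation window, while new deployments see only the accurately named metric by default.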
As per vllm-project#24015, what we currently call TPOT should instead be called ITL, since what we are actually measuring is the time between iterations, and a single iteration can produce multiple tokens. Signed-off-by: Mark McLoughlin <[email protected]>
Force-pushed from b176439 to 09dbc43
LGTM, thanks for updating
Signed-off-by: Mark McLoughlin <[email protected]>
As per #24015, what we currently call TPOT should instead be called ITL, since what we are actually measuring is the time between iterations, and a single iteration can produce multiple tokens.
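The distinction can be shown with a small worked example. The numbers here are made up for illustration: four engine iterations of 0.05 s each, where one iteration (e.g. via speculative decoding) emits three tokens at once. "Time per output token" divides by tokens produced, while inter-token latency is the time between successive iterations, so the two diverge as soon as any iteration emits more than one token.

```python
# Hypothetical timings: seconds per engine step and tokens emitted per step.
iteration_times = [0.05, 0.05, 0.05, 0.05]
tokens_per_iteration = [1, 1, 3, 1]

total_time = sum(iteration_times)      # 0.2 s
total_tokens = sum(tokens_per_iteration)  # 6 tokens

# "TPOT" as previously computed: total time divided by tokens produced.
tpot = total_time / total_tokens

# ITL: average time between iterations, regardless of how many tokens
# each iteration produced.
itl = total_time / len(iteration_times)

print(f"TPOT: {tpot:.4f}s, ITL: {itl:.4f}s")
```

With multi-token iterations, TPOT (≈0.033 s here) understates the gap a client actually observes between iterations (0.05 s), which is what the ITL name captures.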
I'm flagging the TPOT metric as deprecated from 0.11. Even if this gets released in a 0.10.x release, I think the deprecation period should only start from when it lands in a new minor 0.N.0 release.
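The "deprecation period starts at the next minor release" rule amounts to a simple version comparison. A minimal sketch, assuming version tuples and a `deprecation_started` helper that are illustrative, not vLLM's implementation:

```python
def deprecation_started(deprecated_since, current_release):
    """True once the running release is at or past the minor release
    (0.N.0) in which the metric was marked deprecated.

    Versions are (major, minor) tuples; tuple comparison gives the
    right ordering.
    """
    return current_release >= deprecated_since

# Metric flagged as deprecated from 0.11:
print(deprecation_started((0, 11), (0, 10)))  # still on 0.10.x: not started
print(deprecation_started((0, 11), (0, 11)))  # 0.11.0 shipped: window begins
```

This keeps the clock from starting early if the change happens to ship in a 0.10.x patch release, as the comment above argues.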