
Conversation

@markmc (Member) commented Sep 2, 2025

As per #24015, what we currently call TPOT should instead be called ITL, since what we are actually measuring is the time between iterations, and a single iteration can produce multiple tokens.

I'm flagging the TPOT metric as deprecated starting from 0.11. Even if this change ships in a 0.10.x release, I think the deprecation period should only start once it lands in a new minor 0.N.0 release.
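
For context, a minimal sketch of what keeping both histograms side by side during the deprecation window could look like, assuming prometheus_client-style registration; the metric names match the PR, but the buckets and helper function here are illustrative, not the actual vLLM code:

```python
# Illustrative sketch only -- not the actual vLLM implementation.
from prometheus_client import Histogram

# New, accurately named metric: the time between engine iterations.
inter_token_latency = Histogram(
    "vllm:inter_token_latency_seconds",
    "Histogram of inter-token latency in seconds.",
    buckets=[0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0],  # illustrative buckets
)

# Old name kept for backward compatibility during the deprecation period.
time_per_output_token = Histogram(
    "vllm:time_per_output_token_seconds",
    "Histogram of time per output token in seconds. "
    "DEPRECATED: use vllm:inter_token_latency_seconds instead.",
    buckets=[0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0],
)

def observe_iteration_gap(seconds: float) -> None:
    """Record one inter-iteration interval under both names."""
    inter_token_latency.observe(seconds)
    time_per_output_token.observe(seconds)
```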

The only case where we don't want to assert the existence
of a metric is where it is deprecated and we're not showing
hidden deprecated metrics.

Signed-off-by: Mark McLoughlin <[email protected]>
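
A hedged sketch of that assertion rule as a test helper; the names EXPECTED_METRICS, DEPRECATED_METRICS, and show_hidden_metrics are assumptions for illustration, not the PR's actual test code:

```python
# Illustrative sketch of the assertion logic described above.
EXPECTED_METRICS = {
    "vllm:inter_token_latency_seconds",
    "vllm:time_per_output_token_seconds",  # deprecated alias
}
DEPRECATED_METRICS = {"vllm:time_per_output_token_seconds"}

def assert_metrics_exist(exposed: set[str], show_hidden_metrics: bool) -> None:
    """Assert every expected metric is exposed, except deprecated ones
    when hidden deprecated metrics are not being shown."""
    for name in EXPECTED_METRICS:
        if name in DEPRECATED_METRICS and not show_hidden_metrics:
            continue  # the only case where existence is not asserted
        assert name in exposed, f"missing metric: {name}"
```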
@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request correctly deprecates the vllm:time_per_output_token_seconds (TPOT) metric in favor of the more accurately named vllm:inter_token_latency_seconds (ITL). The changes are consistently applied across the codebase, including metrics definitions, logging, tests, and the Grafana dashboard example. The deprecation strategy of retaining the old metric for backward compatibility while introducing the new one is sound. I've found one minor issue with the documentation of the new metric, which appears to be a copy-paste error.
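
For operators updating dashboards or scrapers, a small migration-safe sketch; the endpoint URL and the fallback logic are illustrative assumptions, not part of this PR:

```python
# Illustrative consumer-side sketch: prefer the new ITL metric and fall
# back to the deprecated TPOT name on older vLLM versions. Assumes a
# vLLM server exposing Prometheus text format at /metrics (hypothetical URL).
import urllib.request

METRICS_URL = "http://localhost:8000/metrics"

def pick_latency_metric() -> str:
    with urllib.request.urlopen(METRICS_URL) as resp:
        body = resp.read().decode()
    if "vllm:inter_token_latency_seconds" in body:
        return "vllm:inter_token_latency_seconds"
    # Older servers only expose the deprecated name.
    return "vllm:time_per_output_token_seconds"
```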

As per vllm-project#24015, what we currently call TPOT should instead be called
ITL since what we are actually measuring is the time between
iterations, and a single iteration can produce multiple tokens.

Signed-off-by: Mark McLoughlin <[email protected]>
@markmc force-pushed the metrics-rename-tpot-to-itl branch from b176439 to 09dbc43 on September 2, 2025 15:42
@DarkLight1337 (Member) left a comment


LGTM, thanks for updating

@DarkLight1337 enabled auto-merge (squash) on September 2, 2025 15:48
@github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) on Sep 2, 2025
@DarkLight1337 merged commit 2417798 into vllm-project:main on Sep 2, 2025
44 of 46 checks passed
845473182 pushed a commit to 845473182/vllm that referenced this pull request Sep 3, 2025
* 'main' of https://github.com/845473182/vllm: (457 commits, including [Metrics] Deprecate TPOT in favor of ITL (vllm-project#24110))
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

Labels

documentation: Improvements or additions to documentation
ready: ONLY add when PR is ready to merge/full CI is needed
v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants