[Log] Optimize startup log #28948
Conversation
Signed-off-by: yewentao256 <[email protected]>
Code Review
This pull request optimizes startup logs to reduce duplication when running in a distributed environment (data parallelism / tensor parallelism). The changes correctly use logger.info_once and logger.warning_once with appropriate scopes (global or local) to ensure that messages are logged only once, or once per node, as intended. The implementation is correct and improves the logging behavior. For consistency and further log reduction, you might consider applying this pattern to other logger.info and logger.warning calls that are executed by multiple processes during startup.
njhill
left a comment
Thanks @yewentao256!
    envs.VLLM_TORCH_PROFILER_WITH_STACK,
    envs.VLLM_TORCH_PROFILER_WITH_FLOPS,
)
if getattr(self.parallel_config, "data_parallel_rank", 0) == 0:
I don't think we need to use getattr here since it should always be an attribute.
Also, we should maybe use data_parallel_rank_local instead so that this is logged on each node?
e.g.
- if getattr(self.parallel_config, "data_parallel_rank", 0) == 0:
+ if self.parallel_config.data_parallel_rank_local in (None, 0):
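The suggested guard can be exercised in isolation. Below is a hedged sketch assuming a config whose `data_parallel_rank_local` is `None` when data parallelism is disabled and a 0-based local rank otherwise; the `ParallelConfig` stand-in and `should_log` helper are hypothetical names, not vLLM's API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ParallelConfig:
    # Stand-in for the real config: None means "no DP", otherwise
    # the 0-based rank of this process on its node.
    data_parallel_rank_local: Optional[int] = None

def should_log(cfg: ParallelConfig) -> bool:
    # Log on the first local rank of each node, and also when DP is
    # disabled (rank_local is None), matching the suggested check
    # `data_parallel_rank_local in (None, 0)`.
    return cfg.data_parallel_rank_local in (None, 0)

assert should_log(ParallelConfig(None))   # no DP: the single process logs
assert should_log(ParallelConfig(0))      # first rank on each node logs
assert not should_log(ParallelConfig(1))  # other local ranks stay quiet
```

Using the local rank rather than the global rank means the message still appears once per node, which is useful when each node writes its own log file.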
Purpose
Reduce duplicate startup logs when using data parallelism (DP) / tensor parallelism (TP).