- 
                Notifications
    You must be signed in to change notification settings 
- Fork 528
[Bugfix][Worker] Clear NPU memory between test profiling #989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Thanks for this fixing! I think we can derectly move gc collection and peak mem status reset into  | 
| 
 I have assembled the 3 methods into  | 
| @wangxiyuan The CI is passed, and this PR is may needed for #969. | 
| 
 Why not just do it in https://github.com/vllm-project/vllm-ascend/pull/989/files#diff-efe99563375ba645b1c1befb237632c2c91f85b21d2ab2d1095eef82aecd3999L106 | 
| 
 Because maybe there are some other scenarios just need  | 
| please merge this pr and #969 into one | 
| This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
| hope can merge, sleep mode v1 need this | 
| 
 I will rebase soon. | 
| This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
f56d491    to
    c509396      
    Compare
  
    | This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
| This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
| This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
| This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
| # Profile the memory usage of the model and get the maximum number of | ||
| # cache blocks that can be allocated with the remaining free memory. | ||
| NPUPlatform.empty_cache() | ||
| clear_npu_memory() | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between the two memory clear methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@momo609 clear_npu_memory() = gc.collect() + empty_cache() + reset_peak_memory_stats() (which will make the profile in dummy run more accurate).
| This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
| This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
| CC: @wangxiyuan | 
| This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
Signed-off-by: shen-shanshan <[email protected]>
Signed-off-by: shen-shanshan <[email protected]>
| Codecov Report❌ Patch coverage is  
 Additional details and impacted files@@            Coverage Diff             @@
##             main     #989      +/-   ##
==========================================
- Coverage   27.39%   27.34%   -0.05%     
==========================================
  Files          56       56              
  Lines        6191     6183       -8     
==========================================
- Hits         1696     1691       -5     
+ Misses       4495     4492       -3     
 Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
 | 
| This pull request has conflicts, please resolve those before we can evaluate the pull request. | 
What this PR does / why we need it?
Clear NPU memory between test profiling.
If we don't clear the memory this way, the CI may get
OOMerror between UTs.This modification is adapted from
vllm/worker/worker.py, find more details at https://github.com/vllm-project/vllm/blob/main/vllm/worker/worker.py#L184-L186.Does this PR introduce any user-facing change?
How was this patch tested?