fix: [nvbug5300494] Use runtime total gpu memory to calculate kv cache memory and log more memory information #4660
/bot run
/bot run
PR_Github #6467 [ run ] triggered by Bot
PR_Github #6467 [ run ] completed with state
/bot run
PR_Github #6510 [ run ] triggered by Bot
/bot run --disable-fail-fast
PR_Github #6514 [ run ] triggered by Bot
PR_Github #6510 [ run ] completed with state
PR_Github #6514 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #6586 [ run ] triggered by Bot
PR_Github #6586 [ run ] completed with state
/bot run --stage-list="H100_PCIe-PyTorch-3"
PR_Github #6651 [ run ] triggered by Bot
PR_Github #6651 [ run ] completed with state
Change method to compute peak memory
Set new peak memory for case test_ptq_quickstart_advanced_mtp
Get non-torch memory at the start of KV memory estimation
Signed-off-by: Hui Gao <[email protected]>
/bot skip --comment="CI has passed."
PR_Github #6687 [ skip ] triggered by Bot
PR_Github #6687 [ skip ] completed with state
…e memory and log more memory information (NVIDIA#4660) Signed-off-by: Hui Gao <[email protected]>
Use the total GPU memory reported at runtime to calculate KV cache memory, and log more memory information.
This avoids under-reporting the memory available for the KV cache.
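
To illustrate the idea, here is a minimal sketch of how a KV cache budget could be derived from the total GPU memory observed at runtime. It is not the code from this PR: the names `MemorySnapshot`, `non_torch_bytes`, `kv_cache_budget_bytes`, and `free_fraction` are hypothetical, and the formula is an assumption about how peak memory and non-torch memory might be combined. In a PyTorch process the inputs could plausibly come from `torch.cuda.mem_get_info()` and `torch.cuda.memory_stats()`; the sketch takes plain integers so it runs without a GPU.

```python
from dataclasses import dataclass

GIB = 1 << 30


@dataclass
class MemorySnapshot:
    """Hypothetical container for memory readings taken at runtime."""
    total_bytes: int           # total device memory reported at runtime
    free_bytes: int            # free device memory at the same moment
    torch_reserved_bytes: int  # memory currently reserved by the torch allocator
    torch_peak_bytes: int      # peak torch allocation seen during the estimation run


def non_torch_bytes(snap: MemorySnapshot) -> int:
    """Memory in use on the device that the torch allocator does not account for."""
    used = snap.total_bytes - snap.free_bytes
    return max(0, used - snap.torch_reserved_bytes)


def kv_cache_budget_bytes(snap: MemorySnapshot, free_fraction: float = 0.9) -> int:
    """Assumed formula: a fraction of (runtime total - peak torch - non-torch)."""
    peak_usage = snap.torch_peak_bytes + non_torch_bytes(snap)
    available = max(0, snap.total_bytes - peak_usage)
    return int(available * free_fraction)


def describe(snap: MemorySnapshot, free_fraction: float = 0.9) -> str:
    """Build a log line with the individual terms, in GiB."""
    budget = kv_cache_budget_bytes(snap, free_fraction)
    return (
        f"total={snap.total_bytes / GIB:.2f} GiB, "
        f"torch_peak={snap.torch_peak_bytes / GIB:.2f} GiB, "
        f"non_torch={non_torch_bytes(snap) / GIB:.2f} GiB, "
        f"kv_budget={budget / GIB:.2f} GiB"
    )


if __name__ == "__main__":
    example = MemorySnapshot(
        total_bytes=80 * GIB,
        free_bytes=50 * GIB,
        torch_reserved_bytes=20 * GIB,
        torch_peak_bytes=25 * GIB,
    )
    print(describe(example))
```

Basing the calculation on the runtime total rather than a nominal device size means every term is measured from the same device state, which is one way the reported KV cache memory could stop coming out smaller than what is actually available.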