test: waive failed cases #6394
Conversation
## Walkthrough
This update modifies a test in the Llama3 integration suite by adjusting cache and model parameters, and expands the integration test skip list by adding seven new test cases with references to known bugs. No changes were made to public interfaces or exported entities.
## Changes
| Cohort / File(s) | Change Summary |
|----------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|
| **Llama3 Integration Test Parameter Update**<br>`tests/integration/defs/accuracy/test_llm_api_pytorch.py` | Modified `test_fp8_tp4` in `TestLlama3_3_70BInstruct`: changed `free_gpu_memory_fraction` from 0.6 to 0.5, and added `max_seq_len=8192` and `max_batch_size=32` to the LLM constructor. |
| **Integration Test Skip List Expansion**<br>`tests/integration/test_lists/waives.txt` | Added seven new test cases to the skip list, each with a specific test identifier and NVIDIA bug tracking URL. No existing entries were changed. |
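For context, here is a minimal sketch of what the updated `LLM` construction in `test_fp8_tp4` might look like after this change. The changed parameter values come from the diff summary above; the model path, import layout, and surrounding code are assumptions for illustration, not the actual test body.

```python
from tensorrt_llm import LLM
from tensorrt_llm.llmapi import KvCacheConfig

# Reserve half of the free GPU memory for the KV cache (lowered from 0.6).
kv_cache_config = KvCacheConfig(free_gpu_memory_fraction=0.5)

llm = LLM(
    model="Llama-3.3-70B-Instruct-FP8",  # placeholder path for illustration
    tensor_parallel_size=4,              # TP=4, as the test name test_fp8_tp4 implies
    kv_cache_config=kv_cache_config,
    max_seq_len=8192,                    # explicit sequence-length cap added by this PR
    max_batch_size=32,                   # explicit batch-size cap added by this PR
)
```

Together, the lower memory fraction and the explicit caps bound the test's peak memory use instead of leaving it to defaults.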
## Estimated code review effort
🎯 2 (Simple) | ⏱️ ~7 minutes
## Suggested reviewers
- crazydemo
- LarryXFly
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/integration/test_lists/waives.txt (1)
445-450: Minor consistency nit – align bug-URL style across new waives. Existing waives mix `https://nvbugs/…` and `https://nvbugspro.nvidia.com/bug/…`. For the seven newly-added cases (lines 445-450), please follow the repo-wide convention (the most recent entries use the full `nvbugspro.nvidia.com/bug/` form). Keeping a single style avoids churn when scripts normalise or sort the list.
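As an illustration of the two styles (the test identifier, bug numbers, and `SKIP (...)` layout below are invented for the example, not entries from the actual file):

```text
path/to/test_file.py::TestCase::test_name SKIP (https://nvbugs/1234567)
path/to/test_file.py::TestCase::test_name SKIP (https://nvbugspro.nvidia.com/bug/1234567)
```

The first line shows the short `nvbugs` form; the second shows the full form that the most recent entries use and that this nit asks for.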
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
tests/integration/defs/accuracy/test_llm_api_pytorch.py (1 hunks)
tests/integration/test_lists/waives.txt (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
tests/integration/test_lists/waives.txt (1)
Learnt from: yiqingy0
PR: #5198
File: jenkins/mergeWaiveList.py:0-0
Timestamp: 2025-07-22T08:33:49.109Z
Learning: In the TensorRT-LLM waive list merging system, removed lines are always located at the end of the merge waive lists, which is why the mergeWaiveList.py script uses reverse traversal - it's an optimization for this specific domain constraint.
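To make that constraint concrete, here is a hypothetical sketch of the reverse-traversal merge the learning describes. The `-` removal prefix and all names are assumptions for illustration; the real `jenkins/mergeWaiveList.py` is more involved.

```python
def merge_waive_lists(current: list[str], updates: list[str]) -> list[str]:
    """Merge an update list into a waive list.

    Assumes removals ("-"-prefixed lines) always sit at the end of
    `updates`, so walking backwards processes every removal before
    any addition is ever examined.
    """
    merged = list(current)

    # Peel removals off the tail until the first non-removal line.
    idx = len(updates)
    while idx > 0 and updates[idx - 1].startswith("-"):
        idx -= 1
        target = updates[idx][1:].strip()
        if target in merged:
            merged.remove(target)

    # Everything before the removal block is an addition, kept in order.
    merged.extend(line.strip() for line in updates[:idx])
    return merged
```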
🔇 Additional comments (3)
tests/integration/defs/accuracy/test_llm_api_pytorch.py (3)
364-364: LGTM: Memory fraction adjustment for test stabilization. Reducing `free_gpu_memory_fraction` from 0.6 to 0.5 is a reasonable adjustment to avoid OOM issues with the large 70B model in CI environments.
367-368: LGTM: Explicit resource limits improve test stability. Adding `max_seq_len=8192` and `max_batch_size=32` provides explicit resource bounds that help prevent memory-related test failures while still allowing meaningful accuracy evaluation.
364-369: Well-coordinated parameter adjustments for test reliability.The combined changes (reduced memory fraction + explicit resource limits) form a cohesive approach to stabilizing this resource-intensive 70B model test while preserving its accuracy evaluation purpose. This is preferable to completely skipping the test.
/bot run
PR_Github #13166 [ run ] triggered by Bot
PR_Github #13166 [ run ] completed with state
/bot run
PR_Github #13168 [ run ] triggered by Bot
PR_Github #13168 [ run ] completed with state
Signed-off-by: Xin He (SW-GPU) <[email protected]>
Signed-off-by: Xin He (SW-GPU) <[email protected]> Signed-off-by: Shreyas Misra <[email protected]>
Signed-off-by: Xin He (SW-GPU) <[email protected]> Signed-off-by: Ransiki Zhang <[email protected]>
Signed-off-by: Xin He (SW-GPU) <[email protected]> Signed-off-by: Lanyu Liao <[email protected]>
waive failed cases