Commit 24e7f4e

[nvbug/5410296][fix] Fix OOM in Llama 4 disagg-serve tests (#6439)
Signed-off-by: Bo Deng <[email protected]>
1 parent 9632dba commit 24e7f4e

3 files changed: +3 -3 lines changed


tests/integration/defs/accuracy/test_disaggregated_serving.py

Lines changed: 3 additions & 0 deletions
@@ -413,6 +413,9 @@ def test_auto_dtype(self, overlap_scheduler):
         gen_server_config = {"disable_overlap_scheduler": overlap_scheduler}
         ctx_server_config["cache_transceiver_config"] = {"backend": "default"}
         gen_server_config["cache_transceiver_config"] = {"backend": "default"}
+        # Keep this low to avoid warmup OOM in CI
+        ctx_server_config["max_seq_len"] = 8192
+        gen_server_config["max_seq_len"] = 8192
         disaggregated_server_config = {
             "hostname": "localhost",
             "port": 8000,

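Reviewer note: a minimal sketch of the per-server configs this hunk produces. The helper name `build_server_configs` is hypothetical, and because the initialization of `ctx_server_config` lies outside the hunk, the sketch assumes it starts empty. It only illustrates that the context and generation servers receive the same `max_seq_len` cap; it is not the test's actual code.

```python
def build_server_configs(overlap_scheduler, max_seq_len=8192):
    """Hypothetical helper mirroring the dict updates shown in the hunk."""
    # Assumption: the real ctx_server_config is created before the hunk and
    # its initial contents are not visible in this diff, so start empty.
    ctx_server_config = {}
    gen_server_config = {"disable_overlap_scheduler": overlap_scheduler}
    for cfg in (ctx_server_config, gen_server_config):
        cfg["cache_transceiver_config"] = {"backend": "default"}
        # Cap the sequence length on both servers; the diff's comment says
        # this is kept low to avoid warmup OOM in CI.
        cfg["max_seq_len"] = max_seq_len
    return ctx_server_config, gen_server_config


ctx_cfg, gen_cfg = build_server_configs(overlap_scheduler=False)
print(ctx_cfg["max_seq_len"], gen_cfg["max_seq_len"])
```

Setting the cap on both dicts matters because each server is configured independently, so a cap applied to only one of them would leave the other at its default.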
tests/integration/defs/disaggregated/test_disaggregated.py

Lines changed: 0 additions & 1 deletion
@@ -745,7 +745,6 @@ def test_disaggregated_deepseek_v3_lite_fp8_ucx(disaggregated_test_root,
                            cwd=llm_venv.get_working_directory())
 
 
-@skip_no_hopper
 @skip_arm
 @pytest.mark.parametrize("deepseek_v3_model_root", ['DeepSeek-V3-Lite-fp8'],
                          indirect=True)

tests/integration/test_lists/waives.txt

Lines changed: 0 additions & 2 deletions
@@ -419,8 +419,6 @@ accuracy/test_llm_api_pytorch.py::TestLlama3_1_8BInstruct::test_ngram SKIP (http
 test_e2e.py::test_openai_multi_chat_example SKIP (https://nvbugs/5409416)
 test_e2e.py::test_ptp_quickstart_multimodal[llava-v1.6-mistral-7b-llava-v1.6-mistral-7b-hf-image-False] SKIP (https://nvbugs/5409417)
 test_e2e.py::test_ptp_star_attention_example[Llama3.1-8B-BF16-llama-3.1-model/Meta-Llama-3.1-8B] SKIP (https://nvbugs/5409420)
-accuracy/test_disaggregated_serving.py::TestLlama4ScoutInstruct::test_auto_dtype[False] SKIP (https://nvbugs/5410296)
-accuracy/test_disaggregated_serving.py::TestLlama4ScoutInstruct::test_auto_dtype[True] SKIP (https://nvbugs/5410296)
 llmapi/test_llm_examples.py::test_llmapi_speculative_decoding_mtp SKIP (https://nvbugs/5410399)
 unittest/trt/attention/test_gpt_attention.py -k "partition0" SKIP (https://nvbugs/5412456)
 unittest/trt/attention/test_gpt_attention.py -k "partition1" SKIP (https://nvbugs/5412456)
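
Reviewer note: a rough sketch of what removing the two waive entries does. The entry shape `<test id> SKIP (<bug URL>)` is inferred from the lines in this hunk, and `parse_waives` is a hypothetical helper, not the project's actual waive-list loader.

```python
import re

# Assumed entry shape, inferred from the hunk: "<test id> SKIP (<reason>)".
WAIVE_RE = re.compile(r"^(?P<test>.+?) SKIP \((?P<reason>.+)\)$")


def parse_waives(text):
    """Hypothetical parser: map each waived test id to its stated reason."""
    waived = {}
    for line in text.splitlines():
        match = WAIVE_RE.match(line.strip())
        if match:
            waived[match.group("test")] = match.group("reason")
    return waived


LLAMA4 = ("accuracy/test_disaggregated_serving.py::"
          "TestLlama4ScoutInstruct::test_auto_dtype[False]")
KEPT = "test_e2e.py::test_openai_multi_chat_example SKIP (https://nvbugs/5409416)"

before = "\n".join([KEPT, LLAMA4 + " SKIP (https://nvbugs/5410296)"])
after = KEPT  # the Llama 4 entry is deleted by this commit

waived_before = parse_waives(before)
waived_after = parse_waives(after)
print(LLAMA4 in waived_before, LLAMA4 in waived_after)  # prints: True False
```

Under these assumptions, the Llama 4 test no longer appears in the waived set after the change, so it would run again in CI.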
