Closed
Labels: ci-failure (Issue about an unexpected test failure in CI)
Description
Name of failing test
lora/test_llama_tp.py::test_tp2_serialize_and_deserialize_lora
Basic information
- Flaky test
- Can reproduce locally
- Caused by external libraries (e.g. bug in transformers)
🧪 Describe the failing test
https://buildkite.com/vllm/ci/builds/23536/steps/canvas?sid=0197f0f3-a191-49c0-aef5-89d61c597808
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) WARNING 07-09 16:17:11 [tensorizer.py:226] Provided both tensorizer_dir and tensorizer_uri. Inferring tensorizer_dir from tensorizer_uri as the latter takes precedence.
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] Traceback (most recent call last):
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 461, in worker_main
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] worker = WorkerProc(*args, **kwargs)
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 358, in __init__
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] self.worker.load_model()
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 186, in load_model
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] self.model_runner.load_model()
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1773, in load_model
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] model_loader = get_model_loader(self.load_config)
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/__init__.py", line 33, in get_model_loader
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] return TensorizerLoader(load_config)
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/tensorizer_loader.py", line 45, in __init__
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] self.tensorizer_config = TensorizerConfig(
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] ^^^^^^^^^^^^^^^^^
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] File "<string>", line 16, in __init__
[2025-07-09T23:17:11Z] (VllmWorker rank=0 pid=11292) ERROR 07-09 16:17:11 [multiproc_executor.py:487] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/tensorizer.py", line 232, in __post_init__
📝 History of failing test
It seems this failure was introduced by #19619, which added the following check to tensorizer.py:
    if self.tensorizer_dir and self.lora_dir:
        raise ValueError(
            "Only one of tensorizer_dir or lora_dir may be specified. "
            "Use lora_dir exclusively when serializing LoRA adapters, "
            "and tensorizer_dir or tensorizer_uri otherwise.")
It seems the failing test here wasn't triggered by the conditional check.
CC List.
@sangstar @Eta0 @aarnphm @jeejeelee please take a look
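For anyone triaging, here is a minimal sketch of how the check quoted in the history section could interact with the warning at the top of the log. It is not vLLM's code: `SketchTensorizerConfig` and `try_build` are hypothetical names, the paths are made up, and the rule "infer `tensorizer_dir` from `tensorizer_uri`" is an assumption read off the warning message only. If the real `__post_init__` behaves similarly, setting `tensorizer_uri` together with `lora_dir` would be enough to raise, even though `tensorizer_dir` was never passed explicitly.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchTensorizerConfig:
    """Hypothetical stand-in for illustration; NOT vLLM's TensorizerConfig."""

    tensorizer_uri: Optional[str] = None
    tensorizer_dir: Optional[str] = None
    lora_dir: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: modeled on the logged warning, where tensorizer_dir
        # is inferred from tensorizer_uri because the URI takes precedence.
        if self.tensorizer_uri:
            self.tensorizer_dir = os.path.dirname(self.tensorizer_uri)

        # The check quoted in the issue, copied as-is.
        if self.tensorizer_dir and self.lora_dir:
            raise ValueError(
                "Only one of tensorizer_dir or lora_dir may be specified. "
                "Use lora_dir exclusively when serializing LoRA adapters, "
                "and tensorizer_dir or tensorizer_uri otherwise.")


def try_build(**kwargs) -> str:
    """Build the sketch config and report whether the check fired."""
    try:
        SketchTensorizerConfig(**kwargs)
        return "ok"
    except ValueError as exc:
        return f"ValueError: {exc}"


if __name__ == "__main__":
    # URI plus lora_dir: the inferred tensorizer_dir trips the check.
    print(try_build(tensorizer_uri="/tmp/model/model.tensors",
                    lora_dir="/tmp/lora"))
    # lora_dir alone passes in this sketch.
    print(try_build(lora_dir="/tmp/lora"))
```

Whether the failing test actually takes this path needs to be confirmed against the full traceback, which is cut off above at `__post_init__`.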