Commit 052788e

Isotr0py authored and weilong.yu committed
[Bugfix][CPU] Fix CPU embedding runner with tensor parallel (vllm-project#10394)
Signed-off-by: Isotr0py <[email protected]>
1 parent 03e6bc2 commit 052788e

1 file changed: +4 -0 lines changed

vllm/worker/cpu_embedding_model_runner.py

Lines changed: 4 additions & 0 deletions
@@ -66,6 +66,10 @@ def execute_model(

         hidden_states = model_executable(**execute_model_kwargs)

+        # Only perform pooling in the driver worker.
+        if not self.is_driver_worker:
+            return []
+
         return [
             self.model.pooler(hidden_states=hidden_states,
                               pooling_metadata=model_input.pooling_metadata)
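
The guard added in this change can be illustrated with a small, self-contained sketch. It is not the vLLM implementation: `ToyEmbeddingRunner`, `_forward`, `_mean_pool`, and `run_all_ranks` are hypothetical names, plain Python lists stand in for tensors, and treating rank 0 as the driver is an assumption made only for the example. The idea it shows is that every tensor-parallel worker runs the forward pass, but only the driver worker pools and returns a result.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyEmbeddingRunner:
    """Hypothetical stand-in for an embedding model runner (not the vLLM class)."""

    is_driver_worker: bool

    def _forward(self, token_vectors: List[List[float]]) -> List[List[float]]:
        # Placeholder forward pass: every worker executes this step.
        return token_vectors

    def _mean_pool(self, hidden_states: List[List[float]]) -> List[float]:
        # Average the per-token vectors into a single embedding.
        n = len(hidden_states)
        dim = len(hidden_states[0])
        return [sum(row[i] for row in hidden_states) / n for i in range(dim)]

    def execute_model(self, token_vectors: List[List[float]]) -> List[List[float]]:
        hidden_states = self._forward(token_vectors)

        # Only perform pooling in the driver worker.
        if not self.is_driver_worker:
            return []

        return [self._mean_pool(hidden_states)]


def run_all_ranks(world_size: int,
                  token_vectors: List[List[float]]) -> List[List[List[float]]]:
    """Run one step on each simulated rank; rank 0 is assumed to be the driver."""
    outputs = []
    for rank in range(world_size):
        runner = ToyEmbeddingRunner(is_driver_worker=(rank == 0))
        outputs.append(runner.execute_model(token_vectors))
    return outputs
```

Calling `run_all_ranks(2, [[1.0, 3.0], [3.0, 5.0]])` gives one pooled embedding from the simulated driver and an empty list from the other rank, mirroring the early `return []` in the diff.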

0 commit comments
