[Hardware][TPU] Implement tensor parallelism with Ray #5871

WoosukKwon · 2024-06-26T20:07:04Z

This PR implements a Ray TPU executor to support distributed inference on TPU.
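As a rough illustration of what tensor parallelism means here (this is not code from the PR): each worker holds a column shard of a weight matrix, computes its slice of the output, and the slices are concatenated by an all-gather. The sketch below simulates that on a single process in plain Python. The names `shard_columns`, `all_gather`, and `tensor_parallel_matmul` are hypothetical helpers chosen for the example, and no Ray, XLA, or vLLM API is used.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, w: Matrix) -> Matrix:
    """Plain (rows x k) @ (k x cols) matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)]
            for row in x]


def shard_columns(w: Matrix, world_size: int) -> List[Matrix]:
    """Split w column-wise into world_size equal shards (hypothetical helper)."""
    cols = len(w[0])
    assert cols % world_size == 0, "columns must divide evenly across workers"
    step = cols // world_size
    return [[row[r * step:(r + 1) * step] for row in w]
            for r in range(world_size)]


def all_gather(parts: List[Matrix]) -> Matrix:
    """Simulated all-gather: concatenate every rank's output along the last dim."""
    return [sum((p[i] for p in parts), []) for i in range(len(parts[0]))]


def tensor_parallel_matmul(x: Matrix, w: Matrix, world_size: int) -> Matrix:
    """Each simulated worker multiplies by its own shard; results are gathered."""
    shards = shard_columns(w, world_size)
    partial_outputs = [matmul(x, shard) for shard in shards]
    return all_gather(partial_outputs)


if __name__ == "__main__":
    x = [[1.0, 2.0]]
    w = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0]]
    print(tensor_parallel_matmul(x, w, world_size=2))
```

In a real deployment the shards would live on separate TPU devices and the gather would be a collective operation across them; the simulation only shows that the sharded computation matches the unsharded product.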

WoosukKwon · 2024-06-26T21:36:43Z

~~For this PR, I will merge it after getting reviews. :)~~

The changes outside the TPU backend were reviewed in #6812 and #6813.

WoosukKwon added 20 commits June 24, 2024 01:53

Add & warnings

76fc072

Add in dummy_run

27a5ad8

Add is_driver_worker

5ab6f65

Make TPUExecutor similar to GPUExecutor

c4e79a0

Add multiprocessing-based TPU executor

ff81993

Use TPU to initialize Ray cluster

16e80b2

Add pjrt proc init

05884ce

Add Ray TPU executor

20d23eb

Use Ray TPU executor for tp

5d4df21

Minor

6b2c76c

Fix TPUWorker.execute_model

d91446b

Add is_driver_worker & input broadcast

ab1595d

Call xm._init_world_size_ordinal

4b45393

Bug fix on vocab

86451a2

Use all gather for TPU

0539299

Support TPU in GroupCoordinator

b35917c

Delete multiproc TPU executor

b9a84bc

Minor

c756b76

[Bugfix][TPU] Fix CPU cache allocation & swapping

16e9934

Merge branch 'fix-tpu-swpa' into tpu-n

e25f470

WoosukKwon added the tpu label (Related to Google TPUs) Jun 26, 2024

WoosukKwon added 2 commits June 26, 2024 20:15

yapf

ca6d1d6

Add Ray to TPU dependency

cd4f68d

WoosukKwon changed the title ~~[Hardware][TPU] Support tensor parallelism with Ray~~ [Hardware][TPU] Implement tensor parallelism with Ray Jun 26, 2024

WoosukKwon added 3 commits June 26, 2024 20:44

Merge branch 'main' into tpu-n

5df4164

Fix

546987a

Fix

330be6e

WoosukKwon added 2 commits June 29, 2024 05:42

Merge branch 'main' into tpu-n

b45ed24

Add use_all_gather to LoRA

8fab9fd

WoosukKwon added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jul 21, 2024

WoosukKwon added 9 commits July 21, 2024 10:26

Fix patch

dcb63b7

Fix typo

825cc44

Merge branch 'main' into tpu-n

f27ef99

Remove inference_mode

9730288

Add no_grad

631b08b

Merge branch 'main' into tpu-n

d65a7d0

Merge branch 'main' into tpu-n

755fe0b

Merge branch 'main' into tpu-n

d5fadfd

[TPU] Support collective communications in XLA devices

af3a259

This was referenced Jul 26, 2024

[TPU] Support collective communications in XLA devices #6813

Merged

[Misc][TPU] Support TPU in initialize_ray_cluster #6812

Merged

WoosukKwon added 11 commits July 26, 2024 02:25

Use current_platform

0f2abea

is_xla -> is_tpu

8ebea7e

Define TPU communicator

782b182

Merge branch 'main' into tpu-n

76fd300

Merge branch 'add-xla-comm' into tpu-n

75f842b

Fix

8087227

Address comments

f04e179

Device init

f493c89

Fix patch

f14b085

Merge branch 'add-xla-comm' into tpu-n

1668582

Merge branch 'main' into tpu-n

a05cf0f

WoosukKwon merged commit 52f07e3 into main Jul 27, 2024

WoosukKwon deleted the tpu-n branch July 27, 2024 03:54

dtrifiro mentioned this pull request Aug 5, 2024

Sync with [email protected] opendatahub-io/vllm#120

Closed

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

[Hardware][TPU] Implement tensor parallelism with Ray (vllm-project#5871)

08963df

Signed-off-by: Alvant <[email protected]>

LeiWang1999 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Mar 26, 2025

[Hardware][TPU] Implement tensor parallelism with Ray (vllm-project#5871)

eac7762

Signed-off-by: LeiWang1999 <[email protected]>
