electron271

Per https://rocm.docs.amd.com/en/latest/reference/gpu-arch-specs.html, most non-Instinct GPUs support a warp size of 32.

Tested on an RX 9070 XT. I'm looking into getting this tested on AMD Instinct accelerators to ensure GPUs with a warp size of 64 still work.
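
For illustration only, here is a minimal host-side C++ sketch of why the warp size matters. It is not taken from this PR: `warps_per_block` and `lane_id` are hypothetical helper names, and the block size of 256 is an assumed value. The point is that a value hard-coded for 64-wide wavefronts yields a different warp count and lane index on hardware that runs with a warp size of 32.

```cpp
#include <cassert>

// Hypothetical helper (not from the PR): number of warps needed to cover
// a thread block. Rounds up so a partially filled last warp is counted.
constexpr int warps_per_block(int block_threads, int warp_size) {
    return (block_threads + warp_size - 1) / warp_size;
}

// Hypothetical helper (not from the PR): position of a thread inside its warp.
constexpr int lane_id(int thread_idx, int warp_size) {
    return thread_idx % warp_size;
}

// Assumed block size of 256 threads, chosen only for the example.
// With a warp size of 32 the block splits into 8 warps; with 64, into 4.
static_assert(warps_per_block(256, 32) == 8, "256 threads / 32 lanes");
static_assert(warps_per_block(256, 64) == 4, "256 threads / 64 lanes");

// Thread 40 sits in lane 8 of the second warp when the warp size is 32,
// but in lane 40 of the first warp when the warp size is 64.
static_assert(lane_id(40, 32) == 8, "lane index for warp size 32");
static_assert(lane_id(40, 64) == 40, "lane index for warp size 64");
```

Any reduction or shuffle that assumes one of these layouts would need the warp size passed in (or queried) rather than fixed, which is why testing on both kinds of hardware is useful.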

@matthewdouglas
Member

Thanks for the PR! I don't have the bandwidth to test this personally at the moment, so will defer to AMD team. Also I do not have any RDNA GPUs on hand.

cc: @pnunna93


github-actions bot commented Sep 9, 2025

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@pnunna93 left a comment

Thanks for the PR! It's good to go once the warp size change is made.

@matthewdouglas
Member

Hi @electron271
There are still a couple of conflicts, mostly because we removed all of the imports related to IPEX. If you don't mind fixing those, I think we can merge after that! Thanks!

matthewdouglas previously approved these changes Oct 3, 2025
matthewdouglas added this to the v0.49.0 milestone Oct 3, 2025
@electron271
Author

Will look through all this soon. Sorry, I have been somewhat busy.

#define DENORM 1.0f/127.0f
#define MAX_SPARSE_COUNT 32
#define SMEM_SIZE 8*256
#if defined(__GFX9__)
Contributor

This is unused now and can be removed.


#define ERR_NOT_IMPLEMENTED 100

#if defined(__GFX9__)
Contributor

This is unused now and can be removed.

4 participants