
Conversation

@MasterJH5574
Contributor

Following up on a previous PR, this PR adds cast and reinterpret support between `__nv_fp4_e2m1` and other dtypes. It also ensures that both the cast and the reinterpret paths can be vectorized.
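
To illustrate the difference between a cast (value conversion) and a reinterpret (same bits, different type), here is a minimal Python sketch of the E2M1 format: 1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1. It is not the code generated by this PR. The helper names (`fp4_to_float`, `float_to_fp4`, `unpack_fp4x2`) are hypothetical, and both the round-to-nearest-even tie rule and the low-nibble-first packing order are assumptions made for illustration.

```python
# Magnitudes representable in E2M1, indexed by the low 3 bits (exponent, mantissa).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def fp4_to_float(bits: int) -> float:
    """Cast: decode a 4-bit E2M1 pattern into its numeric value."""
    sign = -1.0 if bits & 0x8 else 1.0
    exp = (bits >> 1) & 0x3
    man = bits & 0x1
    if exp == 0:
        mag = 0.5 * man  # subnormal: 0.0 or 0.5
    else:
        mag = (1.0 + 0.5 * man) * 2.0 ** (exp - 1)  # normal, bias = 1
    return sign * mag


def float_to_fp4(x: float) -> int:
    """Cast: encode a float as the nearest E2M1 pattern, saturating at +/-6.

    Ties are broken toward the code with an even mantissa bit (assumed
    rounding mode). NaN inputs are not handled in this sketch.
    """
    sign = 0x8 if x < 0 else 0x0
    mag = abs(x)
    idx = min(
        range(len(E2M1_MAGNITUDES)),
        key=lambda i: (abs(E2M1_MAGNITUDES[i] - mag), i & 1),
    )
    return sign | idx


def unpack_fp4x2(byte: int) -> tuple:
    """Vectorized cast of two packed FP4 values (low nibble first, assumed)."""
    return fp4_to_float(byte & 0xF), fp4_to_float((byte >> 4) & 0xF)


if __name__ == "__main__":
    bits = float_to_fp4(3.0)
    # A reinterpret keeps `bits` unchanged and only relabels its type;
    # a cast maps it to the value that the bit pattern encodes.
    print(f"bits={bits:04b} value={fp4_to_float(bits)}")
```

Packing two 4-bit values per byte is what makes a vectorized path natural for this dtype: one byte load yields two elements, so casts can be processed in pairs.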

@MasterJH5574 MasterJH5574 force-pushed the tvm-dev/2025-03-05-fp4-cast-reinterpret branch 2 times, most recently from fff8e1b to bb1a492 Compare March 6, 2025 03:54
@MasterJH5574 MasterJH5574 force-pushed the tvm-dev/2025-03-05-fp4-cast-reinterpret branch from bb1a492 to 448eb23 Compare March 6, 2025 05:09
@tqchen tqchen merged commit c19e5f4 into apache:main Mar 6, 2025
14 checks passed
ShiboXing pushed a commit to ShiboXing/tvm that referenced this pull request Aug 10, 2025
* [CUDA] FP4 cast and reinterpret support
* change to float4_e2m1fn
2 participants