Support per-row scaling #424
@jananisriram has exported this pull request. If you are a Meta employee, you can view the originating diff in D82516347.
Summary: Support per-row scaling for the FP8 Blackwell persistent + TMA kernel with warp specialization.
Differential Revision: D82516347
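The summary does not show the kernel itself. As a rough illustration of what per-row scaling usually means for an FP8 GEMM, the sketch below emulates the arithmetic in plain Python: each row of A and each row of B^T gets its own scale, and the accumulated product is multiplied by both scales. Everything here is an assumption for illustration: the helper names `rowwise_scale` and `scaled_matmul` are hypothetical, the data layout is a guess, FP8 rounding is not modeled, and none of it reflects the Blackwell persistent + TMA implementation in this PR.

```python
# Illustrative only: plain-Python emulation of a row-wise scaled GEMM.
# FP8 rounding is NOT modeled; values are only rescaled into the E4M3 range.
# Helper names are hypothetical and do not come from this pull request.

FP8_E4M3_MAX = 448.0  # largest finite value of the float8 e4m3fn format


def rowwise_scale(mat):
    """Return (scaled_rows, scales) so each row fits within +/-FP8_E4M3_MAX."""
    scaled, scales = [], []
    for row in mat:
        amax = max(abs(v) for v in row)
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
        scaled.append([v / scale for v in row])
        scales.append(scale)
    return scaled, scales


def scaled_matmul(a_q, a_scales, b_q, b_scales):
    """C[i][j] = (sum_k a_q[i][k] * b_q[j][k]) * a_scales[i] * b_scales[j].

    b_q holds B transposed (one row per output column), so both operands
    carry exactly one scale per row.
    """
    return [
        [
            sum(x * y for x, y in zip(a_row, b_row)) * sa * sb
            for b_row, sb in zip(b_q, b_scales)
        ]
        for a_row, sa in zip(a_q, a_scales)
    ]


a = [[1.0, -2.0, 3.0], [1000.0, 0.5, -4.0]]    # A: shape (M, K)
b_t = [[0.25, 8.0, -1.0], [2.0, 0.0, 600.0]]   # B^T: shape (N, K)

a_q, a_s = rowwise_scale(a)
b_q, b_s = rowwise_scale(b_t)
c = scaled_matmul(a_q, a_s, b_q, b_s)
```

Compared with a single per-tensor scale, one scale per row keeps a row with small values from being squeezed by a large outlier elsewhere in the matrix, at the cost of applying two scale vectors to the accumulator instead of one scalar.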
Force-pushed from 4214366 to 133a192.
Force-pushed from 133a192 to 3cddb89.
Force-pushed from 3cddb89 to c932b68.
Summary: Support per-row scaling for the FP8 Blackwell persistent + TMA kernel with warp specialization.
Reviewed By: njriasan
Differential Revision: D82516347
Force-pushed from c932b68 to dcad224.
Force-pushed from dcad224 to df34b51.
Summary: Support per-row scaling for the FP8 Blackwell persistent + TMA kernel with warp specialization.
Reviewed By: NikhilAPatel, njriasan
Differential Revision: D82516347
Force-pushed from df34b51 to 47062df.