Add quantized embedding kernels to torchao #1018
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1018
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 2169c32 with merge base 7aaf0ff.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D63839255
Summary: This diff adds lowbit embedding kernels to torchao. These reuse the same bitpacking code as the linear kernels.
Pull Request resolved: pytorch#1018
Differential Revision: D63839255
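To make the summary concrete, here is a minimal pure-Python sketch of how one bitpacking routine can serve an embedding lookup: each row is quantized group-wise to signed `nbit` integers, packed into bytes, and only the requested row is unpacked and dequantized at lookup time. This is an illustration under stated assumptions, not the code in this PR: the function names (`quantize_row`, `pack_bits`, `unpack_bits`, `embedding_lookup`), the LSB-first bit layout, and the symmetric scale-only scheme are all hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple


def quantize_row(row: Sequence[float], nbit: int, group_size: int) -> Tuple[List[int], List[float]]:
    """Symmetric group-wise quantization to signed nbit integers (assumed scheme)."""
    qmin, qmax = -(2 ** (nbit - 1)), 2 ** (nbit - 1) - 1
    qvals: List[int] = []
    scales: List[float] = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        for v in group:
            qvals.append(max(qmin, min(qmax, round(v / scale))))
    return qvals, scales


def pack_bits(qvals: Sequence[int], nbit: int) -> bytes:
    """Pack signed nbit integers into bytes, LSB first (hypothetical layout)."""
    offset, mask = 2 ** (nbit - 1), (1 << nbit) - 1
    acc, bits_in_acc = 0, 0
    out = bytearray()
    for q in qvals:
        # Shift to an unsigned value, then append it above the bits already held.
        acc |= ((q + offset) & mask) << bits_in_acc
        bits_in_acc += nbit
        while bits_in_acc >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            bits_in_acc -= 8
    if bits_in_acc:
        out.append(acc & 0xFF)  # final partial byte, zero-padded
    return bytes(out)


def unpack_bits(packed: bytes, nbit: int, count: int) -> List[int]:
    """Inverse of pack_bits: recover `count` signed nbit integers."""
    offset, mask = 2 ** (nbit - 1), (1 << nbit) - 1
    acc, bits_in_acc = 0, 0
    out: List[int] = []
    for byte in packed:
        acc |= byte << bits_in_acc
        bits_in_acc += 8
        while bits_in_acc >= nbit and len(out) < count:
            out.append((acc & mask) - offset)
            acc >>= nbit
            bits_in_acc -= nbit
    return out


def embedding_lookup(packed_rows: Sequence[bytes], scales: Sequence[Sequence[float]],
                     index: int, nbit: int, embedding_dim: int, group_size: int) -> List[float]:
    """Unpack and dequantize a single embedding row."""
    qvals = unpack_bits(packed_rows[index], nbit, embedding_dim)
    return [q * scales[index][i // group_size] for i, q in enumerate(qvals)]


# Usage: quantize a tiny 2 x 4 table to 4 bits with groups of 2, then look up row 1.
table = [[0.5, -1.0, 0.25, 0.75], [2.0, -2.0, 1.0, 0.0]]
packed_rows, all_scales = [], []
for row in table:
    q, s = quantize_row(row, nbit=4, group_size=2)
    packed_rows.append(pack_bits(q, nbit=4))
    all_scales.append(s)
row1 = embedding_lookup(packed_rows, all_scales, index=1, nbit=4, embedding_dim=4, group_size=2)
```

The packing and unpacking helpers know nothing about embeddings, which is why a linear kernel could share them. The embedding path differs only in that it touches one row per index rather than the whole weight matrix.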
Force-pushed from cceed9e to d596bcb.
Force-pushed from d596bcb to 10ff165.
Force-pushed from 10ff165 to 99ca201.
Force-pushed from 99ca201 to 49363f4.
Force-pushed from 49363f4 to aabe6db.
Force-pushed from aabe6db to 17e2deb.
@jerryzh168 If things look good to you, could you also approve the diff, D63839255?
Force-pushed from 17e2deb to 7ef09aa.
@metascroy I'm mostly just stamping the ao/experimental PR. Do you need a proper review, or just stamps for the diff?
Force-pushed from 7ef09aa to 8b62d71.
Force-pushed from 8b62d71 to 3b92449.
Force-pushed from 3b92449 to 8361f53.
Force-pushed from 8361f53 to a719b34.
Force-pushed from a719b34 to 4aed4c2.
Force-pushed from 4aed4c2 to e5368f2.
Summary: This diff adds lowbit embedding kernels to torchao. These reuse the same bitpacking code as the linear kernels.
Pull Request resolved: pytorch#1018
Reviewed By: digantdesai
Differential Revision: D63839255
Force-pushed from e5368f2 to cd0f40f.
Force-pushed from cd0f40f to 7af0756.
Force-pushed from 7af0756 to 2169c32.
Summary: This improves best tokens/sec from 73 to 85. Co-authored-by: Jack-Khuu <[email protected]>