[TRTLLM-4501][feat] AutoTuner tuning config refactor and add tuning for kernel configs. #5236
base: main
…r kernel configs.

Adding a config entry in the tuning config to define the valid candidates for each part of the configs.
* AutoTuner will loop over a search grid generated from the config combinations.
* Each config will be tuned along with the specific input profile.
* The best config will be recorded in the cache value (instead of the cache key), and it will be recovered and used in the tunable runner forward.

Other enhancements:
* Use the decorator to make the tuning config definition more natural and efficient. This is an independent enhancement.
* Allow the user to not specify the gen_tuning_buckets or the map_to_tuning_buckets function.
* Code refactoring.

Signed-off-by: Yukun He <[email protected]>
The motivation for this PR is #4872, in which AutoTuner is applied to the FP8 batched GEMM op with `tile_size` and `epilog_tile_m` to be in the argument list. Generally, there are two possible implementations.

We choose the second method: adding a config entry in the tuning config to define the valid candidates for each part of the config.
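To make the chosen approach concrete, here is a minimal, self-contained sketch of the idea: per-field candidate lists are expanded into a search grid, every combination is measured for a given input profile, and the winning config is stored in the cache value while the profile alone serves as the key. All names here (`build_search_grid`, `ToyTuner`, `fake_cost`) and the candidate values are hypothetical and chosen for illustration only; this is not the AutoTuner API from this PR, and a real tuner would time kernel launches rather than call a cost function.

```python
import itertools
from typing import Any, Callable, Dict, Hashable, List, Tuple

# Hypothetical illustration only; not the TensorRT-LLM AutoTuner API.
Config = Dict[str, Any]


def build_search_grid(candidates: Dict[str, List[Any]]) -> List[Config]:
    """Expand per-field candidate lists into every config combination."""
    keys = list(candidates)
    value_lists = (candidates[k] for k in keys)
    return [dict(zip(keys, values)) for values in itertools.product(*value_lists)]


class ToyTuner:
    """Tunes each input profile over the grid and caches the best config."""

    def __init__(
        self,
        candidates: Dict[str, List[Any]],
        measure: Callable[[Hashable, Config], float],
    ) -> None:
        self.grid = build_search_grid(candidates)
        self.measure = measure
        # Key: input profile only. Value: (best config, measured cost).
        self.cache: Dict[Hashable, Tuple[Config, float]] = {}

    def tune(self, profile: Hashable) -> Config:
        """Measure every config for this profile and record the cheapest."""
        scored = ((cfg, self.measure(profile, cfg)) for cfg in self.grid)
        best = min(scored, key=lambda pair: pair[1])
        self.cache[profile] = best
        return best[0]

    def lookup(self, profile: Hashable) -> Config:
        """Recover the tuned config, as a runner's forward pass would."""
        return self.cache[profile][0]


def fake_cost(profile: Hashable, cfg: Config) -> float:
    """Stand-in for a kernel timing; the formula is arbitrary."""
    (m,) = profile
    return abs(cfg["tile_size"] - m) + cfg["epilog_tile_m"]


# Candidate values below are assumptions, not taken from the PR.
tuner = ToyTuner({"tile_size": [8, 16, 32], "epilog_tile_m": [64, 128]}, fake_cost)
best_config = tuner.tune((16,))
```

Because the config lives in the cache value, a lookup needs only the input profile, so the caller does not have to know the winning `tile_size` or `epilog_tile_m` in advance.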
Other enhancements: