Don't use an array when a simple shift/bit scan will do #79493
Closed
`instrDesc::idOpSize()` and `instrDesc::idOpSize(emitAttr opsz)` were using an array lookup when a simple shift or bit scan would have sufficed, since all inputs are powers of two.

On MSVC, the codegen changes from:

To:

As you can see, for `idOpSize`, rather than doing 3 memory lookups we now do just the one lookup and a `shl`. Likewise, for `idOpSize(emitAttr)`, rather than doing 4 memory accesses we now do just 2 with a simple `bsf` (effectively a `tzcnt`).

The memory accesses, even in the best-case scenario where they were cached, ended up being fairly complex, both in the addressing modes required to resolve them and in the number of indirections required to reach the data. Each indirection, even when the data is in L1, takes approximately 4 cycles to resolve. `shl` and `bsf`, in comparison, are both highly optimized instructions that typically take 1-4 cycles to compute (depending on target CPU).

This also marks `idOpSize()` as `const` so the compiler can understand it is non-mutating and only reading a field.
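For readers without the emitter source at hand, here is a minimal, self-contained sketch of the idea. It is not the actual JIT code: `InstrDescSketch`, `encodeWithTable`, `decodeWithTable`, `bitScanForward`, and the table layouts are hypothetical stand-ins, and the sketch assumes operand sizes are the powers of two 1 through 32 stored as a 3-bit log2 value.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Table-based encode (hypothetical layout): index by size - 1, read back log2(size).
inline unsigned encodeWithTable(unsigned size)
{
    static const uint8_t table[32] = {
        0, 1, 0, 2, 0, 0, 0, 3,
        0, 0, 0, 0, 0, 0, 0, 4,
        0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 5,
    };
    assert(size >= 1 && size <= 32);
    return table[size - 1];
}

// Table-based decode (hypothetical layout): index by log2(size), read back size.
inline unsigned decodeWithTable(unsigned encoded)
{
    static const unsigned table[6] = {1, 2, 4, 8, 16, 32};
    assert(encoded < 6);
    return table[encoded];
}

// Index of the lowest set bit; for a power of two this is log2(value).
inline unsigned bitScanForward(unsigned value)
{
    assert(value != 0);
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanForward(&index, value);
    return static_cast<unsigned>(index);
#else
    return static_cast<unsigned>(__builtin_ctz(value));
#endif
}

// Hypothetical stand-in for instrDesc: the size is stored as a 3-bit log2 value.
struct InstrDescSketch
{
    unsigned _idOpSize : 3;

    // Getter: a shift replaces the decode table. Marked const because it only reads a field.
    unsigned idOpSize() const
    {
        return 1u << _idOpSize;
    }

    // Setter: a bit scan replaces the encode table. Input must be a power of two.
    void idOpSize(unsigned sizeInBytes)
    {
        assert(sizeInBytes != 0 && (sizeInBytes & (sizeInBytes - 1)) == 0);
        _idOpSize = bitScanForward(sizeInBytes);
    }
};
```

Under these assumptions the shift and bit scan agree with the tables for every valid size, so the swap changes only how the value is computed, not what is stored.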