Closed
Labels
arch-arm64; area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI); tenet-performance (Performance related issue)
Description
Today we don't CSE the load of an indirect cell's target address, because that information is not yet present in the IR when CSE runs. The load only appears in a later phase such as lowering.
Consider the following code pattern:
...
9000000B adrp x11, [RELOC #0x1de756537c0]
9100016B add x11, x11, #0
F9400160 ldr x0, [x11]
D63F0000 blr x0
...
9000000B adrp x11, [RELOC #0x1de756537c0]
9100016B add x11, x11, #0
F9400160 ldr x0, [x11]
D63F0000 blr x0
AA0003EF mov x15, x0
...
We could optimize this, using a peephole pass or a more ambitious scanner phase over the final instructions, into something like this:
...
9000000B adrp x11, [RELOC #0x1de756537c0]
9100016B add x11, x11, #0
F9400160 ldr x0, [x11]
mov xR, x0 ; store x0 in some register xR
D63F0000 blr x0
...
mov x0, xR ; retrieve xR into x0
D63F0000 blr x0
AA0003EF mov x15, x0
...
With this, the repeated call site shrinks by 8 bytes (two instructions instead of four) and one memory access is eliminated.
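The rewrite above can be sketched as a toy pass over a textual instruction listing. This is only an illustration in Python, not JIT code: the function name `cse_indirect_cell_loads` and the choice of `x19` as the spare register are hypothetical, and the sketch assumes the spare register is free, survives the calls (x19 is callee-saved under AAPCS64), and that the indirection cell is not rewritten between the two call sites.

```python
import re

# Matches the page-address half of an indirection cell load, e.g.
# "adrp x11, [RELOC #0x1de756537c0]"
ADRP = re.compile(r"adrp\s+(x\d+),\s*\[RELOC #(0x[0-9a-fA-F]+)\]")


def cse_indirect_cell_loads(instrs, spare="x19"):
    """Toy peephole: reuse the first loaded call target for later identical loads.

    Assumes (hypothetically) that `spare` is free and survives the calls, and
    that the cell contents do not change between call sites. Only a single
    cell address is cached, to keep the sketch short.
    """
    out = []
    cached_addr = None  # reloc address whose target currently lives in `spare`
    i = 0
    while i < len(instrs):
        m = ADRP.fullmatch(instrs[i])
        # Look for the 3-instruction group: adrp / add / ldr x0, [reg]
        if (m and i + 2 < len(instrs)
                and instrs[i + 1] == f"add {m.group(1)}, {m.group(1)}, #0"
                and instrs[i + 2] == f"ldr x0, [{m.group(1)}]"):
            addr = m.group(2)
            if addr == cached_addr:
                # Repeated load: one mov replaces adrp/add/ldr.
                out.append(f"mov x0, {spare}")
            elif cached_addr is None:
                # First load: keep it and stash the target in the spare register.
                out.extend(instrs[i:i + 3])
                out.append(f"mov {spare}, x0")
                cached_addr = addr
            else:
                # A different cell: left untouched in this sketch.
                out.extend(instrs[i:i + 3])
            i += 3
            continue
        out.append(instrs[i])
        i += 1
    return out
```

Fed the nine-instruction listing from this issue (with the encodings stripped), the pass keeps the first `adrp`/`add`/`ldr` group, inserts `mov x19, x0` after it, and turns the second call site into `mov x0, x19` followed by `blr x0`.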
I wrote an asm analyzer to find out how many addresses are CSE candidates, and the number is huge. From what I noticed, it would be a little over 2MB of size reduction.
Processed 191816 methods. Found 29246 methods containing 259123 groups.
Details: cse-candidates.txt
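For readers who want to reproduce this kind of count, here is a minimal sketch of such a scan in Python. It is not the analyzer used for the numbers above: the function names are hypothetical, a "group" is assumed to mean one reloc address that is loaded at least twice within a single method, and the size estimate assumes a flat 8 bytes saved per group.

```python
import re
from collections import Counter

# Finds the reloc address in lines such as
# "9000000B adrp x11, [RELOC #0x1de756537c0]"
RELOC = re.compile(r"adrp\s+x\d+,\s*\[RELOC #(0x[0-9a-fA-F]+)\]")


def count_cse_groups(methods):
    """methods: dict mapping a method name to its list of disassembly lines.

    Returns (methods_with_groups, total_groups). A "group" is assumed to be
    one reloc address that is loaded at least twice in the same method.
    """
    methods_with_groups = 0
    total_groups = 0
    for lines in methods.values():
        loads = Counter()
        for line in lines:
            m = RELOC.search(line)
            if m:
                loads[m.group(1)] += 1
        groups = sum(1 for n in loads.values() if n >= 2)
        if groups:
            methods_with_groups += 1
            total_groups += groups
    return methods_with_groups, total_groups


def estimated_savings_bytes(total_groups, bytes_per_group=8):
    # Assumption: every group saves a flat 8 bytes.
    return total_groups * bytes_per_group
```

Under that flat 8-byte assumption, 259123 groups come to 2072984 bytes, which is in line with the "little over 2MB" estimate.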
category:cq
theme:cse
skill-level:expert
cost:large
impact:large