Fix a 32-bit specific bug in `fgMorphCast` #52828

SingleAccretion · 2021-05-16T23:25:50Z

It used to be that in the importer, whether the cast was to be marked as GTF_UNSIGNED was decided exclusively based on the incoming opcode. However, the flag only makes sense for casts from integral sources, and it turns out morph had a bug where it failed to clear this flag which resulted in
bad codegen.

The bug went as follows: gtMorphCast turns casts from an FP type into a small integer into a chain of casts: CAST(small integer <- FP) => CAST(small integer <- CAST(TYP_INT <- FP)). On 32 bit platforms, the code failed to clear the GTF_UNSIGNED flag from the original tree, which meant that the outer cast thought it had TYP_UINT as the source. This matters for checked casts: conv.ovf.i2.un(-2.0d), which is a legitimate conversion, was interpreted
wrongly as an overflowing one as the resulting codegen only checked the upper bound via an unsigned compare.

The fix is two-fold: clear GTF_UNSIGNED for GT_CAST nodes with FP sources on creation and unify the 64 bit and 32 bit paths in gtMorphCast, which, after the removal of GTF_UNSIGNED, are identical.

This is a zero-diff change across all SPMI collections for Windows x64, Linux x64, Linux ARM64.

This is not a zero-diff change for Windows x86. Instances of bad codegen have been corrected in some tests.

Diffs for the aforementioned tests:

Top method regressions (bytes):
          12 (13.64% of base) : 235214.dasm - ConvTest:test25():short
          12 (13.64% of base) : 235147.dasm - ConvTest:test25():short
          10 (11.63% of base) : 235213.dasm - ConvTest:test24():byte
          10 (11.63% of base) : 235146.dasm - ConvTest:test24():byte
           9 ( 4.46% of base) : 86456.dasm - DevDiv_578217.Program:ILGEN_METHOD(ushort,float,int,ushort,ubyte,short,int):short
           7 ( 2.27% of base) : 86411.dasm - ILGEN_CLASS:ILGEN_METHOD(int):ubyte
           7 (10.45% of base) : 235247.dasm - ConvTest:test25():short
           7 (10.45% of base) : 235181.dasm - ConvTest:test25():short
           5 ( 7.69% of base) : 235246.dasm - ConvTest:test24():byte
           5 ( 2.67% of base) : 84471.dasm - FullProof:Test():int
           5 ( 7.69% of base) : 235180.dasm - ConvTest:test24():byte

Top method improvements (bytes):
         -67 (-33.00% of base) : 86409.dasm - ILGEN_CLASS:ILGEN_METHOD(short,ushort,ushort,int):float

ConvTest (JIT\Methodical\NaN\r8nanconv_il_r\r8nanconv), FullProof (JIT\Regression\VS-ia64-JIT\V2.0-Beta2\b309576\bug2) and DevDiv_578217 diffs are as expected, ~~still investigating what happened in the ILGEN method - it looks like UMOD was not transfomed into CORINFO_HELP_ULMOD~~ - weird code, see below.

SingleAccretion · 2021-05-18T10:14:23Z

So, the UMOD case is weird because the code that is responsible for it is questionable:

runtime/src/coreclr/jit/morph.cpp

Lines 12618 to 12662 in 5a0edac

    
                       /* If this is an unsigned long mod with op2 which is a cast to long from a 
        
                          constant int, then don't morph to a call to the helper.  This can be done 
        
                          faster inline using idiv. 
        
                       */ 
        
                       noway_assert(op2); 
        
                       if ((typ == TYP_LONG) && opts.OptEnabled(CLFLG_CONSTANTFOLD) && 
        
                           ((tree->gtFlags & GTF_UNSIGNED) == (op1->gtFlags & GTF_UNSIGNED)) && 
        
                           ((tree->gtFlags & GTF_UNSIGNED) == (op2->gtFlags & GTF_UNSIGNED))) 
        
                       { 
        
                           if (op2->gtOper == GT_CAST && op2->AsCast()->CastOp()->gtOper == GT_CNS_INT && 
        
                               op2->AsCast()->CastOp()->AsIntCon()->gtIconVal >= 2 && 
        
                               op2->AsCast()->CastOp()->AsIntCon()->gtIconVal <= 0x3fffffff && 
        
                               (tree->gtFlags & GTF_UNSIGNED) == (op2->AsCast()->CastOp()->gtFlags & GTF_UNSIGNED)) 
        
                           { 
        
                               tree->AsOp()->gtOp2 = op2 = fgMorphCast(op2); 
        
                               noway_assert(op2->gtOper == GT_CNS_NATIVELONG); 
        
                           } 
        
                           if (op2->gtOper == GT_CNS_NATIVELONG && op2->AsIntConCommon()->LngValue() >= 2 && 
        
                               op2->AsIntConCommon()->LngValue() <= 0x3fffffff) 
        
                           { 
        
                               tree->AsOp()->gtOp1 = op1 = fgMorphTree(op1); 
        
                               noway_assert(op1->TypeGet() == TYP_LONG); 
        
                               // Update flags for op1 morph 
        
                               tree->gtFlags &= ~GTF_ALL_EFFECT; 
        
                               tree->gtFlags |= (op1->gtFlags & GTF_ALL_EFFECT); // Only update with op1 as op2 is a constant 
        
                               // If op1 is a constant, then do constant folding of the division operator 
        
                               if (op1->gtOper == GT_CNS_NATIVELONG) 
        
                               { 
        
                                   tree = gtFoldExpr(tree); 
        
                               } 
        
                               // We may fail to fold 
        
                               if (!tree->OperIsConst()) 
        
                               { 
        
                                   tree->AsOp()->CheckDivideByConstOptimized(this); 
        
                               } 
        
                               return tree; 
        
                           } 
        
                       }

It appears that the intention was to check whether the operands of UMOD are unsigned, but that doesn't make a lot of sense because GTF_UNSIGNED is not responsible for that, and the fact that the node is a UMOD is already enough - it treats its operands as unsigned. In any case, the diff is caused by this "do GTF_UNSIGNEDs match" logic.
Base:

Before:
               [000028] ------------                            +--*  UMOD      long  
               [000025] ---------U--                            |  +--*  CAST      long <- ulong <- double
               [000024] ------------                            |  |  \--*  NEG       double
               [000023] ------------                            |  |     \--*  LCL_VAR   double V11 loc7         
               [000027] ------------                            |  \--*  CNS_LNG   long   0x000000002487c9b5
                                                                
After:                                                          
               [000028] --CX-+------                            +--*  CALL help long   HELPER.CORINFO_HELP_ULMOD
               [000025] --C--+---U-- arg0                       |  +--*  CALL help long   HELPER.CORINFO_HELP_DBL2ULNG
               [000024] -----+------ arg0                       |  |  \--*  NEG       double
               [000023] -----+------                            |  |     \--*  LCL_VAR   double V11 loc7         
               [000027] -----+------ arg1                       |  \--*  CNS_LNG   long   0x000000002487c9b5

Diff:

Before:
               [000028] ------------                            +--*  UMOD      long  
               [000025] ------------                            |  +--*  CAST      long <- ulong <- double
               [000024] ------------                            |  |  \--*  NEG       double
               [000023] ------------                            |  |     \--*  LCL_VAR   double V11 loc7         
               [000027] ------------                            |  \--*  CNS_LNG   long   0x000000002487c9b5
                                                                
After:                                                          
               [000028] --C--+------                            +--*  UMOD      long  
               [000025] --C--+------                            |  +--*  CALL help long   HELPER.CORINFO_HELP_DBL2ULNG
               [000024] -----+------ arg0                       |  |  \--*  NEG       double
               [000023] -----+------                            |  |     \--*  LCL_VAR   double V11 loc7         
               [000027] ------------                            |  \--*  CNS_LNG   long   0x000000002487c9b5

While it is weird, it should be correct.

Another note on the weird UMOD code: it is testing for a cast from a constant - that may be a remnant of the fact ldc.i4+conv.i8 used to not be folded.

Edit: sure enough, commenting out the "is op2 a cast from a constant" is a no-diff change for Windows x64 and x86.

src/tests/issues.targets

SingleAccretion · 2021-05-18T20:32:40Z

Ready for review.

cc @sandreenko for another morph change...

sandreenko · 2021-05-18T20:54:56Z

Thanks, @SingleAccretion, how did you find this issue?

It is great that you have added a new test for such cases.

I will try to take a look this week, but such change would require a second approval, I think, PTAL @dotnet/jit-contrib

SingleAccretion · 2021-05-18T21:04:44Z

how did you find this issue?

From experimenting with fgMorphCast, I suppose. I noticed that these two branches were doing almost the same thing, made them do the same thing, and while inspecting the diffs realized the (previous) codegen was wrong.

AndyAyersMS

LGTM

sandreenko · 2021-05-21T04:25:57Z

src/coreclr/jit/compiler.hpp

we are calling GenTreeCast constructor:

runtime/src/coreclr/jit/gentree.h

Lines 3531 to 3535 in 4197653

GenTreeCast(var_types type, GenTree* op, bool fromUnsigned, var_types castType DEBUGARG(bool largeNode = false))

: GenTreeOp(GT_CAST, type, op, nullptr DEBUGARG(largeNode)), gtCastType(castType)

{

gtFlags |= fromUnsigned ? GTF_UNSIGNED : GTF_EMPTY;

}

and after that we call CanonicalizeCastNode to clear the flag in some cases, is it worth it?
One option is to do

inline GenTreeCast* Compiler::gtNewCastNode(var_types typ, GenTree* op1, bool fromUnsigned, var_types castType) { if (varTypeIsFloating(cast->CastOp())) { fromUnsigned = false; } GenTreeCast* cast = new (this, GT_CAST) GenTreeCast(typ, op1, fromUnsigned, castType);

or add this logic to GenTreeCast constructor (note it will affect more cases then).

Often I set conditional breakpoints on flag values so I would prefer not to change it back and forth without a reason.

Have you tried to add assert(!varTypeIsFloating(cast->CastOp() || !fromUnsigned) and fix the callers who call it with wrong value of bool fromUnsigned?

Have you tried to add assert(!varTypeIsFloating(cast->CastOp() || !fromUnsigned) and fix the callers who call it with wrong value of bool fromUnsigned?

Yes, in fact that was the original version of the fix, and the only place I had to fix was the importer. I went back and forth on this a few times, and ultimately decided to have the change initially be with this more conservative implementation, because it is "guaranteed" (not really because I did not add it to the constructor, but nothing calls the constructor directly) to always return "correct" result to the caller. I suppose I should revisit it and run the assertty version through the CI. Maybe it turns out more callers are getting it wrong and it would be better to normalize centrally (regardless of that, the flag flipping will be fixed).

I suppose I should revisit it and run the assertty version through the CI. Maybe it turns out more callers are getting it wrong and it would be better to normalize centrally (regardless of that, the flag flipping will be fixed).

I would prefer this if you have time, if not please open a follow-up issue to investigate/clean it and we will merge this PR as is.

I would prefer this

I would as well, so we'll investigate it as part of this PR.

Verifying that checked '.un' casts from floats to small types are not treated as casts from unsigned types.

It used to be that in the importer, whether the cast was to be marked as GTF_UNSIGNED was decided exclusively based on the incoming opcode. However, the flag only makes sense for casts from integral sources, and it turns out morph had a bug where it failed to clear this flag which resulted in bad codegen. The bug went as follows: "gtMorphCast" turns casts from an FP type to a small integer into a chain of casts: CAST(small integer <- FP) => CAST(small integer <- CAST(TYP_INT <- FP)). On 32 bit platforms, the code failed to clear the GTF_UNSIGNED flag from the original tree, which meant that the outer cast thought it had TYP_UINT as the source. This matters for checked casts: conv.ovf.i2.un(-2.0d), which is a legitimate conversion, was interpreted wrongly as an overflowing one as the resulting codegen only checked the upper bound via an unsigned compare. The fix is two-fold: clear GTF_UNSIGNED for GT_CAST nodes with FP sources on creation and unify the 64 bit and 32 bit paths in "gtMorphCast", which, after the removal of GTF_UNSIGNED handling, are identical. This is a zero-diff change across all SPMI collections for Windows x64, Linux x64, Linux ARM64. This **is not** a zero-diff change for Windows x86. Instances of bad codegen have been corrected in some tests.

Instead of normalizing GTF_UNSIGNED for FP sources in "gtNewCastNode", assert that it is not set in GenTreeCast's constructor and fix the importer to respect that constraint.

sandreenko

LGTM, thank you!

ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 16, 2021

SingleAccretion force-pushed the Simplify-fgMorphCast-Phase-1 branch 2 times, most recently from f9e7b3e to 5a245e5 Compare May 18, 2021 09:52

SingleAccretion commented May 18, 2021

View reviewed changes

src/tests/issues.targets Outdated Show resolved Hide resolved

SingleAccretion closed this May 18, 2021

SingleAccretion reopened this May 18, 2021

SingleAccretion force-pushed the Simplify-fgMorphCast-Phase-1 branch from 5a245e5 to 676fd81 Compare May 18, 2021 13:54

SingleAccretion closed this May 18, 2021

SingleAccretion reopened this May 18, 2021

sandreenko marked this pull request as ready for review May 18, 2021 20:52

AndyAyersMS approved these changes May 21, 2021

View reviewed changes

sandreenko reviewed May 21, 2021

View reviewed changes

SingleAccretion added 2 commits May 21, 2021 23:51

Added a test

9d9e769

Verifying that checked '.un' casts from floats to small types are not treated as casts from unsigned types.

SingleAccretion force-pushed the Simplify-fgMorphCast-Phase-1 branch from a0ff6e3 to 395d829 Compare May 21, 2021 20:53

Assert instead of normalizing

a1a0850

Instead of normalizing GTF_UNSIGNED for FP sources in "gtNewCastNode", assert that it is not set in GenTreeCast's constructor and fix the importer to respect that constraint.

SingleAccretion force-pushed the Simplify-fgMorphCast-Phase-1 branch from 395d829 to a1a0850 Compare May 21, 2021 20:57

sandreenko approved these changes May 23, 2021

View reviewed changes

sandreenko merged commit 34e6bb4 into dotnet:main May 24, 2021

sandreenko mentioned this pull request May 24, 2021

ECMA is not clear about conv.ovf.i1.un behavior. #53189

Closed

SingleAccretion deleted the Simplify-fgMorphCast-Phase-1 branch May 26, 2021 20:03

ghost locked as resolved and limited conversation to collaborators Jun 25, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix a 32-bit specific bug in `fgMorphCast` #52828

Fix a 32-bit specific bug in `fgMorphCast` #52828

Uh oh!

SingleAccretion commented May 16, 2021 •

edited

Loading

Uh oh!

SingleAccretion commented May 18, 2021 •

edited

Loading

Uh oh!

Uh oh!

SingleAccretion commented May 18, 2021

Uh oh!

sandreenko commented May 18, 2021 •

edited

Loading

Uh oh!

SingleAccretion commented May 18, 2021

Uh oh!

AndyAyersMS left a comment

Uh oh!

sandreenko May 21, 2021

Uh oh!

SingleAccretion May 21, 2021

Uh oh!

sandreenko May 21, 2021

Uh oh!

SingleAccretion May 21, 2021

Uh oh!

sandreenko left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	GenTreeCast(var_types type, GenTree* op, bool fromUnsigned, var_types castType DEBUGARG(bool largeNode = false))
	: GenTreeOp(GT_CAST, type, op, nullptr DEBUGARG(largeNode)), gtCastType(castType)
	{
	gtFlags \|= fromUnsigned ? GTF_UNSIGNED : GTF_EMPTY;
	}

Fix a 32-bit specific bug in fgMorphCast #52828

Fix a 32-bit specific bug in fgMorphCast #52828

Uh oh!

Conversation

SingleAccretion commented May 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

SingleAccretion commented May 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

SingleAccretion commented May 18, 2021

Uh oh!

sandreenko commented May 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

SingleAccretion commented May 18, 2021

Uh oh!

AndyAyersMS left a comment

Choose a reason for hiding this comment

Uh oh!

sandreenko May 21, 2021

Choose a reason for hiding this comment

Uh oh!

SingleAccretion May 21, 2021

Choose a reason for hiding this comment

Uh oh!

sandreenko May 21, 2021

Choose a reason for hiding this comment

Uh oh!

SingleAccretion May 21, 2021

Choose a reason for hiding this comment

Uh oh!

sandreenko left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Fix a 32-bit specific bug in `fgMorphCast` #52828

Fix a 32-bit specific bug in `fgMorphCast` #52828

SingleAccretion commented May 16, 2021 •

edited

Loading

SingleAccretion commented May 18, 2021 •

edited

Loading

sandreenko commented May 18, 2021 •

edited

Loading