- 
                Notifications
    You must be signed in to change notification settings 
- Fork 5.2k
JIT ARM64-SVE: Add AbsoluteCompare* APIs #102611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @kunalspathak @dotnet/arm64-contrib @a74nh | 
| Note regarding the   | 
| Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics | 
| 
 Thanks. What about stress test results? | 
| if (intrin.op3->IsVectorZero()) | ||
| if (!instrIsRMW) | ||
| { | ||
| assert(intrin.op3->IsVectorZero()); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
intrin.op3 should be nullptr because there are just 2 operands involved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this context intrin is Sve_ConditionalSelect, not Sve_AbsoluteCompare*.
| ("SveSimpleVecOpTest.template", new Dictionary<string, string> { ["TestName"] = "Sve_BooleanNot_uint", ["Isa"] = "Sve", ["LoadIsa"] = "Sve", ["Method"] = "BooleanNot", ["RetVectorType"] = "Vector", ["RetBaseType"] = "UInt32", ["Op1VectorType"] = "Vector", ["Op1BaseType"] = "UInt32", ["LargestVectorSize"] = "64", ["NextValueOp1"] = "TestLibrary.Generator.GetUInt32()", ["ValidateIterResult"] = "Helpers.BooleanNot(firstOp[i]) != result[i]", ["GetIterResult"] = "Helpers.BooleanNot(leftOp[i])"}), | ||
| ("SveSimpleVecOpTest.template", new Dictionary<string, string> { ["TestName"] = "Sve_BooleanNot_ulong", ["Isa"] = "Sve", ["LoadIsa"] = "Sve", ["Method"] = "BooleanNot", ["RetVectorType"] = "Vector", ["RetBaseType"] = "UInt64", ["Op1VectorType"] = "Vector", ["Op1BaseType"] = "UInt64", ["LargestVectorSize"] = "64", ["NextValueOp1"] = "TestLibrary.Generator.GetUInt64()", ["ValidateIterResult"] = "Helpers.BooleanNot(firstOp[i]) != result[i]", ["GetIterResult"] = "Helpers.BooleanNot(leftOp[i])"}), | ||
|  | ||
| // Sve mask | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete the comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Sve mask | 
| 
 Thanks, missed those. There are several errors actually. I'll take a look.  | 
| After 4b5a0f8 the stress tests now pass successfully. The issue arises when emitting  To address this issue, the commit 4b5a0f8 conditionally changes the type of the  We've also discussed two other approaches with @a74nh and @SwapnilGaikwad . At the moment  
 I haven't tested these approaches yet. Stress tests' results:  | 
| 
 This sounds reasonable to me. By doing this, can we add extra test coverage? For e.g. can we land up with https://docsmirror.github.io/A64/2023-06/sel_p_p_pp.html . Or what happens when we use something like this: ConditionalSelect(AbsoluteCompare(left, right), left, right));
LoadVector(AbsoluteCompare(left, right), address)); | 
| @kunalspathak , thank you for the review. I'm happy to add additional test for the APIs. 
 Are you asking for a tests that checks the emitted instructions for  
 Do you want these two to be added for all APIs that use  | 
| 
 I think this falls in the category of #103078, so we can probably skip doing it. 
 I think we should do it at least for one of the  | 
| Hello @kunalspathak , I've added the requested tests. Please let me know if I interpreted something incorrectly. | 
613054c    to
    629bb75      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to do some validation around a bit to make sure that it is handled correctly.
| GenTree* trueVal = node; | ||
| GenTree* falseVal = comp->gtNewZeroConNode(simdType); | ||
| var_types nodeType = simdType; | ||
| if (HWIntrinsicInfo::ReturnsPerElementMask(node->GetHWIntrinsicId())) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to test this out to make sure this is correct. I will pull your branch once #103288 is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am seeing the test DOTNET_TieredCompilation=0:
Assert failure(PID 18916 [0x000049e4], Thread: 28152 [0x6df8]): Assertion failed 'unreached' in 'JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThan_float:ConditionalSelect_MethodMask():this' during 'Generate code' (IL size 109; hash 0x5f5a6ecd; FullOpts)
    File: D:\git\runtime\src\coreclr\jit\emitarm64sve.cpp:4396
    Image: D:\kpathak\Core_Root_absolutecompare\Core_Root\corerun.exe
This is missing a piece where if the cndSelNode is of type TYP_MASK, then the falseValue should also be TYP_MASK. Today, we do not support ZeroConNode of TYP_MASK, so I was thinking of wrapping falseVal with ConvertVectorToMask(), but for some reason that was failing in LIR validation that I didn't get a chance to verify. I will look tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@a74nh, @mikabl-arm  - I have fixed code to handle falseValue of TYP_MASK. PTAL.
Stress tests are passing: https://gist.github.com/kunalspathak/95ddd4913a6642bcbda7ddff9c33f752
The failure case RunLoadMask is because we are not preserving p0 across call (known problem).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
629bb75    to
    8375ac5      
    Compare
  
    Co-authored-by: Kunal Pathak <[email protected]>
| Did you verify the test after latest merge? I see this failure on   | 
…nalSelect in another ConditionalSelect if required
| Thank you for catching this, @kunalspathak ! Indeed, looks like I have missed something after adding the two requested new tests. I'll do my best to be more thorough next time 😄 Pushed a fix to the branch: 1a2329e. All stress tests for   | 
        
          
                src/coreclr/jit/lsraarm64.cpp
              
                Outdated
          
        
      | if (intrin.op2->OperIs(GT_HWINTRINSIC) && | ||
| (intrin.op2->AsHWIntrinsic()->GetHWIntrinsicId() == NI_Sve_ConvertVectorToMask)) | ||
| { | ||
| // Have RBM_ALLMASK candidates only if op2 is VectorToMask | ||
| assert(intrin.op2->gtType == TYP_MASK); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for this? It's not clear why TYP_MASK would be restricted but VectorToMask would not...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this, we end up assigning RBM_ALLMASK candidates for operands of AbsoluateCompare(), but they should be assigned RBM_ALLFLOAT candidates.
| // Wrap the intrinsic in ConditionalSelect only if it is not already inside another ConditionalSelect | ||
| if (!user->OperIsHWIntrinsic() || (user->AsHWIntrinsic()->GetHWIntrinsicId() != NI_Sve_ConditionalSelect)) | ||
| // If it is inside ConditionalSelect, then make sure that it is the `mask` operation of it. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow this entirely...
This is stating we need to wrap in a CndSel if the user isn't a CndSel or if it is but we're not the mask operand.
The !user->OperIsHWIntrinsic(NI_Sve_ConditionalSelect) seems fine with that regard but the HWIntrinsicInfo::ReturnsPerElementMask(node->GetHWIntrinsicId()) doesn't seem to fit.
Most notably we can have mask = CndSel(mask1, mask2, mask3) or vector CndSel(mask, vector1, vector2), but also I would have expected that intrinsics like Abs(x) which must be emitted with a mask to need CndSel regardless of where it appears and only that we could optimize it away in some cases like vector = CndSel(mask, vector1, CndSel(mask, vector2, vector3))  (where the outer and inner cndsel use identical masks that mean the latter CndSel becomes redundant as the computed operands would never be selected anyways).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I would have expected the latter condition to basically be conditioned differently, basically ensuring that intrinsics that require a CndSel wrapper always, optionally trying to fold in the case the mask usages can be detected as redundant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to rewrite the comment to avoid confusion I guess:
          // Wrap the intrinsic in ConditionalSelect only if it is not already inside another ConditionalSelect
          // or if it is but it is the `mask` operand.The goal of this block of code is to wrap HWIntrinsics which satisfy HWIntrinsicInfo::IsEmbeddedMaskedOperation(intrinsicId)) into ConditionalSelect. Usually this only needs to be done when the intrinsic is not already wrapped. But this logic doesn't hold when such intrinsic is used as the mask operand for ConditionalSelect - user->AsHWIntrinsic()->GetHWIntrinsicId() != NI_Sve_ConditionalSelect doesn't hold true any more, but the intrinsic still has to be wrapped ConditionalSelect (all that satisfy HWIntrinsicInfo::IsEmbeddedMaskedOperation(intrinsicId) have to).
| Replaced by #104464 | 
Tests results: