Skip to content

Conversation

@sdesmalen-arm
Copy link
Collaborator

Inlining may result in different behaviour when the callee clobbers ZT0, because normally the call-site will have code to preserve ZT0. When inlining the function this code to preserve ZT0 will no longer be emitted, and so the resulting behaviour of the program is changed.

Inlining may result in different behaviour when the callee clobbers ZT0,
because normally the call-site will have code to preserve ZT0. When inlining
the function this code to preserve ZT0 will no longer be emitted, and so the
resulting behaviour of the program is changed.
@llvmbot
Copy link
Member

llvmbot commented Jul 31, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-aarch64

Author: Sander de Smalen (sdesmalen-arm)

Changes

Inlining may result in different behaviour when the callee clobbers ZT0, because normally the call-site will have code to preserve ZT0. When inlining the function this code to preserve ZT0 will no longer be emitted, and so the resulting behaviour of the program is changed.


Full diff: https://github.com/llvm/llvm-project/pull/101343.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+2-1)
  • (modified) llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll (+45)
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 7de813f603264..ab81af7076f3a 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -254,7 +254,8 @@ bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
     return false;
 
   if (CallerAttrs.requiresLazySave(CalleeAttrs) ||
-      CallerAttrs.requiresSMChange(CalleeAttrs)) {
+      CallerAttrs.requiresSMChange(CalleeAttrs) ||
+      CallerAttrs.requiresPreservingZT0(CalleeAttrs)) {
     if (hasPossibleIncompatibleOps(Callee))
       return false;
   }
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll b/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll
index 816492768cc0f..5e638103a2b06 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll
@@ -231,6 +231,51 @@ define void @shared_za_caller_private_za_callee_call_tpidr2_restore_dont_inline(
   ret void
 }
 
+define void @nonzt0_callee() {
+; CHECK-LABEL: define void @nonzt0_callee
+; CHECK-SAME: () #[[ATTR0]] {
+; CHECK-NEXT:    call void asm sideeffect "
+; CHECK-NEXT:    call void @inlined_body()
+; CHECK-NEXT:    ret void
+;
+  call void asm sideeffect "; inlineasm", ""()
+  call void @inlined_body()
+  ret void
+}
+
+define void @shared_zt0_caller_nonzt0_callee_dont_inline() "aarch64_inout_zt0" {
+; CHECK-LABEL: define void @shared_zt0_caller_nonzt0_callee_dont_inline
+; CHECK-SAME: () #[[ATTR3:[0-9]+]] {
+; CHECK-NEXT:    call void @nonzt0_callee()
+; CHECK-NEXT:    ret void
+;
+  call void @nonzt0_callee()
+  ret void
+}
+
+define void @shared_zt0_callee() "aarch64_inout_zt0" {
+; CHECK-LABEL: define void @shared_zt0_callee
+; CHECK-SAME: () #[[ATTR3]] {
+; CHECK-NEXT:    call void asm sideeffect "
+; CHECK-NEXT:    call void @inlined_body()
+; CHECK-NEXT:    ret void
+;
+  call void asm sideeffect "; inlineasm", ""()
+  call void @inlined_body()
+  ret void
+}
+
+define void @shared_zt0_caller_shared_zt0_callee_inline() "aarch64_inout_zt0" {
+; CHECK-LABEL: define void @shared_zt0_caller_shared_zt0_callee_inline
+; CHECK-SAME: () #[[ATTR3]] {
+; CHECK-NEXT:    call void asm sideeffect "
+; CHECK-NEXT:    call void @inlined_body()
+; CHECK-NEXT:    ret void
+;
+  call void @shared_zt0_callee()
+  ret void
+}
+
 declare void @__arm_za_disable()
 declare void @__arm_tpidr2_save()
 declare void @__arm_tpidr2_restore(ptr)

Copy link
Collaborator

@paulwalker-arm paulwalker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth testing the other zt0 attributes? I ask not because of an issue with this patch, which looks fine to my eye, but because I'm less sure requiresPreservingZT0 is valid. That said, this function is new to me so perhaps I've misunderstood.

For the case where the caller is aarch64_inout_zt0 or aarch64_in_zt0 should zt0 be preserved when calling a aarch64_out_zt0 function? requiresPreservingZT0 suggests not?

Copy link
Collaborator

@paulwalker-arm paulwalker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline. The answer is that when a caller has zt0 state calls a function that uses any of the in/out/inout_zt0 keywords it becomes the developer's responsibility to preserve zt0.

@sdesmalen-arm sdesmalen-arm merged commit fb470db into llvm:main Aug 2, 2024
@sdesmalen-arm sdesmalen-arm added this to the LLVM 19.X Release milestone Aug 5, 2024
@sdesmalen-arm
Copy link
Collaborator Author

/cherry-pick fb470db

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Aug 5, 2024
Inlining may result in different behaviour when the callee clobbers ZT0,
because normally the call-site will have code to preserve ZT0. When
inlining the function this code to preserve ZT0 will no longer be
emitted, and so the resulting behaviour of the program is changed.

(cherry picked from commit fb470db)
@llvmbot
Copy link
Member

llvmbot commented Aug 5, 2024

/pull-request #101932

tru pushed a commit to llvmbot/llvm-project that referenced this pull request Aug 10, 2024
Inlining may result in different behaviour when the callee clobbers ZT0,
because normally the call-site will have code to preserve ZT0. When
inlining the function this code to preserve ZT0 will no longer be
emitted, and so the resulting behaviour of the program is changed.

(cherry picked from commit fb470db)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Development

Successfully merging this pull request may close these issues.

4 participants