Skip to content

Conversation

@boomanaiden154
Copy link
Contributor

This patch makes it so that builtins for the aux triple only get enabled if they are marked as supported by the current language options. Otherwise we define them regardless of the language mode. This can cause issues in certain scenarios, like if someone has a definition that conflicts with a builtin gated behind __has_builtin, because __has_builtin will return false given the builtin is only enabled on the aux triple despite the builtin actually being enabled.

This is best exemplified with the __cpuidex conflict test.

Fixes #152558.

This patch makes it so that builtins for the aux triple only get enabled
if they are marked as supported by the current language options.
Otherwise we define them regardless of the language mode. This can cause
issues in certain scenarios, like if someone has a definition that
conflicts with a builtin gated behind __has_builtin, because
__has_builtin will return false given the builtin is only enabled on the
aux triple despite the builtin actually being enabled.

This is best exemplified with the __cpuidex conflict test.

Fixes llvm#152558.
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:headers Headers provided by Clang, e.g. for intrinsics labels Aug 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 18, 2025

@llvm/pr-subscribers-backend-x86

Author: Aiden Grossman (boomanaiden154)

Changes

This patch makes it so that builtins for the aux triple only get enabled if they are marked as supported by the current language options. Otherwise we define them regardless of the language mode. This can cause issues in certain scenarios, like if someone has a definition that conflicts with a builtin gated behind __has_builtin, because __has_builtin will return false given the builtin is only enabled on the aux triple despite the builtin actually being enabled.

This is best exemplified with the __cpuidex conflict test.

Fixes #152558.


Full diff: https://github.com/llvm/llvm-project/pull/154217.diff

3 Files Affected:

  • (modified) clang/lib/Basic/Builtins.cpp (+2-1)
  • (modified) clang/lib/Headers/cpuid.h (-5)
  • (modified) clang/test/Headers/__cpuidex_conflict.c (+1-1)
diff --git a/clang/lib/Basic/Builtins.cpp b/clang/lib/Basic/Builtins.cpp
index 885abdc152e3a..8e38ec98277e7 100644
--- a/clang/lib/Basic/Builtins.cpp
+++ b/clang/lib/Basic/Builtins.cpp
@@ -225,7 +225,8 @@ void Builtin::Context::initializeBuiltins(IdentifierTable &Table,
     // Step #3: Register target-specific builtins for AuxTarget.
     for (const auto &Shard : AuxTargetShards)
       for (const auto &I : Shard.Infos) {
-        Table.get(I.getName(Shard)).setBuiltinID(ID);
+        if (builtinIsSupported(*Shard.Strings, I, LangOpts))
+          Table.get(I.getName(Shard)).setBuiltinID(ID);
         ++ID;
       }
   }
diff --git a/clang/lib/Headers/cpuid.h b/clang/lib/Headers/cpuid.h
index ce8c79e77dc18..52addb7bfa856 100644
--- a/clang/lib/Headers/cpuid.h
+++ b/clang/lib/Headers/cpuid.h
@@ -345,15 +345,10 @@ static __inline int __get_cpuid_count (unsigned int __leaf,
 // In some configurations, __cpuidex is defined as a builtin (primarily
 // -fms-extensions) which will conflict with the __cpuidex definition below.
 #if !(__has_builtin(__cpuidex))
-// In some cases, offloading will set the host as the aux triple and define the
-// builtin. Given __has_builtin does not detect builtins on aux triples, we need
-// to explicitly check for some offloading cases.
-#ifndef __NVPTX__
 static __inline void __cpuidex(int __cpu_info[4], int __leaf, int __subleaf) {
   __cpuid_count(__leaf, __subleaf, __cpu_info[0], __cpu_info[1], __cpu_info[2],
                 __cpu_info[3]);
 }
 #endif
-#endif
 
 #endif /* __CPUID_H */
diff --git a/clang/test/Headers/__cpuidex_conflict.c b/clang/test/Headers/__cpuidex_conflict.c
index d14ef293e586d..e911036a1d953 100644
--- a/clang/test/Headers/__cpuidex_conflict.c
+++ b/clang/test/Headers/__cpuidex_conflict.c
@@ -5,7 +5,7 @@
 
 // Ensure that we do not run into conflicts when offloading.
 // RUN: %clang_cc1 %s -DIS_STATIC=static -ffreestanding -fopenmp -fopenmp-is-target-device -aux-triple x86_64-unknown-linux-gnu
-// RUN: %clang_cc1 -DIS_STATIC="" -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -aux-target-cpu x86-64 -fcuda-is-device -internal-isystem /home/gha/llvm-project/build/lib/clang/22/include -x cuda %s -o -
+// RUN: %clang_cc1 -DIS_STATIC=static -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -aux-target-cpu x86-64 -fcuda-is-device -x cuda %s -o -
 
 typedef __SIZE_TYPE__ size_t;
 

@llvmbot
Copy link
Member

llvmbot commented Aug 18, 2025

@llvm/pr-subscribers-clang

Author: Aiden Grossman (boomanaiden154)

Changes

This patch makes it so that builtins for the aux triple only get enabled if they are marked as supported by the current language options. Otherwise we define them regardless of the language mode. This can cause issues in certain scenarios, like if someone has a definition that conflicts with a builtin gated behind __has_builtin, because __has_builtin will return false given the builtin is only enabled on the aux triple despite the builtin actually being enabled.

This is best exemplified with the __cpuidex conflict test.

Fixes #152558.


Full diff: https://github.com/llvm/llvm-project/pull/154217.diff

3 Files Affected:

  • (modified) clang/lib/Basic/Builtins.cpp (+2-1)
  • (modified) clang/lib/Headers/cpuid.h (-5)
  • (modified) clang/test/Headers/__cpuidex_conflict.c (+1-1)
diff --git a/clang/lib/Basic/Builtins.cpp b/clang/lib/Basic/Builtins.cpp
index 885abdc152e3a..8e38ec98277e7 100644
--- a/clang/lib/Basic/Builtins.cpp
+++ b/clang/lib/Basic/Builtins.cpp
@@ -225,7 +225,8 @@ void Builtin::Context::initializeBuiltins(IdentifierTable &Table,
     // Step #3: Register target-specific builtins for AuxTarget.
     for (const auto &Shard : AuxTargetShards)
       for (const auto &I : Shard.Infos) {
-        Table.get(I.getName(Shard)).setBuiltinID(ID);
+        if (builtinIsSupported(*Shard.Strings, I, LangOpts))
+          Table.get(I.getName(Shard)).setBuiltinID(ID);
         ++ID;
       }
   }
diff --git a/clang/lib/Headers/cpuid.h b/clang/lib/Headers/cpuid.h
index ce8c79e77dc18..52addb7bfa856 100644
--- a/clang/lib/Headers/cpuid.h
+++ b/clang/lib/Headers/cpuid.h
@@ -345,15 +345,10 @@ static __inline int __get_cpuid_count (unsigned int __leaf,
 // In some configurations, __cpuidex is defined as a builtin (primarily
 // -fms-extensions) which will conflict with the __cpuidex definition below.
 #if !(__has_builtin(__cpuidex))
-// In some cases, offloading will set the host as the aux triple and define the
-// builtin. Given __has_builtin does not detect builtins on aux triples, we need
-// to explicitly check for some offloading cases.
-#ifndef __NVPTX__
 static __inline void __cpuidex(int __cpu_info[4], int __leaf, int __subleaf) {
   __cpuid_count(__leaf, __subleaf, __cpu_info[0], __cpu_info[1], __cpu_info[2],
                 __cpu_info[3]);
 }
 #endif
-#endif
 
 #endif /* __CPUID_H */
diff --git a/clang/test/Headers/__cpuidex_conflict.c b/clang/test/Headers/__cpuidex_conflict.c
index d14ef293e586d..e911036a1d953 100644
--- a/clang/test/Headers/__cpuidex_conflict.c
+++ b/clang/test/Headers/__cpuidex_conflict.c
@@ -5,7 +5,7 @@
 
 // Ensure that we do not run into conflicts when offloading.
 // RUN: %clang_cc1 %s -DIS_STATIC=static -ffreestanding -fopenmp -fopenmp-is-target-device -aux-triple x86_64-unknown-linux-gnu
-// RUN: %clang_cc1 -DIS_STATIC="" -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -aux-target-cpu x86-64 -fcuda-is-device -internal-isystem /home/gha/llvm-project/build/lib/clang/22/include -x cuda %s -o -
+// RUN: %clang_cc1 -DIS_STATIC=static -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -aux-target-cpu x86-64 -fcuda-is-device -x cuda %s -o -
 
 typedef __SIZE_TYPE__ size_t;
 

Copy link
Member

@Artem-B Artem-B left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch makes it so that builtins for the aux triple only get enabled if they are marked as supported by the current language options.

I'm not convinced that it's the right fix for the problem.

  • __cpuidex, uses Languages = "ALL_MS_LANGUAGES" which expands to C_LANG | CXX_LANG | OBJC_LANG | MS_LANG
  • the test targets linux, so LangOpts.MicrosoftExt is false.
  • for __cpuidex or any other MS-specific builtin it would cause builtinIsSupported to return false, and it would make it unavailable. So far so good. That solves part of the problem.

However, the reason we do have those aux builtins visible is that we still need to parse the same headers that the host side will see. And those headers may assume availability of those builtins.
So, while selectively disabling non-MS builtins during GPU compilation may solve this particular issue, it will likely create additional problems if/when the headers will need those now-filtered-out builtins from the aux triple.

It may be OK to selectively filter out specific builtins with very narrow use cases, like __cpuidex, but I think wholesale disabling of the platform-specific subset of builtins is not going to work.

@boomanaiden154
Copy link
Contributor Author

However, the reason we do have those aux builtins visible is that we still need to parse the same headers that the host side will see. And those headers may assume availability of those builtins.

So the language options are not guaranteed to be the same between the host and device compilations? Playing around with the individual clang cc1 invocations, it seems like they are not. Thanks for pointing that out and sorry for taking up your time reviewing this.

It may be OK to selectively filter out specific builtins with very narrow use cases, like __cpuidex, but I think wholesale disabling of the platform-specific subset of builtins is not going to work.

I'm not sure that's better than doing specific carve outs in cpuid.h. It sounds like really what we need here then is something like a __aux_target_has_builtin, but I'm not sure the complexity of implementing that makes sense here. I guess we can just keep things as is for now. If we run into issues with other offloading setups and we need something more principled, we can revisit.

boomanaiden154 added a commit that referenced this pull request Aug 22, 2025
This was intended to be fixed in #154217, but given that didn't land, it
still needs to be done. I think it still makes sense to have this change
in.
@Artem-B
Copy link
Member

Artem-B commented Aug 28, 2025

So the language options are not guaranteed to be the same between the host and device compilations?

I think it's more of the case that builtins are rarely tagged as language-specific, so, as implemented, this filtering is a very blunt hammer, even if we do consistently filter out builtins the same way in both host and device compilations. Given that Languages is also contaminated with non-language things like MS_LANG, it's just not quite suitable for making the call whether some builtin should be visible.

It sounds like really what we need here then is something like a __aux_target_has_builtin

It could have some uses, but a bigger question would be -- how far would we want to propagate it into headers that are otherwise agnostic of heterogeneous compilations. If we do want to make headers reliably interoperable between host/device, it's a much bigger problem, considering that there are multiple ways of doing it, each with their own quirks. For now we've been getting by with a case by case workarounds. I'm doubtful that we'll ever have a universal solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:X86 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:headers Headers provided by Clang, e.g. for intrinsics clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Clang] __cpuidex probably should not be enabled for nvptx triples

3 participants