Skip to content

Conversation

@preames
Copy link
Collaborator

@preames preames commented Jul 19, 2024

This implements the __builtin_cpu_init and __builtin_cpu_supports builtin routines based on the compiler runtime changes in #85790.

This is inspired by #85786. Major changes are a) a restriction in scope to only the builtins (which have a much narrower user interface), and the avoidance of false generality. This change deliberately only handles group 0 extensions (which happen to be all defined ones today), and avoids the tblgen changes from that review.

I don't have an environment in which I can actually test this, but @BeMg has been kind enough to report that this appears to work as expected.

Before this can make it into a release, we need a change such as #99958. The gcc docs claim that cpu_support can be called by "normal" code without calling the cpu_init routine because the init routine will have been called by a high priority constructor. Our current compiler-rt mechanism does not do this.

This implements the __builtin_cpu_init and __builtin_cpu_supports
builtin routines based on the compiler runtime changes in llvm#85790.

This is inspired by llvm#85786.
Major changes are a) a restriction in scope to only the builtins
(which have a much narrower user interface), and the avoidance of
false generality.  This change deliberately only handles group 0
extensions (which happen to be all defined ones today), and avoids
the tblgen changes from that review.

This is still a WIP.  It is posted for initial feedback on whether
this makes sense to try to get into 19.x release. Major items left
undone:

* Updating clang tests to exercise this logic.
* Actually running it at all.  I did not build compiler-rt, and thus
  all my checking was of generated asm/IR.
* Investigate claims from gcc docs that __builtin_cpu_init is called
  early in process lifetime with high priority constructor.  I did
  not find this with some quick searching.
@preames preames requested review from BeMg, kito-cheng and topperc July 19, 2024 20:27
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:RISC-V clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jul 19, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 19, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-clang-codegen

Author: Philip Reames (preames)

Changes

This implements the __builtin_cpu_init and __builtin_cpu_supports builtin routines based on the compiler runtime changes in #85790.

This is inspired by #85786. Major changes are a) a restriction in scope to only the builtins (which have a much narrower user interface), and the avoidance of false generality. This change deliberately only handles group 0 extensions (which happen to be all defined ones today), and avoids the tblgen changes from that review.

This is still a WIP. It is posted for initial feedback on whether this makes sense to try to get into 19.x release. Major items left undone:

  • Updating clang tests to exercise this logic.
  • Actually running it at all. I did not build compiler-rt, and thus all my checking was of generated asm/IR.
  • Investigate claims from gcc docs that __builtin_cpu_init is called early in process lifetime with high priority constructor. I did not find this with some quick searching.

Full diff: https://github.com/llvm/llvm-project/pull/99700.diff

6 Files Affected:

  • (modified) clang/lib/Basic/Targets/RISCV.cpp (+6)
  • (modified) clang/lib/Basic/Targets/RISCV.h (+4)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+55)
  • (modified) clang/lib/CodeGen/CodeGenFunction.h (+3)
  • (modified) llvm/include/llvm/TargetParser/RISCVISAInfo.h (+4)
  • (modified) llvm/lib/TargetParser/RISCVISAInfo.cpp (+61)
diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp
index 9159162f01d1b..41d836330b38c 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -479,3 +479,9 @@ RISCVTargetInfo::checkCallingConvention(CallingConv CC) const {
     return CCCR_OK;
   }
 }
+
+bool RISCVTargetInfo::validateCpuSupports(StringRef Feature) const {
+  // Only allow extensions we have a known bit position for in the
+  // __riscv_feature_bits structure.
+  return -1 != llvm::RISCVISAInfo::getRISCVFeaturesBitPosition(Feature);
+}
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index d5df6344bedc0..626274b8fc437 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -126,6 +126,10 @@ class RISCVTargetInfo : public TargetInfo {
   std::pair<unsigned, unsigned> hardwareInterferenceSizes() const override {
     return std::make_pair(32, 32);
   }
+
+  bool supportsCpuSupports() const override { return getTriple().isOSLinux(); }
+  bool supportsCpuInit() const override { return getTriple().isOSLinux(); }
+  bool validateCpuSupports(StringRef Feature) const override;
 };
 class LLVM_LIBRARY_VISIBILITY RISCV32TargetInfo : public RISCVTargetInfo {
 public:
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 2ad62d6ee0bb2..71c947776adf2 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -62,6 +62,8 @@
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/ScopedPrinter.h"
 #include "llvm/TargetParser/AArch64TargetParser.h"
+#include "llvm/TargetParser/RISCVISAInfo.h"
+#include "llvm/TargetParser/RISCVTargetParser.h"
 #include "llvm/TargetParser/X86TargetParser.h"
 #include <optional>
 #include <sstream>
@@ -14215,6 +14217,16 @@ Value *CodeGenFunction::EmitAArch64CpuInit() {
   return Builder.CreateCall(Func);
 }
 
+Value *CodeGenFunction::EmitRISCVCpuInit() {
+  llvm::FunctionType *FTy = llvm::FunctionType::get(VoidTy, false);
+  llvm::FunctionCallee Func =
+      CGM.CreateRuntimeFunction(FTy, "__init_riscv_feature_bits");
+  auto *CalleeGV = cast<llvm::GlobalValue>(Func.getCallee());
+  CalleeGV->setDSOLocal(true);
+  CalleeGV->setDLLStorageClass(llvm::GlobalValue::DefaultStorageClass);
+  return Builder.CreateCall(Func);
+}
+
 Value *CodeGenFunction::EmitX86CpuInit() {
   llvm::FunctionType *FTy = llvm::FunctionType::get(VoidTy,
                                                     /*Variadic*/ false);
@@ -14267,6 +14279,43 @@ CodeGenFunction::EmitAArch64CpuSupports(ArrayRef<StringRef> FeaturesStrs) {
   return Result;
 }
 
+Value *CodeGenFunction::EmitRISCVCpuSupports(const CallExpr *E) {
+
+  const Expr *FeatureExpr = E->getArg(0)->IgnoreParenCasts();
+  StringRef FeatureStr = cast<StringLiteral>(FeatureExpr)->getString();
+  if (!getContext().getTargetInfo().validateCpuSupports(FeatureStr))
+    return Builder.getFalse();
+
+  // Note: We are making an unchecked assumption that the size of the
+  // feature array is >= 1.  This holds for any version of compiler-rt
+  // which defines this interface.
+  llvm::ArrayType *ArrayOfInt64Ty =
+      llvm::ArrayType::get(Int64Ty, 1);
+  llvm::Type *StructTy = llvm::StructType::get(Int32Ty, ArrayOfInt64Ty);
+  llvm::Constant *RISCVFeaturesBits =
+      CGM.CreateRuntimeVariable(StructTy, "__riscv_feature_bits");
+  auto *GV = cast<llvm::GlobalValue>(RISCVFeaturesBits);
+  GV->setDSOLocal(true);
+
+  auto LoadFeatureBit = [&](unsigned Index) {
+    // Create GEP then load.
+    Value *IndexVal = llvm::ConstantInt::get(Int32Ty, Index);
+    llvm::Value *GEPIndices[] = {Builder.getInt32(0), Builder.getInt32(1),
+                                 IndexVal};
+    Value *Ptr =
+        Builder.CreateInBoundsGEP(StructTy, RISCVFeaturesBits, GEPIndices);
+    Value *FeaturesBit =
+        Builder.CreateAlignedLoad(Int64Ty, Ptr, CharUnits::fromQuantity(8));
+    return FeaturesBit;
+  };
+
+  int BitPos = RISCVISAInfo::getRISCVFeaturesBitPosition(FeatureStr);
+  assert(BitPos != -1 && "validation should have rejected this feature");
+  Value *MaskV = Builder.getInt64(1ULL << BitPos);
+  Value *Bitset = Builder.CreateAnd(LoadFeatureBit(0), MaskV);
+  return Builder.CreateICmpEQ(Bitset, MaskV);
+}
+
 Value *CodeGenFunction::EmitX86BuiltinExpr(unsigned BuiltinID,
                                            const CallExpr *E) {
   if (BuiltinID == Builtin::BI__builtin_cpu_is)
@@ -21728,6 +21777,12 @@ Value *CodeGenFunction::EmitHexagonBuiltinExpr(unsigned BuiltinID,
 Value *CodeGenFunction::EmitRISCVBuiltinExpr(unsigned BuiltinID,
                                              const CallExpr *E,
                                              ReturnValueSlot ReturnValue) {
+
+  if (BuiltinID == Builtin::BI__builtin_cpu_supports)
+    return EmitRISCVCpuSupports(E);
+  if (BuiltinID == Builtin::BI__builtin_cpu_init)
+    return EmitRISCVCpuInit();
+
   SmallVector<Value *, 4> Ops;
   llvm::Type *ResultType = ConvertType(E->getType());
 
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index d83e38cab8e2d..1ee45a1f1f99d 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -4695,6 +4695,9 @@ class CodeGenFunction : public CodeGenTypeCache {
   llvm::Value *EmitRISCVBuiltinExpr(unsigned BuiltinID, const CallExpr *E,
                                     ReturnValueSlot ReturnValue);
 
+  llvm::Value *EmitRISCVCpuSupports(const CallExpr *E);
+  llvm::Value *EmitRISCVCpuInit();
+
   void AddAMDGPUFenceAddressSpaceMMRA(llvm::Instruction *Inst,
                                       const CallExpr *E);
   void ProcessOrderScopeAMDGCN(llvm::Value *Order, llvm::Value *Scope,
diff --git a/llvm/include/llvm/TargetParser/RISCVISAInfo.h b/llvm/include/llvm/TargetParser/RISCVISAInfo.h
index d7a08013fa6ac..d71ff174bf0d2 100644
--- a/llvm/include/llvm/TargetParser/RISCVISAInfo.h
+++ b/llvm/include/llvm/TargetParser/RISCVISAInfo.h
@@ -80,6 +80,10 @@ class RISCVISAInfo {
                                      std::set<StringRef> &EnabledFeatureNames,
                                      StringMap<StringRef> &DescMap);
 
+  /// Return the bit position (in group 0) of __riscv_feature_bits.  Returns
+  /// -1 if not supported.
+  static int getRISCVFeaturesBitPosition(StringRef Ext);
+
 private:
   RISCVISAInfo(unsigned XLen) : XLen(XLen) {}
 
diff --git a/llvm/lib/TargetParser/RISCVISAInfo.cpp b/llvm/lib/TargetParser/RISCVISAInfo.cpp
index e6ddbb4dc28d5..3ff1b325793e6 100644
--- a/llvm/lib/TargetParser/RISCVISAInfo.cpp
+++ b/llvm/lib/TargetParser/RISCVISAInfo.cpp
@@ -1020,3 +1020,64 @@ std::string RISCVISAInfo::getTargetFeatureForExtension(StringRef Ext) {
   return isExperimentalExtension(Name) ? "experimental-" + Name.str()
                                        : Name.str();
 }
+
+struct RISCVExtBit {
+  const StringRef ext;
+  uint64_t bitpos;
+};
+
+/// Maps extensions with assigned bit positions within group 0 of
+/// __riscv_features_bits to their respective bit position.  At the
+/// moment all extensions are within group 0.
+static RISCVExtBit RISCVGroup0BitPositions[] = {
+  {"a", 0},
+  {"c", 2},
+  {"d", 3},
+  {"f", 5},
+  {"i", 8},
+  {"m", 12},
+  {"v", 21},
+  {"zacas", 26},
+  {"zba", 27},
+  {"zbb", 28},
+  {"zbc", 29},
+  {"zbkb", 30},
+  {"zbkc", 31},
+  {"zbkx", 32},
+  {"zbs", 33},
+  {"zfa", 34},
+  {"zfh", 35},
+  {"zfhmin", 36},
+  {"zicboz", 37},
+  {"zicond", 38},
+  {"zihintntl", 39},
+  {"zihintpause", 40},
+  {"zknd", 41},
+  {"zkne", 42},
+  {"zknh", 43},
+  {"zksed", 44},
+  {"zksh", 45},
+  {"zkt", 46},
+  {"ztso", 47},
+  {"zvbb", 48},
+  {"zvbc", 49},
+  {"zvfh", 50},
+  {"zvfhmin", 51},
+  {"zvkb", 52},
+  {"zvkg", 53},
+  {"zvkned", 54},
+  {"zvknha", 55},
+  {"zvknhb", 56},
+  {"zvksed", 57},
+  {"zvksh", 58},
+  {"zvkt", 59}
+};
+int RISCVISAInfo::getRISCVFeaturesBitPosition(StringRef Ext) {
+  // Note that this code currently accepts mixed case extension names, but
+  // does not handle extension versions at all.  That's probably fine because
+  // there's only one extension version in the __riscv_feature_bits vector.
+  for (auto E : RISCVGroup0BitPositions)
+    if (E.ext.equals_insensitive(Ext))
+      return E.bitpos;
+  return -1;
+}

@github-actions
Copy link

github-actions bot commented Jul 19, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.


struct RISCVExtBit {
const StringRef ext;
uint64_t bitpos;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need uint64_t for bit position? I think the value doesn't exceed 63.

#include "llvm/Support/ScopedPrinter.h"
#include "llvm/TargetParser/AArch64TargetParser.h"
#include "llvm/TargetParser/RISCVISAInfo.h"
#include "llvm/TargetParser/RISCVTargetParser.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RISCVTargetParser.h can be dropped due to its come from tablegen stuff.

@BeMg
Copy link
Contributor

BeMg commented Jul 20, 2024

  • Updating clang tests to exercise this logic.
  • Actually running it at all. I did not build compiler-rt, and thus all my checking was of generated asm/IR.

Maybe we could compile the test code with compiler-rt/lib/builtins/riscv/feature_bits.c manually. Then we could get rid of copmiler-rt building during running the __builtin_cpu_init and __builtin_cpu_supports actually.

Actually, I tried this patch in my local, and it work well.


Here is build script for compiler-rt, if you want to build compiler-rt.

cmake -DCMAKE_C_COMPILER_TARGET="riscv64-unknown-linux-gnu" \
      -DCMAKE_ASM_COMPILER_TARGET="riscv64-unknown-linux-gnu" \
      -DCMAKE_AR=./path/to/llvm/install/bin/llvm-ar \
      -DCMAKE_C_COMPILER=./path/to/llvm/install/bin/clang \
      -DCMAKE_NM=./path/to/llvm/install/bin/llvm-nm \
      -DCMAKE_RANLIB=./path/to/llvm/install/bin/llvm-ranlib \
      -DCOMPILER_RT_BUILD_BUILTINS=ON \
      -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \
      -DCOMPILER_RT_BUILD_MEMPROF=OFF \
      -DCOMPILER_RT_BUILD_PROFILE=OFF \
      -DCOMPILER_RT_BUILD_SANITIZERS=OFF \
      -DCOMPILER_RT_BUILD_XRAY=OFF \
      -DCOMPILER_RT_DEFAULT_TARGET_ONLY=ON \
      -DLLVM_CONFIG_PATH=./path/to/llvm/install/bin/llvm-config \
      -G "Ninja" ../compiler-rt
ninja 
# Put the `libclang_rt.builtins-riscv64.a` in correct path
cp ./path/to/build/lib/linux/* ./path/to/llvm/install/lib/clang/19/lib/linux
# Compile using compiler-rt
clang -march=rv64imafd -rtlib=compiler-rt test.c
  • Investigate claims from gcc docs that __builtin_cpu_init is called early in process lifetime with high priority constructor. I did not find this with some quick searching.

/// Maps extensions with assigned bit positions within group 0 of
/// __riscv_features_bits to their respective bit position. At the
/// moment all extensions are within group 0.
static RISCVExtBit RISCVGroup0BitPositions[] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One reason we choose the tablegen approach is trying to maintain the everything relate to extension (In this case is groupID/bitPos) inside RISCVFeatures.td.

How about land #94440 now, and replace this struct with tablegen generated .inc file? Or maybe we could replace it after 19.x release?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, adding the additional information to RISCVFeatures.td just complicates things.

@preames preames changed the title [WIP][RISCV] Support __builtin_cpu_init and __builtin_cpu_supports [RISCV] Support __builtin_cpu_init and __builtin_cpu_supports Jul 22, 2024
}

struct RISCVExtBit {
const StringRef ext;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use StringLiteral.

/// Maps extensions with assigned bit positions within group 0 of
/// __riscv_features_bits to their respective bit position. At the
/// moment all extensions are within group 0.
static RISCVExtBit RISCVGroup0BitPositions[] = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this constexpr to make sure the strlen for the strings actually happens at compile time.

#endif

#if !__has_builtin(__builtin_cpu_supports)
# if defined(X86) || defined(RISCV)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does ARM support __builtin_cpu_supports?

Copy link
Contributor

@BeMg BeMg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Work fine with current compiler-rt __init_riscv_feature_bits /__riscv_feature_bits and qemu with hwprobe support. (call __builtin_cpu_init manually situation)

@preames preames merged commit d1e28e2 into llvm:main Jul 23, 2024
@preames preames deleted the pr-riscv-builtin-cpu-supports branch July 23, 2024 15:48
@preames
Copy link
Collaborator Author

preames commented Jul 23, 2024

I've gone ahead and merged this into main. We have missed the branch creation, so without further action this will not be included in 19.x. We need to ensure the constructor change for compiler-rt lands, and then backport them together if we choose to.

yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
This implements the __builtin_cpu_init and __builtin_cpu_supports
builtin routines based on the compiler runtime changes in
#85790.

This is inspired by #85786.
Major changes are a) a restriction in scope to only the builtins (which
have a much narrower user interface), and the avoidance of false
generality. This change deliberately only handles group 0 extensions
(which happen to be all defined ones today), and avoids the tblgen
changes from that review.

I don't have an environment in which I can actually test this, but @BeMg
has been kind enough to report that this appears to work as expected.

Before this can make it into a release, we need a change such as
#99958. The gcc docs claim that
cpu_support can be called by "normal" code without calling the cpu_init
routine because the init routine will have been called by a high
priority constructor. Our current compiler-rt mechanism does not do
this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:RISC-V clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants