@markdryan
Contributor

There were a couple of issues with the detection code used to check for RVV 1.0 on kernels that do not support hwprobe.

  1. The vtype clobber was missing
  2. The wrong form of vsetvli was being used. The vsetvli x0, x0 form keeps the current vl and can only be used safely in code where the value of vtype is already known, so it is inappropriate for this use case. Using it here can lead to a failure to detect RVV 1.0 if, for example, the vill bit happens to be set before detect_riscv64_rvv100 is called.

We fix both issues by adding the missing clobber and replacing the first parameter to vsetvli with t0 (which we add to our clobbers).
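
For readers who do not have the diff open, the corrected probe can be sketched roughly as below. This is an illustration of the idea rather than the exact code in the tree: the function name detect_rvv100_sketch, the SKETCH_HWCAP_ISA_V macro, the AT_HWCAP pre-check and the __riscv_vector guard are all assumptions made so that the snippet builds anywhere. Only the vsetvli t0, x0 form and the "t0", "vtype" clobbers come from the description above.

```c
#include <stdint.h>

#if defined(__riscv) && defined(__riscv_vector) && defined(__linux__)
#include <sys/auxv.h>
/* Hypothetical name: bit for the single-letter 'V' extension in AT_HWCAP. */
#define SKETCH_HWCAP_ISA_V (1UL << ('V' - 'A'))
#endif

/* Hypothetical stand-in for detect_riscv64_rvv100.
 * Returns 1 if the ta, ma configuration is accepted, 0 otherwise. */
int detect_rvv100_sketch(void)
{
#if defined(__riscv) && defined(__riscv_vector) && defined(__linux__)
    uint64_t vtype;

    /* Assumption: only probe when the kernel reports some vector unit,
     * so that vsetvli does not raise an illegal instruction trap. */
    if (!(getauxval(AT_HWCAP) & SKETCH_HWCAP_ISA_V))
        return 0;

    /* Writing the new vl to t0 (instead of using the x0, x0 form) makes
     * the result independent of whatever vtype held beforehand.
     * Both t0 and vtype are modified, so both are listed as clobbers. */
    __asm__ volatile("vsetvli t0, x0, e8, m1, ta, ma\n\t"
                     "csrr %0, vtype"
                     : "=r"(vtype)
                     :
                     : "t0", "vtype");

    /* vill is the top bit of vtype (bit 63 on RV64); it is set when the
     * requested configuration is not supported. */
    return !(vtype >> 63);
#else
    /* Not a vector-enabled RISC-V Linux build: nothing to probe. */
    return 0;
#endif
}
```

On a build for any other target the sketch simply returns 0. Note that the "vtype" clobber needs a toolchain that recognises that register name.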

@markdryan
Contributor Author

I discovered this while working on #5431. When testing that PR, I noticed that the RVV kernels were not always enabled on devices that support RVV 1.0 but run kernels without hwprobe.

@martin-frbg
Collaborator

The failing DYNAMIC_ARCH job has

   71 |                      :"t0", "vtype");

probably because it is running a riscv-gnu-clang toolchain from a year ago. This probably explains the previously missing clobber. I'm not sure whether we should simply require users to have a newer one?

martin-frbg added a commit to martin-frbg/OpenBLAS that referenced this pull request Sep 5, 2025
@martin-frbg martin-frbg added this to the 0.3.31 milestone Sep 5, 2025
@martin-frbg martin-frbg merged commit e2f9f57 into OpenMathLib:develop Sep 5, 2025
81 of 88 checks passed