is there a cost to cublaslt retcode != 0? #1323

@ExpandingMan

Description

With this PR to OneHotArrays, basic examples with Lux seem to work, but one gets the warning

┌ Warning: cuBLASLt failed for the given inputs relu, CuArray{Float32, 2, CUDA.DeviceMemory} [(8, 4)], OneHotMatrix{UInt32, CuArray{UInt32, 1, CUDA.DeviceMemory}} [(4, 32)], CuArray{Float32, 1, CUDA.DeviceMemory} [8]. Falling back to generic implementation.
└ @ LuxLibCUDAExt ~/dev/Lux/lib/LuxLib/ext/LuxLibCUDAExt/cublaslt.jl:311

This occurs if the return code of cublaslt_matmul_fused! is non-zero. Note that there is another branch, in which hasmethod returns false, where only a @debug is emitted (as I recall, I asked you about that case and you said making it a @debug was fine).

This case seems different because the call into cuBLASLt is actually attempted (and fails) rather than being elided entirely, so I have no idea whether this indicates a bigger problem or whether it can again be safely ignored.

I suggest one of the following:

  • If there's no real issue here, this should just be @debug.
  • If there's a real problem, we need some way for callers to opt out, so that cuBLASLt is never called on inputs it doesn't support.
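
The second option could be a dispatch-based opt-out. A minimal sketch of the idea, assuming a hypothetical trait function `use_cublaslt` (this is not LuxLib's actual API, just an illustration of the pattern):

```julia
using OneHotArrays: OneHotMatrix

# Hypothetical opt-out trait: decide per input type whether the fused
# cuBLASLt path should be attempted at all.
use_cublaslt(weight, x, bias) = true
use_cublaslt(weight, x::OneHotMatrix, bias) = false  # one-hot inputs skip cuBLASLt

# A call site could then fall straight through to the generic kernel
# instead of calling cuBLASLt, failing, and warning:
#
#     use_cublaslt(weight, x, bias) || return generic_fused_matmul(weight, x, bias)
```

This way the fallback is silent by construction for types known not to work, while a non-zero return code on types that *should* work would still surface as a warning.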
