[CUSPARSE] missing broadcasts for multiplication

In the following 4 examples of sparse matrix and dense vector  broadcasted multiplications we have 2 errors and 2 wrong outputs (dense matrix instead of sparse)

```julia
julia> using SparseArrays, CUDA

julia> CUDA.allowscalar(false)

julia> A = sparse([1. 0 2; 0 3 4; 0 5 6]) |> cu
3×3 CUDA.CUSPARSE.CuSparseMatrixCSC{Float32, Int32} with 6 stored entries:
 1.0   ⋅   2.0
  ⋅   3.0  4.0
  ⋅   5.0  6.0

julia> x = [2.,3,4] |> cu
3-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
 2.0
 3.0
 4.0

julia> A .* x
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/lojQM/src/GPUArraysCore.jl:87
  [3] getindex(xs::CuArray{Int32, 1, CUDA.Mem.DeviceBuffer}, I::Int64)
    @ GPUArrays ~/.julia/packages/GPUArrays/fqD8z/src/host/indexing.jl:9
  [4] getindex(A::CUDA.CUSPARSE.CuSparseMatrixCSC{Float32, Int32}, i0::Int64, i1::Int64)
    @ CUDA.CUSPARSE ~/.julia/dev/CUDA/lib/cusparse/array.jl:311
  [5] _getindex
    @ ./abstractarray.jl:1291 [inlined]
  [6] getindex
    @ ./abstractarray.jl:1241 [inlined]
  [7] _broadcast_getindex
    @ ./broadcast.jl:623 [inlined]
  [8] _getindex
    @ ./broadcast.jl:666 [inlined]
  [9] _broadcast_getindex
    @ ./broadcast.jl:642 [inlined]
 [10] getindex
    @ ./broadcast.jl:597 [inlined]
 [11] macro expansion
    @ ./broadcast.jl:961 [inlined]
 [12] macro expansion
    @ ./simdloop.jl:77 [inlined]
 [13] copyto!
    @ ./broadcast.jl:960 [inlined]
 [14] copyto!
    @ ./broadcast.jl:913 [inlined]
 [15] copy
    @ ./broadcast.jl:885 [inlined]
 [16] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayConflict, Nothing, typeof(*), Tuple{CUDA.CUSPARSE.CuSparseMatrixCSC{Float32, Int32}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}})
    @ Base.Broadcast ./broadcast.jl:860
 [17] top-level scope
    @ REPL[23]:1
 [18] top-level scope
    @ ~/.julia/dev/CUDA/src/initialization.jl:155

julia> x .* A
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/lojQM/src/GPUArraysCore.jl:87
  [3] getindex
    @ ~/.julia/packages/GPUArrays/fqD8z/src/host/indexing.jl:9 [inlined]
  [4] getindex
    @ ~/.julia/packages/GPUArrays/fqD8z/src/host/indexing.jl:30 [inlined]
  [5] _broadcast_getindex
    @ ./broadcast.jl:636 [inlined]
  [6] _getindex
    @ ./broadcast.jl:666 [inlined]
  [7] _broadcast_getindex
    @ ./broadcast.jl:642 [inlined]
  [8] getindex
    @ ./broadcast.jl:597 [inlined]
  [9] macro expansion
    @ ./broadcast.jl:961 [inlined]
 [10] macro expansion
    @ ./simdloop.jl:77 [inlined]
 [11] copyto!
    @ ./broadcast.jl:960 [inlined]
 [12] copyto!
    @ ./broadcast.jl:913 [inlined]
 [13] copy
    @ ./broadcast.jl:885 [inlined]
 [14] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayConflict, Nothing, typeof(*), Tuple{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CUDA.CUSPARSE.CuSparseMatrixCSC{Float32, Int32}}})
    @ Base.Broadcast ./broadcast.jl:860
 [15] top-level scope
    @ REPL[26]:1
 [16] top-level scope
    @ ~/.julia/dev/CUDA/src/initialization.jl:155


julia> A .* x' # should give a CuSparseMatrix instead
3×3 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
 2.0   0.0   8.0
 0.0   9.0  16.0
 0.0  15.0  24.0

julia> x' .* A # should give a CuSparseMatrix instead
3×3 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
 2.0   0.0   8.0
 0.0   9.0  16.0
 0.0  15.0  24.0
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[CUSPARSE] missing broadcasts for multiplication #1666

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[CUSPARSE] missing broadcasts for multiplication #1666

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions