-
Notifications
You must be signed in to change notification settings - Fork 254
Open
Labels
cuda librariesStuff about CUDA library wrappers.Stuff about CUDA library wrappers.enhancementNew feature or requestNew feature or request
Description
In the following 4 examples of sparse matrix and dense vector broadcasted multiplications we have 2 errors and 2 wrong outputs (dense matrix instead of sparse)
julia> using SparseArrays, CUDA
julia> CUDA.allowscalar(false)
julia> A = sparse([1. 0 2; 0 3 4; 0 5 6]) |> cu
3×3 CUDA.CUSPARSE.CuSparseMatrixCSC{Float32, Int32} with 6 stored entries:
1.0 ⋅ 2.0
⋅ 3.0 4.0
⋅ 5.0 6.0
julia> x = [2.,3,4] |> cu
3-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
2.0
3.0
4.0
julia> A .* x
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] assertscalar(op::String)
@ GPUArraysCore ~/.julia/packages/GPUArraysCore/lojQM/src/GPUArraysCore.jl:87
[3] getindex(xs::CuArray{Int32, 1, CUDA.Mem.DeviceBuffer}, I::Int64)
@ GPUArrays ~/.julia/packages/GPUArrays/fqD8z/src/host/indexing.jl:9
[4] getindex(A::CUDA.CUSPARSE.CuSparseMatrixCSC{Float32, Int32}, i0::Int64, i1::Int64)
@ CUDA.CUSPARSE ~/.julia/dev/CUDA/lib/cusparse/array.jl:311
[5] _getindex
@ ./abstractarray.jl:1291 [inlined]
[6] getindex
@ ./abstractarray.jl:1241 [inlined]
[7] _broadcast_getindex
@ ./broadcast.jl:623 [inlined]
[8] _getindex
@ ./broadcast.jl:666 [inlined]
[9] _broadcast_getindex
@ ./broadcast.jl:642 [inlined]
[10] getindex
@ ./broadcast.jl:597 [inlined]
[11] macro expansion
@ ./broadcast.jl:961 [inlined]
[12] macro expansion
@ ./simdloop.jl:77 [inlined]
[13] copyto!
@ ./broadcast.jl:960 [inlined]
[14] copyto!
@ ./broadcast.jl:913 [inlined]
[15] copy
@ ./broadcast.jl:885 [inlined]
[16] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayConflict, Nothing, typeof(*), Tuple{CUDA.CUSPARSE.CuSparseMatrixCSC{Float32, Int32}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}})
@ Base.Broadcast ./broadcast.jl:860
[17] top-level scope
@ REPL[23]:1
[18] top-level scope
@ ~/.julia/dev/CUDA/src/initialization.jl:155
julia> x .* A
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] assertscalar(op::String)
@ GPUArraysCore ~/.julia/packages/GPUArraysCore/lojQM/src/GPUArraysCore.jl:87
[3] getindex
@ ~/.julia/packages/GPUArrays/fqD8z/src/host/indexing.jl:9 [inlined]
[4] getindex
@ ~/.julia/packages/GPUArrays/fqD8z/src/host/indexing.jl:30 [inlined]
[5] _broadcast_getindex
@ ./broadcast.jl:636 [inlined]
[6] _getindex
@ ./broadcast.jl:666 [inlined]
[7] _broadcast_getindex
@ ./broadcast.jl:642 [inlined]
[8] getindex
@ ./broadcast.jl:597 [inlined]
[9] macro expansion
@ ./broadcast.jl:961 [inlined]
[10] macro expansion
@ ./simdloop.jl:77 [inlined]
[11] copyto!
@ ./broadcast.jl:960 [inlined]
[12] copyto!
@ ./broadcast.jl:913 [inlined]
[13] copy
@ ./broadcast.jl:885 [inlined]
[14] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayConflict, Nothing, typeof(*), Tuple{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CUDA.CUSPARSE.CuSparseMatrixCSC{Float32, Int32}}})
@ Base.Broadcast ./broadcast.jl:860
[15] top-level scope
@ REPL[26]:1
[16] top-level scope
@ ~/.julia/dev/CUDA/src/initialization.jl:155
julia> A .* x' # should give a CuSparseMatrix instead
3×3 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
2.0 0.0 8.0
0.0 9.0 16.0
0.0 15.0 24.0
julia> x' .* A # should give a CuSparseMatrix instead
3×3 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
2.0 0.0 8.0
0.0 9.0 16.0
0.0 15.0 24.0
Metadata
Metadata
Assignees
Labels
cuda librariesStuff about CUDA library wrappers.Stuff about CUDA library wrappers.enhancementNew feature or requestNew feature or request