ROC kernel faulting upon having AMDGPU and CUDA loaded #362

@luraess

Executing a ROCKernel fails if both AMDGPU and CUDA are loaded. The order matters: if CUDA is loaded first, no error occurs.
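For contrast with the failing session, the load order that does not error can be sketched as below. This is only a minimal sketch based on the observation above; it assumes a machine with a working ROCm setup and both packages installed, and it has not been checked against other package versions.

```julia
# Load order reported as working in this issue: CUDA before AMDGPU.
using CUDA      # loaded first
using AMDGPU    # loaded second

nx = ny = 11
# Same call that fails when AMDGPU is loaded before CUDA.
c = AMDGPU.ones(Float64, nx, ny, 3)
```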

If AMDGPU is loaded first, then CUDA:

```julia
julia> using AMDGPU

julia> import AMDGPU: GPUCompiler

julia> methods(GPUCompiler.runtime_module)
# 2 methods for generic function "runtime_module" from GPUCompiler:
 [1] runtime_module(::GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.ROCCompilerParams})
     @ AMDGPU.Compiler /scratch/project_465000139/lurass/julia_local/julia_depot/packages/AMDGPU/ia4rb/src/compiler.jl:59
 [2] runtime_module(job::GPUCompiler.CompilerJob)
     @ /scratch/project_465000139/lurass/julia_local/julia_depot/packages/GPUCompiler/07qaN/src/interface.jl:188

julia> nx=ny=11
11

julia> b = AMDGPU.ones(Float64, nx, ny);

julia> using CUDA

julia> methods(GPUCompiler.runtime_module)
# 3 methods for generic function "runtime_module" from GPUCompiler:
 [1] runtime_module(::GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.ROCCompilerParams})
     @ AMDGPU.Compiler /scratch/project_465000139/lurass/julia_local/julia_depot/packages/AMDGPU/ia4rb/src/compiler.jl:59
 [2] runtime_module(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams})
     @ CUDA /scratch/project_465000139/lurass/julia_local/julia_depot/packages/CUDA/DfvRa/src/compiler/gpucompiler.jl:58
 [3] runtime_module(job::GPUCompiler.CompilerJob)
     @ /scratch/project_465000139/lurass/julia_local/julia_depot/packages/GPUCompiler/07qaN/src/interface.jl:188

julia> c = AMDGPU.ones(Float64, nx, ny, 3)
ERROR: InvalidIRError: compiling kernel #5#6(AMDGPU.ROCKernelContext, ROCDeviceArray{Float64, 3, 1}, Float64) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to __ockl_hsa_signal_cas)
Stacktrace:
```

Full stack trace: Stacktrace.txt

So it looks like using CUDA changes something in GPUCompiler that now breaks ROCKernel compilation/execution 👀.

cc @vchuravy

Versions: AMDGPU#master, CUDA v3.12, Julia 1.9.0-DEV.1584

Closing JuliaGPU/AMDGPU.jl#312 in favour of this.
