rocm backend: crash with LLVM 8 #4087

@t-vi

On Debian (unstable), I have LLVM 8.0.1 (as default llvm) and LLVM 9.0.0 (with llvm-config-9).
With the rocm backend, compiling TVM master against LLVM 8.0.1 gives me crashes: `free(): invalid next size (normal)` in a debug build, reported as a double free without a debug build (see the stack trace below). When I build against LLVM 9, the crashes disappear.
Since newer LLVM versions are the way forward anyway, I suggest explicitly requiring LLVM >= 9 for the AMDGPU backend. If that is agreeable, I'd be happy to send a PR.
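
To make the proposal concrete, here is a minimal sketch of the kind of version gate I have in mind. It is not TVM's actual build logic: the names `MIN_LLVM_MAJOR_FOR_ROCM`, `llvm_major`, and `rocm_supported` are hypothetical, and the threshold of 9 comes only from the observation above that LLVM 9 does not crash.

```python
import re

# Assumption: 9 is the first LLVM major version that works with the rocm
# backend, based solely on the 8.0.1 vs 9.0.0 comparison in this report.
MIN_LLVM_MAJOR_FOR_ROCM = 9


def llvm_major(version: str) -> int:
    """Return the major version from a string such as '8.0.1' or '9.0.0svn'."""
    match = re.match(r"\s*(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised LLVM version string: {version!r}")
    return int(match.group(1))


def rocm_supported(version: str) -> bool:
    """True if this LLVM version meets the proposed minimum for rocm."""
    return llvm_major(version) >= MIN_LLVM_MAJOR_FOR_ROCM
```

The version string would typically be whatever `llvm-config --version` (or `llvm-config-9 --version`) prints; with the two versions installed here, `rocm_supported("8.0.1")` is false and `rocm_supported("9.0.0")` is true. The real check would more naturally live in the CMake configuration or behind a preprocessor guard.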

Here is a stack trace from running the reduction example:

#0  0x00007ffff7e02081 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7ded535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff7e43db8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007ffff7e4a48a in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007ffff7e4bdac in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007fffe4b9a1ef in llvm::Module::~Module() () from /usr/lib/llvm-8/lib/libLLVM-8.so.1
#6  0x00007fffe9c675de in std::default_delete<llvm::Module>::operator() (this=0x1a65ef0, __ptr=0xde52d0) at /usr/include/c++/9/bits/unique_ptr.h:81
#7  0x00007fffe9c64016 in std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >::~unique_ptr (this=0x1a65ef0, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/unique_ptr.h:284
#8  0x00007fffe9c74d0a in std::_Destroy<std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> > > (__pointer=0x1a65ef0) at /usr/include/c++/9/bits/stl_construct.h:98
#9  0x00007fffe9c71d18 in std::_Destroy_aux<false>::__destroy<std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >*> (__first=0x1a65ef0, __last=0x1a65ef8)
    at /usr/include/c++/9/bits/stl_construct.h:108
#10 0x00007fffe9c6dfaa in std::_Destroy<std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >*> (__first=0x1a65ef0, __last=0x1a65ef8) at /usr/include/c++/9/bits/stl_construct.h:137
#11 0x00007fffe9c677d9 in std::_Destroy<std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >*, std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> > > (__first=0x1a65ef0, 
    __last=0x1a65ef8) at /usr/include/c++/9/bits/stl_construct.h:206
#12 0x00007fffe9c64279 in std::vector<std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >, std::allocator<std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> > > >::~vector (
    this=0x1ad90e8, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/stl_vector.h:677
#13 0x00007fffe9c62e44 in tvm::codegen::CodeGenLLVM::~CodeGenLLVM (this=0x1ad9040, __in_chrg=<optimized out>) at /home/tv/pytorch/tvm/tvm/src/codegen/llvm/codegen_llvm.h:50
#14 0x00007fffe9c67b3c in tvm::codegen::CodeGenAMDGPU::~CodeGenAMDGPU (this=0x1ad9040, __in_chrg=<optimized out>) at /home/tv/pytorch/tvm/tvm/src/codegen/llvm/codegen_amdgpu.cc:40
#15 0x00007fffe9c67b5e in tvm::codegen::CodeGenAMDGPU::~CodeGenAMDGPU (this=0x1ad9040, __in_chrg=<optimized out>) at /home/tv/pytorch/tvm/tvm/src/codegen/llvm/codegen_amdgpu.cc:40
#16 0x00007fffe9c67b9a in std::default_delete<tvm::codegen::CodeGenAMDGPU>::operator() (this=0x7fffffffb960, __ptr=0x1ad9040) at /usr/include/c++/9/bits/unique_ptr.h:81
#17 0x00007fffe9c64508 in std::unique_ptr<tvm::codegen::CodeGenAMDGPU, std::default_delete<tvm::codegen::CodeGenAMDGPU> >::~unique_ptr (this=0x7fffffffb960, __in_chrg=<optimized out>)
    at /usr/include/c++/9/bits/unique_ptr.h:284
#18 0x00007fffe9c5fe0f in tvm::codegen::BuildAMDGPU (funcs=..., target="rocm") at /home/tv/pytorch/tvm/tvm/src/codegen/llvm/codegen_amdgpu.cc:195
#19 0x00007fffe942b8fa in tvm::runtime::detail::unpack_call_dispatcher<tvm::runtime::Module, 0, 2, tvm::runtime::Module (*)(tvm::Array<tvm::LoweredFunc, void>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)>::run<tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue> (
    f=@0x1a09ae0: 0x7fffe9c5e971 <tvm::codegen::BuildAMDGPU(tvm::Array<tvm::LoweredFunc, void>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)>, args_pack=..., 
    rv=0x7fffffffcbd0, unpacked_args#0=..., unpacked_args#1=...) at /home/tv/pytorch/tvm/tvm/include/tvm/runtime/packed_func.h:1225
