When using an FFI for TVMBackendParallelLaunch, heap-allocating even a single byte corrupts the resulting computation.
One possible cause is an unintentional malloc/free happening while the flambda closure is constructed.
Another (probably more likely) possibility is that I've set a struct field incorrectly somewhere. Parallel execution of basic TVM ops works, after all.
For reference, Rust uses jemalloc.
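To make the struct-field theory easier to check, here is a minimal sketch of the layout the Rust side has to match. It assumes the declarations in TVM's `c_backend_api.h` (`TVMParallelGroupEnv` with a `sync_handle` pointer followed by an `int32_t num_task`, and a lambda taking `task_id`, `penv`, `cdata`). This is not the tvm-rs code: `parallel_launch_serial` and `fill_slot` are hypothetical names, the launcher runs tasks serially on the calling thread, and treating `num_task <= 0` as one task is a simplification for the sketch.

```rust
use std::os::raw::{c_int, c_void};
use std::ptr;

/// Assumed to mirror `TVMParallelGroupEnv`; `#[repr(C)]` keeps field order and padding C-compatible.
#[repr(C)]
pub struct TVMParallelGroupEnv {
    pub sync_handle: *mut c_void,
    pub num_task: i32,
}

/// Assumed to mirror `FTVMParallelLambda`.
pub type FTVMParallelLambda =
    extern "C" fn(c_int, *mut TVMParallelGroupEnv, *mut c_void) -> c_int;

/// Hypothetical serial stand-in for a parallel launch: runs every task in turn.
pub extern "C" fn parallel_launch_serial(
    flambda: FTVMParallelLambda,
    cdata: *mut c_void,
    num_task: c_int,
) -> c_int {
    // A one-byte heap allocation, similar in spirit to the one that triggers the bug.
    let _probe: Box<u8> = Box::new(0);

    // Simplification for this sketch: non-positive counts run a single task.
    let num_task = if num_task <= 0 { 1 } else { num_task };
    let mut penv = TVMParallelGroupEnv {
        sync_handle: ptr::null_mut(),
        num_task,
    };
    for task_id in 0..num_task {
        let status = flambda(task_id, &mut penv, cdata);
        if status != 0 {
            return status; // propagate the first non-zero status
        }
    }
    0
}

/// Hypothetical lambda: `cdata` points at an `i32` array with one slot per task.
pub extern "C" fn fill_slot(
    task_id: c_int,
    _penv: *mut TVMParallelGroupEnv,
    cdata: *mut c_void,
) -> c_int {
    unsafe {
        *(cdata as *mut i32).add(task_id as usize) = task_id + 1;
    }
    0
}
```

If a serial launcher like this produces correct results with the allocation in place, the struct layout is probably fine and the closure construction or the threading path becomes the more likely suspect.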
Steps to reproduce
curl https://sh.rustup.rs -sSf | sh
rustup default nightly-2018-04-11
git clone https://github.com/nhynes/tvm-rs
cd tvm-rs/tests/test_nnvm
export TVM_NUM_THREADS=0
cargo run # works
cargo run --features par-launch-alloc # does not work