This project is an experiment in pushing the limits of multicore processing in R by integrating Rust’s high-performance, multi-threaded capabilities. The idea is to bypass some of R’s inherent single-threaded constraints by using unsafe Rust code, which we call the "dirty parallel" approach, with the goal of significant speedups.
- Build a Rust implementation of `lapply()`
  - Take varargs
  - Assert all args are named
  - Demand all args passed in by name
  - Permit arbitrary argument names
- Implement a dirty parallel version of `parallel::mclapply`
  - Use shared memory access
  - ?Disable R's garbage collector during execution
  - Memory watchdog using sysinfo
  - Abort threads on memory threshold
  - ?Expose memory threshold and interval as parameters from R
  - ?Ensure threads stop cleanly after an abort signal
  - ?Add a retry loop for failed threads
  - Return partial results with a retry status (e.g., "RETRY FAILED")
  - ?Add exponential backoff or a cooldown delay between retries
- Create a `safe_for_parallel()` validator
  - Fail early if a function is deemed unsafe
    - Must not use shared memory
    - Must not use unsafe globals (e.g., `<<-`, `assign`)
    - Must not call known side-effect functions
  - Use {lintr} to statically inspect functions
- Benchmark `raw_dirty_lapply()` vs. `mclapply()`
- Stress test with simulated GC and side-effect functions
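
The retry and abort items above could look roughly like the following. This is a minimal sketch using only the Rust standard library, not the project's actual implementation: `dirty_lapply_retry`, `Outcome`, and every parameter name are hypothetical, the workers operate on plain Rust values rather than R objects, and the `abort` flag stands in for whatever signal a sysinfo-based memory watchdog would raise.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

/// Per-element result. `RetryFailed` plays the role of the "RETRY FAILED"
/// status from the list above; `Aborted` marks work skipped after an abort.
#[derive(Debug, Clone, PartialEq)]
enum Outcome<T> {
    Done(T),
    RetryFailed,
    Aborted,
}

/// Hypothetical helper: run `f` on every item, one scoped thread per item.
/// A failed call is retried up to `max_retries` times with exponential
/// backoff (`base_delay`, then 2x, 4x, ...). Each attempt first checks the
/// shared `abort` flag, which a memory watchdog is assumed to set.
fn dirty_lapply_retry<T, R, F>(
    items: &[T],
    f: F,
    max_retries: u32,
    base_delay: Duration,
    abort: &AtomicBool,
) -> Vec<Outcome<R>>
where
    T: Sync,
    R: Send,
    F: Fn(&T, u32) -> Result<R, String> + Sync,
{
    thread::scope(|s| {
        let handles: Vec<_> = items
            .iter()
            .map(|item| {
                let f = &f;
                s.spawn(move || {
                    for attempt in 0..=max_retries {
                        // Cooperative cancellation: stop before starting new work.
                        if abort.load(Ordering::Relaxed) {
                            return Outcome::Aborted;
                        }
                        if let Ok(value) = f(item, attempt) {
                            return Outcome::Done(value);
                        }
                        // Back off before the next attempt, doubling each time.
                        if attempt < max_retries {
                            let factor = 2u32.saturating_pow(attempt);
                            thread::sleep(base_delay.saturating_mul(factor));
                        }
                    }
                    Outcome::RetryFailed
                })
            })
            .collect();

        // Results come back in input order; a panicked worker is reported
        // as a failed element instead of tearing down the whole call.
        handles
            .into_iter()
            .map(|h| h.join().unwrap_or(Outcome::RetryFailed))
            .collect()
    })
}
```

Because every element gets its own `Outcome`, the caller receives partial results even when some elements never succeed. Spawning one thread per element keeps the sketch short; a real version would more likely use a bounded worker pool.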