-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Closed
Labels
enhancementNew feature or requestNew feature or request
Description
UPDATE June 2024: DataFusion does not use Morsel Driven Parallelism, instead it uses volcano pull + exchange style execution
You can read more about details and analysis in https://dl.acm.org/doi/10.1145/3626246.3653368
A proposal for reformulating the parallelism story within DataFusion to use a morsel-driven approach based on rayon. More details, background, and discussion can be found in the proposal document here, please feel free to comment there.
The keys highlights are:
- Decouples parallelism from the partitioning expressed in the physical plan, allowing for:
- Better handling of imbalanced partitions
- Adaptive parallelism based on compute availability at execution time
- Parallelism within a partition, such as decoding parquet columns in parallel, parallel sort, etc...
- The first step to reducing the complexity associated with the current futures-based concurrency model
- Improvements to thread-locality, observability and performance
xudong963, Dandandan, yjshen, yordan-pavlov, mingmwang and 3 more
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request