Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 10 additions & 5 deletions datafusion/execution/src/memory_pool/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,8 @@
// specific language governing permissions and limitations
// under the License.

//! Manages all available memory during query execution
//! [`MemoryPool`] for memory management during query execution, [`proxy]` for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is proxy? I found the proxy.rs, is it correct?

//! help with allocation accounting.

use datafusion_common::Result;
use std::{cmp::Ordering, sync::Arc};
Expand Down Expand Up @@ -56,7 +57,7 @@ pub use pool::*;
/// kills the process, DataFusion `ExecutionPlan`s (operators) that consume
/// large amounts of memory must first request their desired allocation from a
/// [`MemoryPool`] before allocating more. The request is typically managed via
/// a [`MemoryReservation`].
/// a [`MemoryReservation`] and [`MemoryConsumer`].
///
/// If the allocation is successful, the operator should proceed and allocate
/// the desired memory. If the allocation fails, the operator must either first
Expand Down Expand Up @@ -107,9 +108,13 @@ pub trait MemoryPool: Send + Sync + std::fmt::Debug {
fn reserved(&self) -> usize;
}

/// A memory consumer that can be tracked by [`MemoryReservation`] in
/// a [`MemoryPool`]. All allocations are registered to a particular
/// `MemoryConsumer`;
/// A memory consumer is a named allocation traced by a particular
/// [`MemoryReservation`] in a [`MemoryPool`]. All allocations are registered to
/// a particular `MemoryConsumer`;
///
/// For help with allocation accounting, see the [proxy] module.
///
/// [proxy]: crate::memory_pool::proxy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see what is proxy now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I also find proxy non intuitive -- the extension traits aren't really proxies for anything. Maybe we should move them to crate::memory_pool::extensions or crate::memory_pool::utils 🤔

I suppose at least this PR documents them which is an improvement

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we once had an actual proxy object that could be used to represent the memory for another more complex data structure, but this design was since changed. So only the helper traits remain, which I think should be under utils or extensions. So feel free to move them.

#[derive(Debug)]
pub struct MemoryConsumer {
name: String,
Expand Down
23 changes: 16 additions & 7 deletions datafusion/execution/src/memory_pool/proxy.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@
// specific language governing permissions and limitations
// under the License.

//! Utilities that help with tracking of memory allocations.
//! [`VecAllocExt`] and [`RawTableAllocExt`] to help tracking of memory allocations

use hashbrown::raw::{Bucket, RawTable};

Expand All @@ -24,12 +24,18 @@ pub trait VecAllocExt {
/// Item type.
type T;

/// [Push](Vec::push) new element to vector and store additional allocated bytes in `accounting` (additive).
/// [Push](Vec::push) new element to vector and increase
/// `accounting` by any newly allocated bytes.
///
/// Note that allocation counts capacity, not size
fn push_accounted(&mut self, x: Self::T, accounting: &mut usize);

/// Return the amount of memory allocated by this Vec (not
/// recursively counting any heap allocations contained within the
/// structure). Does not include the size of `self`
/// Return the amount of memory allocated by this Vec to store elements
/// (`size_of<T> * capacity`).
///
/// Note this calculation is not recursive, and does not include any heap
/// allocations contained within the Vec's elements. Does not include the
/// size of `self`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when reviewing this code, I note that datafusion has a custom growth strategy (2x) which is wasteful with large allocations -- I think we should perhaps go back to using the built in rust growth strategy. I'll file a PR / ticket with this idea later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #9057

fn allocated_size(&self) -> usize;
}

Expand All @@ -54,12 +60,15 @@ impl<T> VecAllocExt for Vec<T> {
}
}

/// Extension trait for [`RawTable`] to account for allocations.
/// Extension trait for hash browns [`RawTable`] to account for allocations.
pub trait RawTableAllocExt {
/// Item type.
type T;

/// [Insert](RawTable::insert) new element into table and store additional allocated bytes in `accounting` (additive).
/// [Insert](RawTable::insert) new element into table and increase
/// `accounting` by any newly allocated bytes.
///
/// Returns the bucket where the element was inserted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH this design is "good enough", but we should probably add some examples at some point (doesn't have to be in this PR though). Same is actually true for the other trait.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs + remove custom strategy: #9058

docs:#9059

fn insert_accounted(
&mut self,
x: Self::T,
Expand Down