-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Speed up collect_bool and remove unsafe, optimize take_bits, take_native for null values
#8849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
unsafeunsafe, optimize take_bits a bit more
unsafe, optimize take_bits a bit morecollect_bool and remove unsafe, optimize take_bits a bit more
collect_bool and remove unsafe, optimize take_bits a bit morecollect_bool and remove unsafe, optimize take_bits, take_native a bit more for null values
collect_bool and remove unsafe, optimize take_bits, take_native a bit more for null valuescollect_bool and remove unsafe, optimize take_bits, take_native for null values
|
🤖 |
|
🤖: Benchmark completed Details
|
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @Dandandan -- I think this looks great to me. The benchmarks look good and I reviewed the code carefully.
I left some other suggestions, but I don't think they are required
🚢 🇮🇹
| /// the option to efficiently negate the result | ||
| fn collect_bool(len: usize, neg: bool, f: impl Fn(usize) -> bool) -> BooleanBuffer { | ||
| let mut buffer = MutableBuffer::new(ceil(len, 64) * 8); | ||
| let mut buffer = Vec::with_capacity(ceil(len, 64)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could make the neg a generic argument on MutableBuffer::collect_bool and then avoid the duplication (as a follow on PR)
Or maybe make a collect_bool function in bit_util that returns a Vec and have the mutable buffer and this one call it 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, I was thinking about it as well.
| None => match n.is_null(idx) { | ||
| true => T::default(), | ||
| false => panic!("Out-of-bounds index {index:?}"), | ||
| // SAFETY: idx<indices.len() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to read this several times to convince myself it is correct -- namely that idx i doesn't come from indices (provided by the user) but instead comes from iterating indices
I found this whole method actually pretty confusing as there are multiple things called values and indices (and indices.values()..)
I also double checked that there is a test for out of bounds indexes here:
https://github.com/apache/arrow-rs/blob/f93da94e61e731344ce84146dee946a94fe36602/arrow-select/src/take.rs#L2084-L2083
| false => panic!("Out-of-bounds index {index:?}"), | ||
| // SAFETY: idx<indices.len() | ||
| None => match unsafe { n.inner().value_unchecked(idx) } { | ||
| false => T::default(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dandandan do you know why this even bothers to look at the null buffer again? I realize you didn't change this code, but it seems to me like checking n.inner() (the nulls) is unecessary - it was already implicitly checked by calling values.get() (which returns Some/None).
It seems like all this is doing is re-checking that value() and the nulls match up.
so, TLDR I suggest we remove this clause entirely (could be a follow on PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can't remove it - it checks the indices value is null as well to make sure out of bounds on a non-null value leads to a panic.
So currently:
- out of bound with a null index value => default value (+ null in the output)
- out of bounds with a non-null value => panic
We could consider out of bounds either always panics or always gives a default (0) value but the current API (and tests) requires it to be this way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic looks correct to me, but is not really intuitive. In our microbenchmarks the current code might faster because it avoids a branch on the validity bit in the happy case, but I'm not sure that will still be faster on larger inputs, or if a larger amount of indices is null.
I would find the following more intuitive, and hopefully not much slower (slice, range and zip iteration should all be TrustedLen):
indices
.values()
.iter()
.zip((0..n.len()).map(move |i| unsafe { n.inner().value_unchecked(i) }))
.map(|(index, valid)| if valid {
values[index.as_usize()]
} else {
T::default()
})
.collect()There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a PR with this change:
Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
This is the only one I think that could be a regression I think but AFAIK the take kernel benches were a bit noisy on the instance before as well. On my (apple m2) they are consistently somewhat better. |
given the overall time is measured in Maybe we should crank up the size of those batches to 8192 or something to better match typical usecases |
|
Looks good to me, nice performance improvement! |
# Which issue does this PR close? We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. - Follow on to #8849 # Rationale for this change While reviewing #8849 from @Dandandan I realized we don't have good examples for going from MutableBuffer to Vec, so it may not be clear to users how to use it. So let's add some examples in the docs for MutableBuffer # What changes are included in this PR? Add examples / update docs # Are these changes tested? By CI # Are there any user-facing changes? New docs, no functional changes --------- Co-authored-by: Ed Seidl <[email protected]>
Which issue does this PR close?
We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax.
unsafe#8848Rationale for this change
unsafeWhy are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.
What changes are included in this PR?
Changes to use
VecAPI (extend, push) rather thanMutableBufferand unsafe code.Are these changes tested?
Existing tests
Are there any user-facing changes?
No