Skip to content

Conversation

@JasonLi-cn
Copy link
Contributor

@JasonLi-cn JasonLi-cn commented Aug 12, 2022

Which issue does this PR close?

Closes #3103 (comment)

Rationale for this change

Support Timestamp +/- Interval

What changes are included in this PR?

Update DateIntervalExpr‘ evaluate() function mainly:
https://github.com/apache/arrow-datafusion/blob/b80c85359bdfc6d483cb2f222a988a9125ed8670/datafusion/physical-expr/src/expressions/datetime.rs#L88-L160

Are there any user-facing changes?

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates labels Aug 12, 2022
@JasonLi-cn JasonLi-cn closed this Aug 12, 2022
@JasonLi-cn JasonLi-cn reopened this Aug 12, 2022
@codecov-commenter
Copy link

Codecov Report

Merging #3110 (51888a1) into master (9956f80) will decrease coverage by 0.01%.
The diff coverage is 74.07%.

@@            Coverage Diff             @@
##           master    #3110      +/-   ##
==========================================
- Coverage   85.95%   85.93%   -0.02%     
==========================================
  Files         290      290              
  Lines       52260    52356      +96     
==========================================
+ Hits        44919    44993      +74     
- Misses       7341     7363      +22     
Impacted Files Coverage Δ
datafusion/physical-expr/src/planner.rs 92.36% <ø> (ø)
...tafusion/physical-expr/src/expressions/datetime.rs 82.20% <73.33%> (-4.60%) ⬇️
datafusion/expr/src/binary_rule.rs 84.70% <100.00%> (+0.08%) ⬆️
datafusion/optimizer/src/eliminate_limit.rs 100.00% <0.00%> (ø)
...atafusion/physical-expr/src/aggregate/array_agg.rs 95.91% <0.00%> (+0.04%) ⬆️
datafusion/expr/src/logical_plan/plan.rs 77.60% <0.00%> (+0.17%) ⬆️
datafusion/core/src/physical_plan/metrics/value.rs 87.43% <0.00%> (+0.50%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @JasonLi-cn - this is a great contribution

The only thing I would like to see are some "integration" level SQL tests of this feature -- perhaps we could extend the test / add some new tests in https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/tests/sql/timestamp.rs

cc @avantgardnerio who added the initial version of this functionality in #2797

@github-actions github-actions bot added the core Core DataFusion crate label Aug 15, 2022
Copy link
Contributor

@avantgardnerio avantgardnerio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits. The functional change looks great, ty!

) -> Result<ColumnarValue> {
let ret = match array.data_type() {
DataType::Date32 => {
let array = array.as_any().downcast_ref::<Date32Array>().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this shouldn't panic, but since this is returning a result anyway, would it improve auditability to use the error propagation operator here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this?

let array = array.as_any().downcast_ref::<Date32Array>().ok_or(
    DataFusionError::Execution("Cast Date32Array error".to_string()),
)?;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit more verbose, but I think it makes it easier to tell that it will never panic. @andygrove @alamb do you have thoughts on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think by this point in the plan it is ok to panic! as it indicates a bug in the code and I think having to always do the ceremony to check the type makes it less readable. Maybe we could make a function like as_date32_array that returned an error (similar to the arrow ones that panic?): https://docs.rs/arrow/20.0.0/arrow/array/index.html#functions ?

Sadly it appears we are inconsistent in the codebase with this regards: https://github.com/apache/arrow-datafusion/search?l=Rust&q=downcast 😢

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in other words, I think this PR is ok as written.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will file a ticket as this is confusing and it would be nice to get DataFusion consistent

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #3152

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks great @JasonLi-cn so I plan to merge it. There are some good potential follow ons (like upstreaming the date/time arithmetic to arrow as well as standardizing the handling of "what to do when downcasting and array fails") but I think they would be best done in a follow on PR.

) -> Result<ColumnarValue> {
let ret = match array.data_type() {
DataType::Date32 => {
let array = array.as_any().downcast_ref::<Date32Array>().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think by this point in the plan it is ok to panic! as it indicates a bug in the code and I think having to always do the ceremony to check the type makes it less readable. Maybe we could make a function like as_date32_array that returned an error (similar to the arrow ones that panic?): https://docs.rs/arrow/20.0.0/arrow/array/index.html#functions ?

Sadly it appears we are inconsistent in the codebase with this regards: https://github.com/apache/arrow-datafusion/search?l=Rust&q=downcast 😢

) -> Result<ColumnarValue> {
let ret = match array.data_type() {
DataType::Date32 => {
let array = array.as_any().downcast_ref::<Date32Array>().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in other words, I think this PR is ok as written.

@alamb
Copy link
Contributor

alamb commented Aug 15, 2022

I plan to merge this PR in when the checks pass

@ursabot
Copy link

ursabot commented Aug 15, 2022

Benchmark runs are scheduled for baseline = 3666305 and contender = 6509d0d. 6509d0d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@JasonLi-cn JasonLi-cn deleted the feature/support_timestamp_plus_minus_interval branch August 16, 2022 02:06
@JasonLi-cn JasonLi-cn restored the feature/support_timestamp_plus_minus_interval branch August 16, 2022 03:56
@JasonLi-cn JasonLi-cn deleted the feature/support_timestamp_plus_minus_interval branch August 16, 2022 04:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: support Timestamp +/- Interval

5 participants