
Conversation

@capkurmagati
Contributor

@capkurmagati commented Oct 29, 2021

Which issue does this PR close?

Closes #1196.

  1. Added match arms for Expr::Between, Expr::Sort, and Expr::Wildcard and removed the default match arm in expr.rs (see the sketch after the CLI output below).
  2. Added a match arm for Expr::Between in planner.rs.
echo "1" > /tmp/foo.csv
cargo run -p datafusion-cli
CREATE EXTERNAL TABLE foo(x int)
STORED AS CSV
LOCATION '/tmp/foo.csv';
>  select 1 between 5 and 10;
+----------------+
| Boolean(false) |
+----------------+
| false          |
+----------------+
1 row in set. Query took 0.003 seconds.

> select 1 between 5 and 10 from foo;
+----------------+
| Boolean(false) |
+----------------+
| false          |
+----------------+
1 row in set. Query took 0.011 seconds.

> select x between 5 and 10 from foo;
+--------------------------------------+
| foo.x BETWEEN Int64(5) AND Int64(10) |
+--------------------------------------+
| false                                |
+--------------------------------------+
1 row in set. Query took 0.012 seconds.

> explain select x between 5 and 10 from foo;
+---------------+---------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                                                |
+---------------+---------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Projection: #foo.x BETWEEN Int64(5) AND Int64(10)                                                                   |
|               |   TableScan: foo projection=Some([0])                                                                               |
| physical_plan | ProjectionExec: expr=[CAST(x@0 AS Int64) >= 5 AND CAST(x@0 AS Int64) <= 10 as foo.x BETWEEN Int64(5) AND Int64(10)] |
|               |   RepartitionExec: partitioning=RoundRobinBatch(8)                                                                  |
|               |     CsvExec: files=[/tmp/foo.csv], has_header=false, batch_size=8192, limit=None                                    |
|               |                                                                                                                     |
+---------------+---------------------------------------------------------------------------------------------------------------------+
2 rows in set. Query took 0.005 seconds.
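
To make the change easier to follow, here is a hypothetical, self-contained mirror of the naming logic, using a simplified enum rather than DataFusion's actual Expr; the field and helper names are illustrative only, not the exact diff.

// Hypothetical, simplified mirror of the naming logic; the real change
// matches on DataFusion's Expr inside create_name in expr.rs.
enum SimpleExpr {
    Column(String),
    Literal(i64),
    Between {
        expr: Box<SimpleExpr>,
        negated: bool,
        low: Box<SimpleExpr>,
        high: Box<SimpleExpr>,
    },
}

fn create_name(e: &SimpleExpr) -> String {
    match e {
        SimpleExpr::Column(c) => c.clone(),
        SimpleExpr::Literal(v) => format!("Int64({})", v),
        SimpleExpr::Between { expr, negated, low, high } => {
            let (e, l, h) = (create_name(expr), create_name(low), create_name(high));
            if *negated {
                format!("{} NOT BETWEEN {} AND {}", e, l, h)
            } else {
                format!("{} BETWEEN {} AND {}", e, l, h)
            }
        }
    }
}

fn main() {
    let between = SimpleExpr::Between {
        expr: Box::new(SimpleExpr::Column("foo.x".to_string())),
        negated: false,
        low: Box::new(SimpleExpr::Literal(5)),
        high: Box::new(SimpleExpr::Literal(10)),
    };
    // Prints the column header seen in the query output above:
    // foo.x BETWEEN Int64(5) AND Int64(10)
    println!("{}", create_name(&between));
}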


@jimexist
Member

thanks!

could you also add an integration test?

@capkurmagati
Contributor Author

capkurmagati commented Oct 31, 2021

@jimexist Sure. Since I'm new to the project, let me drop some newbie questions.

  1. planner.rs seems to have integration tests that use arrow-datafusion/testing. Is that the kind of test you want me to add?
  2. I couldn't find tests for create_name(e: &Expr, input_schema: &DFSchema) -> Result<String>. Do you also want me to add some tests for that?

@xudong963
Member

  • planner.rs seems to have integration tests that use arrow-datafusion/testing. Is that the kind of test you want me to add?
  • I couldn't find tests for create_name(e: &Expr, input_schema: &DFSchema) -> Result<String>. Do you also want me to add some tests for that?

I think you can first add unit tests in datafusion/tests/sql.rs, which has tests for other exprs.

@alamb
Contributor

alamb commented Nov 1, 2021

I think you can first add unit tests in datafusion/tests/sql.rs, which has tests for other exprs.

I agree with @xudong963 that a test in sql.rs would be sufficient for this PR. Thank you for the contribution @capkurmagati
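
For context, here is a rough sketch of the kind of test being suggested for datafusion/tests/sql.rs. The helper names (ExecutionContext, assert_batches_eq!) are assumptions based on the DataFusion API around this time and may differ from the test that was actually added.

// Sketch only -- API names are assumptions, not the merged test.
use datafusion::error::Result;
use datafusion::prelude::*;

#[tokio::test]
async fn query_between_expression() -> Result<()> {
    let mut ctx = ExecutionContext::new();
    let df = ctx.sql("SELECT 1 BETWEEN 5 AND 10 AS result").await?;
    let batches = df.collect().await?;
    let expected = vec![
        "+--------+",
        "| result |",
        "+--------+",
        "| false  |",
        "+--------+",
    ];
    datafusion::assert_batches_eq!(expected, &batches);
    Ok(())
}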

@capkurmagati
Contributor Author

@xudong963 @alamb Thanks for the advice. I added some tests for the expr. PTAL.
I will rebase the branch after addressing your review comments.

@xudong963
Member

xudong963 commented Nov 3, 2021

The rest LGTM; waiting for @alamb to review. Nice work @capkurmagati!

Contributor

@alamb left a comment

Thank you so much @capkurmagati -- the tests look great.

@alamb
Contributor

alamb commented Nov 3, 2021

There appears to be a conflict in sql.rs. I have merged up from main to resolve it.

@alamb
Contributor

alamb commented Nov 3, 2021

The test was failing because the executor has 2 cores while the expected output reflected 8 cores -- updated in 8eff3de.
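
As an aside, one common way to keep explain-plan assertions independent of the runner's core count is to normalize the machine-dependent part of the plan text before comparing. A hypothetical helper, not necessarily what 8eff3de does:

// Hypothetical normalizer for plan text in tests -- not the actual fix.
use regex::Regex;

fn normalize_partition_count(plan: &str) -> String {
    // Replace e.g. "RoundRobinBatch(8)" with "RoundRobinBatch(*)" so the
    // expected output does not depend on how many cores the runner has.
    let re = Regex::new(r"RoundRobinBatch\(\d+\)").unwrap();
    re.replace_all(plan, "RoundRobinBatch(*)").to_string()
}

fn main() {
    let plan = "RepartitionExec: partitioning=RoundRobinBatch(8)";
    assert_eq!(
        normalize_partition_count(plan),
        "RepartitionExec: partitioning=RoundRobinBatch(*)"
    );
    println!("ok");
}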

@capkurmagati force-pushed the issue/1196-select-between branch from 2ed58c3 to 15785ee on November 4, 2021 12:54
@capkurmagati
Contributor Author

@xudong963 Removed the default match arm and added match arms for Expr::Sort and Expr::Wildcard in create_physical_name, just like in create_name. PTAL.
@alamb Thanks so much for resolving the code conflict and fixing the test. Much appreciated.
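
For readers following along, the point of removing the catch-all arm is compile-time exhaustiveness: when a new Expr variant is added later, the compiler forces a decision in this match instead of silently routing it through a default branch. A hypothetical, self-contained illustration of the pattern, not the actual create_physical_name code:

// Hypothetical illustration of exhaustive matching vs. a `_ =>` default.
enum MiniExpr {
    Between,
    Sort,
    Wildcard,
}

fn physical_name(e: &MiniExpr) -> Result<String, String> {
    // No `_ =>` arm: adding a new MiniExpr variant becomes a compile error
    // here rather than silently falling into a default branch.
    match e {
        MiniExpr::Between => Ok("<expr> BETWEEN <low> AND <high>".to_string()),
        MiniExpr::Sort => Err("sort expressions do not have a physical name".to_string()),
        MiniExpr::Wildcard => Err("wildcard does not have a physical name".to_string()),
    }
}

fn main() {
    assert!(physical_name(&MiniExpr::Between).is_ok());
    assert!(physical_name(&MiniExpr::Sort).is_err());
    assert!(physical_name(&MiniExpr::Wildcard).is_err());
    println!("ok");
}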

Ok(format!("{} BETWEEN {} AND {}", expr, low, high))
}
}
Expr::Sort {
Member

You can directly use Expr::Sort { .. }
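
For reference, the suggestion is Rust's rest pattern: when none of a struct variant's fields are needed, .. ignores them all without naming each one. A hypothetical example, not the PR's code:

// Hypothetical example of the `..` rest pattern in a match arm.
enum E {
    Sort { asc: bool, nulls_first: bool },
    Other,
}

fn describe(e: &E) -> &'static str {
    match e {
        // Instead of `E::Sort { asc: _, nulls_first: _ } => ...`
        E::Sort { .. } => "sort",
        E::Other => "other",
    }
}

fn main() {
    assert_eq!(describe(&E::Sort { asc: true, nulls_first: false }), "sort");
    assert_eq!(describe(&E::Other), "other");
    println!("ok");
}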

Contributor Author

Thanks for the advice. Fixed.

Ok(format!("{} BETWEEN {} AND {}", expr, low, high))
}
}
Expr::Sort {
Member

ditto

@xudong963
Member

LGTM, thanks @capkurmagati. Let @alamb do the final review and merge.

Contributor

@alamb left a comment

This is looking great -- thank you @capkurmagati -- I'll plan to merge this once the CI checks pass

@alamb merged commit 8f1d533 into apache:master Nov 4, 2021
@alamb
Contributor

alamb commented Nov 4, 2021

Thanks again @capkurmagati !

@houqp added the enhancement (New feature or request) label Nov 5, 2021
@houqp added the sql (SQL Planner) label Nov 5, 2021
unkloud pushed a commit to unkloud/datafusion that referenced this pull request Mar 23, 2025
H0TB0X420 pushed a commit to H0TB0X420/datafusion that referenced this pull request Oct 7, 2025

Labels

enhancement (New feature or request), sql (SQL Planner)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Can not use between in the select list:

5 participants