dtenedor (Contributor) commented Oct 5, 2024

What changes were proposed in this pull request?

This PR adds SQL pipe syntax support for the set operations UNION, INTERSECT, and EXCEPT, along with their ALL and DISTINCT quantifiers.

For example:

CREATE TABLE t(x INT, y STRING) USING CSV;
INSERT INTO t VALUES (0, 'abc'), (1, 'def');

TABLE t
|> UNION ALL (SELECT * FROM t);

0	abc
0	abc
1	def
1	def
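
The same pattern should carry over to the other set operations named in this PR. The following is a minimal sketch, not taken from the patch: it reuses the table t from the example above, assumes the quantifier (ALL or DISTINCT) follows the operator keyword just as in the UNION ALL case, and uses a hypothetical filter (x > 0) purely for illustration. Results are omitted because the queries were not run against a build of this change.

```sql
-- Sketch only: assumes table t(x INT, y STRING) from the example above.

-- Keep the rows of t that also appear in the right-hand query.
TABLE t
|> INTERSECT DISTINCT (SELECT * FROM t WHERE x > 0);

-- Keep the rows of t that do not appear in the right-hand query.
TABLE t
|> EXCEPT ALL (SELECT * FROM t WHERE x > 0);

-- Non-pipe form of the first query, for comparison.
SELECT * FROM t
INTERSECT DISTINCT
SELECT * FROM t WHERE x > 0;
```

In each pipe query the left input of the set operation is whatever the preceding steps produced, so a set operation can sit at any point in a longer chain.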

Why are the changes needed?

The SQL pipe operator syntax lets users compose queries more flexibly by chaining operators with |>. With this change, set operations can take part in such a chain.

Does this PR introduce any user-facing change?

Yes, see above.

How was this patch tested?

This PR adds a few unit test cases but relies mostly on golden file test coverage. The golden files let us check that the query results are correct as the feature is implemented, and they also let us inspect the analyzer output plans to confirm that they look right.

Was this patch authored or co-authored using generative AI tooling?

No

github-actions bot added the SQL label Oct 5, 2024
dtenedor changed the title from "[WIP][SPARK-49564][SQL] Add SQL pipe syntax for set operations" to "[SPARK-49564][SQL] Add SQL pipe syntax for set operations" Oct 8, 2024
dtenedor marked this pull request as ready for review October 8, 2024 18:24
dtenedor (Contributor, Author) commented Oct 8, 2024

cc @cloud-fan @gengliangwang here is the support for UNION ALL and other set operations.

cloud-fan (Contributor) commented

thanks, merging to master!

cloud-fan closed this in 135cbc6 Oct 9, 2024
cloud-fan changed the title from "[SPARK-49564][SQL] Add SQL pipe syntax for set operations" to "[SPARK-49559][SQL] Add SQL pipe syntax for set operations" Oct 9, 2024
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024

Closes apache#48359 from dtenedor/pipe-union.

Authored-by: Daniel Tenedorio
Signed-off-by: Wenchen Fan