@wuxianxingkong

What changes were proposed in this pull request?

This PR implements the SELECT INTO statement.

The SELECT INTO statement selects data from one table and inserts it into a new table as follows.

SELECT column_name(s)
INTO newtable
FROM table1;

This statement is common in SQL but is not currently supported in Spark SQL.
We investigated Catalyst and found that it can be implemented by extending the grammar and reusing the logical plan of CTAS (CREATE TABLE AS SELECT).
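
To illustrate the idea, here is a minimal sketch of the rewrite over a toy plan model. The names `Plan`, `Project`, `CreateTableAsSelect`, `SelectInto`, and `toCtas` are hypothetical and exist only for this example; they are not Catalyst's actual classes or the code in this patch.

```scala
// Toy model only: hypothetical stand-ins, not Catalyst's real plan nodes.
sealed trait Plan
case class Project(columns: Seq[String], from: String) extends Plan
case class CreateTableAsSelect(table: String, query: Plan) extends Plan

// Assumed shape of a parsed "SELECT cols INTO target FROM source" statement.
case class SelectInto(columns: Seq[String], into: String, from: String)

object SelectIntoRewrite {
  // SELECT cols INTO t FROM src  is treated like  CREATE TABLE t AS SELECT cols FROM src
  def toCtas(stmt: SelectInto): Plan =
    CreateTableAsSelect(stmt.into, Project(stmt.columns, stmt.from))
}
```

For example, `SelectIntoRewrite.toCtas(SelectInto(Seq("c1"), "newtable", "table1"))` yields the same toy plan as a CTAS for `newtable` over `SELECT c1 FROM table1`.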

The related JIRA is https://issues.apache.org/jira/browse/SPARK-16217

How was this patch tested?

SQLQuerySuite.

@AmplabJenkins

Can one of the admins verify this patch?

fromClause?
(WHERE where=booleanExpression)?)
- | ((kind=SELECT setQuantifier? namedExpressionSeq fromClause?
+ | ((kind=SELECT setQuantifier? namedExpressionSeq (intoClause? fromClause)?
Member

Hi, @wuxianxingkong.
The following case does not seem to be handled yet. Could you extend the syntax to support it too?

SELECT 1
INTO newtable

Author

@wuxianxingkong Jul 15, 2016

Hello, @dongjoon-hyun, thank you for your advice.

SELECT 1
INTO newtable

This won't work because we need information from oldtable to create newtable, so the SQL should be:

SELECT 1
INTO newtable
FROM oldtable

In my test, this created a new table called newtable with a single column named 1, with as many rows as oldtable and every value equal to 1.
Did you mean the case where there is no FROM clause?

Member

In the Spark shell, please run the following.

sql("select 1")

Author

@wuxianxingkong Jul 16, 2016

@dongjoon-hyun
At first, I modified the grammar like this:
[image: wrong select into]
But this affects the multiInsertQueryBody rule, e.g.:

FROM OLD_TABLE
INSERT INTO T1
SELECT C1
INSERT INTO T2
SELECT C2

The syntax tree before adding intoClause is:
[image: right tree structure]
After adding intoClause, the tree becomes:
[image: wrong tree structure]
This happens because INSERT is a non-reserved keyword, combined with ANTLR's matching strategy.
One fix I can think of is to change the grammar like this:
[image: one way]
This solves the problem because the ANTLR parser chooses the alternative that is specified first.
The grammar now supports "SELECT 1 INTO newtable".
However, the duplication makes the querySpecification rule confusing. Is there a way to make the syntax less verbose? Thanks.

@dongjoon-hyun
Member

Hi, @wuxianxingkong.
Although I'm just a contributor like you, I left a few comments for you because I like your PR.
I hope your PR will be merged soon.

// Add organization statements.
optionalMap(ctx.queryOrganization)(withQueryResultClauses).
// Add insert.
optionalMap(ctx.insertInto())(withInsertInto)
Contributor

This allows for the following syntax:

INSERT INTO tbl_a
SELECT *
INTO tbl_a
FROM tbl_b

Please make sure that INSERT INTO and SELECT ... INTO cannot both appear in the same query.
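
One possible shape for such a check, sketched over a toy model rather than the real parser context. `SingleInsertQuery`, `SingleInsertCheck`, and `validate` are hypothetical names for illustration, and how the error should be surfaced (exception type, message) is left open here.

```scala
// Toy model: hypothetical stand-in for a parsed single-insert query, not Spark's parser context.
case class SingleInsertQuery(
    insertIntoTarget: Option[String], // table named by a leading INSERT INTO, if any
    selectIntoTarget: Option[String]) // table named by SELECT ... INTO, if any

object SingleInsertCheck {
  // Returns Left(error) when both clauses are present, otherwise Right(query).
  def validate(q: SingleInsertQuery): Either[String, SingleInsertQuery] =
    (q.insertIntoTarget, q.selectIntoTarget) match {
      case (Some(_), Some(_)) =>
        Left("INSERT INTO and SELECT ... INTO cannot be used in the same query")
      case _ => Right(q)
    }
}
```

With this sketch, the statement above corresponds to `SingleInsertQuery(Some("tbl_a"), Some("tbl_a"))` and is rejected, while a query with only one of the two clauses passes through unchanged.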

Contributor

We also need to check what this does with multi-insert syntax, i.e.:

FROM tbl_a
INSERT INTO tbl_b
SELECT *
INSERT INTO tbl_c
SELECT *
INTO tbl_c
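
For the multi-insert case, a similar check could walk every branch. This is only a sketch under assumed types: `InsertBranch`, `MultiInsertQuery`, and `MultiInsertCheck` are hypothetical names, not classes from Spark.

```scala
// Toy model: hypothetical stand-ins for the branches of a multi-insert query.
case class InsertBranch(
    insertIntoTarget: String,         // table named by INSERT INTO
    selectIntoTarget: Option[String]) // table named by SELECT ... INTO in this branch, if any

case class MultiInsertQuery(from: String, branches: Seq[InsertBranch])

object MultiInsertCheck {
  // Collects the INSERT targets of every branch that also carries SELECT ... INTO.
  def offendingBranches(q: MultiInsertQuery): Seq[String] =
    q.branches.collect { case InsertBranch(target, Some(_)) => target }

  // A multi-insert query is accepted only if no branch uses SELECT ... INTO.
  def isValid(q: MultiInsertQuery): Boolean = offendingBranches(q).isEmpty
}
```

The query above would be modelled as `MultiInsertQuery("tbl_a", Seq(InsertBranch("tbl_b", None), InsertBranch("tbl_c", Some("tbl_c"))))`, and the sketch flags its second branch.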

2. Add a check in the multi-insert query syntax: do not allow multi-insert and SELECT INTO to appear at the same time.
3. Add a check in the single-insert query: do not allow INSERT INTO and SELECT INTO to appear at the same time.
*/
protected def withSelectInto(
ctx: IntoClauseContext,
query: LogicalPlan): LogicalPlan = withOrigin(ctx) {
Member

@gatorsmile Jun 13, 2017

Why throw a ParseException here?

@gatorsmile
Member

@wuxianxingkong Are you still working on this? Thanks!

@gatorsmile
Member

We are closing this due to inactivity. Please reopen it if you want to push it forward. Thanks!

@asfgit asfgit closed this in b32bd00 Jun 27, 2017