Skip to content

Conversation

@gengliangwang
Copy link
Member

@gengliangwang gengliangwang commented Mar 25, 2019

What changes were proposed in this pull request?

Migrate Text source to File Data Source V2

How was this patch tested?

Unit test

@SparkQA
Copy link

SparkQA commented Mar 25, 2019

Test build #103922 has finished for PR 24207 at commit 96664cc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class TextWriteBuilder(

@SparkQA
Copy link

SparkQA commented Mar 26, 2019

Test build #103943 has finished for PR 24207 at commit 520cb67.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Mar 26, 2019

Test build #103962 has finished for PR 24207 at commit 520cb67.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 27, 2019

Test build #104004 has finished for PR 24207 at commit 4343ab2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

val maxSplitBytes = FilePartition.maxSplitBytes(sparkSession, selectedPartitions)
val splitFiles = selectedPartitions.flatMap { partition =>
partition.files.flatMap { file =>
partition.files.filter(_.getLen > 0).flatMap { file =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to remove this.

Copy link
Member Author

@gengliangwang gengliangwang Mar 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaxGekk Thanks. This is code change from #24227 and I will remove it once #24227 is merged.

@gengliangwang gengliangwang changed the title [WIP][SPARK-27271][SQL] Migrate Text to File Data Source V2 [SPARK-27271][SQL] Migrate Text to File Data Source V2 Mar 27, 2019
@gengliangwang
Copy link
Member Author

@cloud-fan @HyukjinKwon @dongjoon-hyun This is ready for review.

@SparkQA
Copy link

SparkQA commented Mar 27, 2019

Test build #104025 has finished for PR 24207 at commit c74dddd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Looks good too. Let me take a closer look if that's not merged soon before getting it in.

@gengliangwang gengliangwang changed the title [SPARK-27271][SQL] Migrate Text to File Data Source V2 [WIP][SPARK-27271][SQL] Migrate Text to File Data Source V2 Mar 30, 2019
@gengliangwang gengliangwang changed the title [WIP][SPARK-27271][SQL] Migrate Text to File Data Source V2 [SPARK-27271][SQL] Migrate Text to File Data Source V2 Mar 31, 2019
@gengliangwang
Copy link
Member Author

@cloud-fan @HyukjinKwon @dongjoon-hyun This is ready for review again.

@dongjoon-hyun
Copy link
Member

Thank you for updating, @gengliangwang . I'll take a look today.

@SparkQA
Copy link

SparkQA commented Mar 31, 2019

Test build #104141 has finished for PR 24207 at commit 1ac450c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

@gengliangwang . The PR looks correct to me. I left only a few minor comments.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too

@gengliangwang
Copy link
Member Author

I am working on another PR about file source V2 framework. I will revisit this one later.

@HyukjinKwon HyukjinKwon changed the title [SPARK-27271][SQL] Migrate Text to File Data Source V2 [WIP][SPARK-27271][SQL] Migrate Text to File Data Source V2 Apr 2, 2019
@gengliangwang gengliangwang changed the title [WIP][SPARK-27271][SQL] Migrate Text to File Data Source V2 [SPARK-27271][SQL] Migrate Text to File Data Source V2 Apr 8, 2019
@SparkQA
Copy link

SparkQA commented Apr 8, 2019

Test build #104381 has finished for PR 24207 at commit 66b893d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Copy link
Member Author

@dongjoon-hyun @HyukjinKwon @cloud-fan This PR is ready for review again.
Previously, this PR is was blocked by #24284 and #24296
Thanks in advance!

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Merged to master.

Thank you, @gengliangwang , @HyukjinKwon , @MaxGekk .

dongjoon-hyun pushed a commit that referenced this pull request Apr 25, 2020
…FileSuite

### What changes were proposed in this pull request?

 Add V1/V2 tests for TextSuite and WholeTextFileSuite

### Why are the changes needed?

This is missing part since #24207. We should have these tests for test coverage.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Unit tests.

Closes #28335 from gengliangwang/testV2Suite.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Apr 25, 2020
…FileSuite

### What changes were proposed in this pull request?

 Add V1/V2 tests for TextSuite and WholeTextFileSuite

### Why are the changes needed?

This is missing part since #24207. We should have these tests for test coverage.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Unit tests.

Closes #28335 from gengliangwang/testV2Suite.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 16b9615)
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants