@tustvold
Contributor

Which issue does this PR close?

Closes #8540

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@tustvold tustvold added the api change Changes the API exposed to users of the crate label Dec 17, 2023
@github-actions github-actions bot added the core Core DataFusion crate label Dec 17, 2023
}

#[tokio::test]
async fn test_with_lost_ordering_unbounded() -> Result<()> {
Contributor Author

I couldn't see a reason to keep these as they effectively duplicate the above

Contributor

These tests make sure that when sources are unbounded, we try to preserve the existing ordering as much as possible. I will re-add corresponding tests that enforce this functionality. For this PR, I just ignore these tests.

" CsvExec: file_groups={1 group: [[file_path]]}, projection=[a, c, d], output_ordering=[a@0 ASC NULLS LAST], has_header=true",
];
assert_optimized!(expected_input, expected_optimized, physical_plan);
assert_optimized!(expected_input, expected_optimized, physical_plan, true);
Contributor Author

@tustvold tustvold Dec 17, 2023

#8572 tracks making this the default behaviour

@github-actions github-actions bot added the substrait Changes to the substrait crate label Dec 17, 2023
Contributor

@metesynnada metesynnada left a comment

Overall LGTM. Just a minor issue on tests.

Contributor

I think the test logic may depend on infinite sources, so we could use an unbounded executor like StreamingTableExec to preserve it.

Contributor Author

@tustvold tustvold Dec 18, 2023

It needs either infinite sources or prefer_existing_sort; I switched it to the latter.
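
For readers following along, the condition described in this comment can be sketched as below. This is only an illustration under stated assumptions, not code from this PR or from DataFusion: `PlanContext` and `should_preserve_ordering` are hypothetical names, and the real optimizer rule considers more than these two flags.

```rust
/// Hypothetical stand-in for the inputs discussed in this thread.
/// These are illustrative names, not DataFusion types.
#[derive(Debug, Clone, Copy)]
struct PlanContext {
    /// True when the input source never finishes (an "infinite" source).
    source_unbounded: bool,
    /// Mirrors the `prefer_existing_sort` option mentioned in the comment.
    prefer_existing_sort: bool,
}

/// Returns true when the existing ordering should be kept rather than
/// discarded and re-established later with a sort.
/// Assumption: either flag alone is sufficient, as the comment states.
fn should_preserve_ordering(ctx: PlanContext) -> bool {
    ctx.source_unbounded || ctx.prefer_existing_sort
}

fn main() {
    let bounded_default = PlanContext {
        source_unbounded: false,
        prefer_existing_sort: false,
    };
    // Bounded source with the option enabled: what the test was switched to.
    let bounded_prefer = PlanContext {
        prefer_existing_sort: true,
        ..bounded_default
    };
    // Unbounded source with the option left at its default.
    let unbounded = PlanContext {
        source_unbounded: true,
        ..bounded_default
    };

    for ctx in [bounded_default, bounded_prefer, unbounded] {
        println!("{:?} -> preserve = {}", ctx, should_preserve_ordering(ctx));
    }
}
```

Under this sketch, a test that exercises the order-preserving path on a bounded CSV source has to enable the option, since the source flag alone no longer triggers it.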

.await?;
let path = format!("{testdata}/csv/aggregate_test_100.csv");

match infinite {
Contributor

👍🏻

Contributor

@mustafasrepo mustafasrepo left a comment

LGTM! Thanks @tustvold for this cleanup.

Labels

api change: Changes the API exposed to users of the crate
core: Core DataFusion crate
substrait: Changes to the substrait crate

Development

Successfully merging this pull request may close these issues.

Remove ListingTable and FileScanConfig Unbounded

3 participants