Parquet read failing test case #38
Thank you for the report. I'm afraid I am away from a computer for the next couple of days, but I'll take a look when I get back. That being said, my hunch is that the test is panicking somewhere, and this is getting lost as the log facade isn't initialised. Perhaps you might test this out? (As an aside, I'll add fixing the lack of panic guards to my list.)
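To make the "panic guard" idea concrete, here is a minimal sketch using only the standard library. It assumes the work runs on a spawned thread; `run_guarded` is a hypothetical helper invented for illustration and is not part of this project. The point is only that a panic on a worker can be turned into an error value the caller sees, rather than relying on a logger that may not be initialised.

```rust
use std::thread;

/// Hypothetical helper (not from this project): runs `f` on a worker thread
/// and converts a panic on that thread into an `Err` with the panic message.
pub fn run_guarded<T, F>(f: F) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    // `join` returns `Err(payload)` if the spawned thread panicked.
    thread::spawn(f).join().map_err(|payload| {
        // Panic payloads are usually a `&str` or a `String`.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}
```

A test could then assert on the returned `Result` so that a panic fails loudly instead of looking like "no data was produced".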
Yeah, I can take a look.
K, I think it's fixed. We have to treat
Hmm... That should just work... Something isn't quite right here... FWIW, CoalesceBatches is not a repartition operation; it stitches batches together within a partition, so this PR as it stands will change the behaviour of the plan.
K, looked some more and I think the issue is that the output channel closes as soon as any output partition finishes. Pushed a change that tracks the active output partitions and closes the output channel only once all partitions are done.
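For readers following along, the idea of that change can be sketched roughly as follows. This is not the code from the PR: `OutputTracker` and its methods are hypothetical names, and a `std::sync::mpsc` channel stands in for whatever channel the scheduler actually uses. The sketch just shows a counter of active partitions, with the sender dropped (closing the channel) only when the count reaches zero.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical tracker (illustration only): keeps the output channel open
/// until every output partition has reported that it is finished.
pub struct OutputTracker<T> {
    active_partitions: usize,
    sender: Option<Sender<T>>,
}

impl<T> OutputTracker<T> {
    /// Creates a tracker for `partitions` output partitions plus the receiver.
    pub fn new(partitions: usize) -> (Self, Receiver<T>) {
        let (tx, rx) = channel();
        // With zero partitions there is nothing to wait for, so start closed.
        let sender = if partitions == 0 { None } else { Some(tx) };
        (Self { active_partitions: partitions, sender }, rx)
    }

    /// Forwards an item; returns false if the channel is already closed.
    pub fn send(&self, item: T) -> bool {
        match &self.sender {
            Some(tx) => tx.send(item).is_ok(),
            None => false,
        }
    }

    /// Marks one partition as done. The channel is closed only when the last
    /// active partition finishes. Returns true if the channel is now closed.
    pub fn partition_finished(&mut self) -> bool {
        self.active_partitions = self.active_partitions.saturating_sub(1);
        if self.active_partitions == 0 {
            // Dropping the only sender closes the channel for the receiver.
            self.sender = None;
        }
        self.sender.is_none()
    }
}
```

The buggy behaviour described above corresponds to dropping the sender on the first `partition_finished` call, which would cut off any partitions still producing data.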
Ahah! Yeah, that would do it, thanks for investigating 👍
I fixed this as part of supporting partitioned execution, see apache@505e880. In particular, a reduced version of this test case can be found at apache@505e880#diff-0005c0590888cdd2c7efd378972ade4f764ec75f4c62e009eb863e4a2bef99f9R378 (line R378 of that diff). Thanks again for your help in diagnosing this issue 🏅
Looks like parquet sources don't produce any data when scheduled. This is a test case which reproduces the problem.