[SPARK-40229][PS][TEST] Re-enable excel I/O test for pandas API on Spark #37671
The I failed to reproduce this error although I use the same version of related envs (e.g. Let me leave the re-enabling |
|
Just created a ticket for re-enabling the |
|
cc @HyukjinKwon FYI |
.github/workflows/build_and_test.yml
Outdated
    - name: Install Python packages (Python 3.9, PyPy3)
      run: |
        # To test excel I/O for pandas API on Spark.
        python3.9 -m pip install openpyxl
can we add this into Dockerfile? https://github.com/apache/spark/blob/master/dev/infra/Dockerfile
Sounds good! Just moved
|
cc @xinrong-meng FYI (I will be off for the rest of this week) |
    }

    @unittest.skip("openpyxl")
    def test_to_excel(self):
Actually, PyPy is not tested with pandas API on Spark, per https://github.com/apache/spark/blob/master/dev/sparktestsupport/modules.py#L663-L667, because pyarrow, etc. are not available with PyPy. We should probably document it somewhere, but let's do that separately.
Thanks, good to know!
|
Merged to master. |
### What changes were proposed in this pull request?
This is a follow-up of #37671.

### Why are the changes needed?
Since #37671 added `openpyxl` for PySpark test environments and re-enabled `test_to_excel` test, we need to add it to `requirements.txt` as PySpark test dependency explicitly.

### Does this PR introduce _any_ user-facing change?
No. This is a test dependency.

### How was this patch tested?
Manually.

Closes #38425 from dongjoon-hyun/SPARK-40229.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
What changes were proposed in this pull request?
This PR proposes to install `openpyxl` for PySpark test environments to re-enable the `to_excel` tests.
Why are the changes needed?
For better test coverage.
Does this PR introduce any user-facing change?
No, it's test-only.
How was this patch tested?
By enabling the existing skipped tests related to `openpyxl`.
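For context, the unconditional `@unittest.skip("openpyxl")` shown in the review can be contrasted with a skip that depends on whether `openpyxl` is importable. The following is only a minimal sketch of that general pattern, not the code merged in this PR: the `have_openpyxl` flag, the `ExcelIOTest` class, and the `run_suite` helper are hypothetical names, and the test body uses `openpyxl` directly rather than the pandas API on Spark.

```python
import importlib.util
import unittest

# Hypothetical flag: True when openpyxl can be imported in this environment.
have_openpyxl = importlib.util.find_spec("openpyxl") is not None


class ExcelIOTest(unittest.TestCase):
    # Skip only when the dependency is missing, instead of skipping always.
    @unittest.skipUnless(have_openpyxl, "openpyxl is not installed")
    def test_to_excel(self):
        import openpyxl

        # Build a tiny in-memory workbook; a stand-in for a real to_excel check.
        workbook = openpyxl.Workbook()
        sheet = workbook.active
        sheet.append(["a", "b"])
        sheet.append([1, 2])
        self.assertEqual(sheet.max_row, 2)


def run_suite():
    # Hypothetical helper that runs the test case and returns the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExcelIOTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

With this shape, the test runs wherever `openpyxl` is installed (for example, in an image that installs it) and is reported as skipped elsewhere, rather than being disabled everywhere.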