[SPARK-30510][SQL][DOCS] Publicly document Spark SQL configuration options #27459
Conversation
Test build #117859 has finished for PR 27459 at commit
nchammas left a comment
Thanks for the reference to #18702, @HyukjinKwon. It was very helpful in getting me started.
Review comments (resolved) on:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala (outdated)
sql/core/src/main/scala/org/apache/spark/sql/api/python/PythonSQLUtils.scala
Test build #117860 has finished for PR 27459 at commit
Test build #117862 has finished for PR 27459 at commit
Test build #117864 has finished for PR 27459 at commit
Retest this please.
gatorsmile left a comment
Could we have a test to ensure that internal SQLConf entries will not be added to the generated doc?
@nchammas Good work! To make it easier to review, could you attach the generated doc in the PR description?
Done.
Perhaps we should just add a test that confirms that one or two specific internal configs are not in the generated output. Or maybe, if the concern is strictly about the docs, the test should be against the newly added doc-generation code.
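To make the idea concrete, here is a minimal sketch of what such a check could look like. It is not the test added in this PR; the config names and the sample document string are purely illustrative placeholders.

```python
# Illustrative only: a check that known-internal config names never appear
# in the generated docs. The names and the sample string below are
# placeholders, not the actual test added in this PR.

INTERNAL_CONFIGS = [
    "spark.sql.some.internal.flag",      # hypothetical internal config
    "spark.sql.another.internal.knob",   # hypothetical internal config
]

def check_no_internal_configs(generated_doc: str) -> None:
    """Fail if any known-internal config key appears in the generated docs."""
    leaked = [key for key in INTERNAL_CONFIGS if key in generated_doc]
    assert not leaked, f"Internal configs leaked into the docs: {leaked}"

if __name__ == "__main__":
    # In a real test, this string would be read from the doc builder's output.
    sample_doc = "<tr><td><code>spark.sql.shuffle.partitions</code></td></tr>"
    check_no_internal_configs(sample_doc)
    print("No internal configs found in the generated docs.")
```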
Test build #117868 has finished for PR 27459 at commit
HyukjinKwon left a comment
Looks good otherwise.
I've updated the screenshot and attached HTML in the PR description to match the latest output.
Test build #118042 has finished for PR 27459 at commit
Retest this please.
Test build #118044 has finished for PR 27459 at commit
Test build #118048 has finished for PR 27459 at commit
Merged to master and branch-3.0. Thanks for working on this, @nchammas.
cc @rxin, @cloud-fan, @gatorsmile, @dongjoon-hyun, @srowen (whom I remember discussing this with). Now all external SQL configurations are documented automatically.
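For readers curious how such a generator works in outline: it boils down to listing each public config's name, default value, and description, and rendering them into the table that lands on the configuration page. The sketch below is self-contained and illustrative only; in the real setup the entries come from Spark's own config registry rather than a hard-coded list.

```python
# Self-contained sketch of a SQL config doc generator. The entries are
# hard-coded here purely for illustration; in the real setup they would
# be pulled from Spark itself.

SQL_CONFIGS = [
    # (name, default, description) -- illustrative values
    ("spark.sql.shuffle.partitions", "200",
     "The default number of partitions to use when shuffling data for joins or aggregations."),
    ("spark.sql.session.timeZone", "(value of local timezone)",
     "The ID of session local timezone."),
]

def generate_config_table(configs) -> str:
    """Render (name, default, description) tuples as an HTML table."""
    lines = ['<table class="table">',
             "<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>"]
    for name, default, doc in configs:
        lines.append(
            f"<tr><td><code>{name}</code></td><td>{default}</td><td>{doc}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)

if __name__ == "__main__":
    print(generate_config_table(SQL_CONFIGS))
```

Generating the table from a single listing is what keeps the docs from drifting out of sync with the code.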
[SPARK-30510][SQL][DOCS] Publicly document Spark SQL configuration options

### What changes were proposed in this pull request?
This PR adds a doc builder for Spark SQL's configuration options.

Here's what the new Spark SQL config docs look like ([configuration.html.zip](https://github.com/apache/spark/files/4172109/configuration.html.zip)). Compare this to the [current docs](http://spark.apache.org/docs/3.0.0-preview2/configuration.html#spark-sql).

### Why are the changes needed?
There is no visibility into the various Spark SQL configs on [the config docs page](http://spark.apache.org/docs/3.0.0-preview2/configuration.html#spark-sql).

### Does this PR introduce any user-facing change?
No, apart from new documentation.

### How was this patch tested?
I tested this manually by building the docs and reviewing them in my browser.

Closes #27459 from nchammas/SPARK-30510-spark-sql-options.

Authored-by: Nicholas Chammas <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 339c0f9)
Signed-off-by: HyukjinKwon <[email protected]>
If possible, could we add the version info to each SQLConf and include it in this doc too? cc @beliefer Are you willing to do this?
@gatorsmile Thanks for the ping. I will take a look.
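If that happens, the table in the sketch above could simply grow a "Since Version" column, assuming each config entry also exposes the Spark version it was introduced in. The version strings below are placeholders, not real metadata:

```python
# Sketch only: the illustrative generator from above, extended with a
# "Since Version" column. Assumes each entry also carries the Spark
# version it was added in; the version strings here are placeholders.

SQL_CONFIGS_WITH_VERSION = [
    ("spark.sql.shuffle.partitions", "200",
     "The default number of partitions to use when shuffling data for joins or aggregations.",
     "1.1.0"),   # placeholder version
    ("spark.sql.session.timeZone", "(value of local timezone)",
     "The ID of session local timezone.",
     "2.2.0"),   # placeholder version
]

def generate_config_table_with_version(configs) -> str:
    """Render (name, default, description, version) tuples as an HTML table."""
    lines = ['<table class="table">',
             "<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr>"]
    for name, default, doc, version in configs:
        lines.append(
            f"<tr><td><code>{name}</code></td><td>{default}</td>"
            f"<td>{doc}</td><td>{version}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)

if __name__ == "__main__":
    print(generate_config_table_with_version(SQL_CONFIGS_WITH_VERSION))
```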
### What changes were proposed in this pull request?
This PR makes the following refinements to the workflow for building docs:
* Install Python and Ruby consistently using pyenv and rbenv across both the docs README and the release Dockerfile.
* Pin the Python and Ruby versions we use.
* Pin all direct Python and Ruby dependency versions.
* Eliminate any use of `sudo pip`, which the Python community discourages, or `sudo gem`.

### Why are the changes needed?
This PR should increase the consistency and reproducibility of the doc-building process by managing Python and Ruby in a more consistent way, and by eliminating unused or outdated code. Here's a possible example of an issue building the docs that would be addressed by the changes in this PR: #27459 (comment)

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Manual tests:
* I was able to build the Docker image successfully, minus the final part about `RUN useradd`.
* I am unable to run `do-release-docker.sh` because I am not a committer and don't have the required GPG key.
* I built the docs locally and viewed them in the browser.

I think I need a committer to more fully test out these changes.

Closes #27534 from nchammas/SPARK-30731-building-docs.

Authored-by: Nicholas Chammas <[email protected]>
Signed-off-by: Sean Owen <[email protected]>