Skip to content

Conversation

@nchammas
Copy link
Contributor

@nchammas nchammas commented Feb 4, 2020

What changes were proposed in this pull request?

This PR adds a doc builder for Spark SQL's configuration options.

Here's what the new Spark SQL config docs look like (configuration.html.zip):

Screen Shot 2020-02-07 at 12 13 23 PM

Compare this to the current docs:

Screen Shot 2020-02-04 at 4 55 10 PM

Why are the changes needed?

There is no visibility into the various Spark SQL configs on the config docs page.

Does this PR introduce any user-facing change?

No, apart from new documentation.

How was this patch tested?

I tested this manually by building the docs and reviewing them in my browser.

@SparkQA
Copy link

SparkQA commented Feb 4, 2020

Test build #117859 has finished for PR 27459 at commit 7742fc1.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor Author

@nchammas nchammas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reference to #18702, @HyukjinKwon. It was very helpful in getting me started.

@nchammas nchammas requested a review from HyukjinKwon February 4, 2020 21:52
@SparkQA
Copy link

SparkQA commented Feb 4, 2020

Test build #117860 has finished for PR 27459 at commit a3b6d19.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 4, 2020

Test build #117862 has finished for PR 27459 at commit b80e6f2.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 5, 2020

Test build #117864 has finished for PR 27459 at commit 914630f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@HyukjinKwon HyukjinKwon changed the title [SPARK-30510][SQL][Docs] Publicly document Spark SQL configuration options [SPARK-30510][SQL][DOCS] Publicly document Spark SQL configuration options Feb 5, 2020
Copy link
Member

@gatorsmile gatorsmile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have a test to ensure the internal SQLConf will not be added to the generated doc?

@gatorsmile
Copy link
Member

@nchammas Good work! To make it easier to review, could you attach the generated doc in the PR description?

@nchammas
Copy link
Contributor Author

nchammas commented Feb 5, 2020

@gatorsmile

To make it easier to review, could you attach the generated doc in the PR description?

Done.

@nchammas
Copy link
Contributor Author

nchammas commented Feb 5, 2020

Could we have a test to ensure the internal SQLConf will not be added to the generated doc?

Perhaps we should just add a test that confirms that one or two specific internal configs are not in the output of SQLConf.getAllDefinedConfs().

Or maybe, if the concern is strictly about the docs, the test should be against the newly added _list_sql_configs()?

@SparkQA
Copy link

SparkQA commented Feb 5, 2020

Test build #117868 has finished for PR 27459 at commit 914630f.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good otherwise.

@nchammas
Copy link
Contributor Author

nchammas commented Feb 7, 2020

I've updated the screenshot and attached HTML in the PR description to match the latest output.

@SparkQA
Copy link

SparkQA commented Feb 7, 2020

Test build #118042 has finished for PR 27459 at commit 4c32cf2.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@nchammas
Copy link
Contributor Author

nchammas commented Feb 7, 2020

Retest this please.

@SparkQA
Copy link

SparkQA commented Feb 7, 2020

Test build #118044 has finished for PR 27459 at commit 4c32cf2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 8, 2020

Test build #118048 has finished for PR 27459 at commit b08bac4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Merged to master and branch-3.0.

Thanks for working on this, @nchammas

@HyukjinKwon
Copy link
Member

HyukjinKwon commented Feb 9, 2020

cc @rxin, @cloud-fan, @gatorsmile, @dongjoon-hyun, @srowen (who I remember I talked about this). Now all external SQL configurations are documented automatically.

HyukjinKwon pushed a commit that referenced this pull request Feb 9, 2020
…tions

### What changes were proposed in this pull request?

This PR adds a doc builder for Spark SQL's configuration options.

Here's what the new Spark SQL config docs look like ([configuration.html.zip](https://github.com/apache/spark/files/4172109/configuration.html.zip)):

![Screen Shot 2020-02-07 at 12 13 23 PM](https://user-images.githubusercontent.com/1039369/74050007-425b5480-49a3-11ea-818c-42700c54d1fb.png)

Compare this to the [current docs](http://spark.apache.org/docs/3.0.0-preview2/configuration.html#spark-sql):

![Screen Shot 2020-02-04 at 4 55 10 PM](https://user-images.githubusercontent.com/1039369/73790828-24a5a980-476f-11ea-998c-12cd613883e8.png)

### Why are the changes needed?

There is no visibility into the various Spark SQL configs on [the config docs page](http://spark.apache.org/docs/3.0.0-preview2/configuration.html#spark-sql).

### Does this PR introduce any user-facing change?

No, apart from new documentation.

### How was this patch tested?

I tested this manually by building the docs and reviewing them in my browser.

Closes #27459 from nchammas/SPARK-30510-spark-sql-options.

Authored-by: Nicholas Chammas <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 339c0f9)
Signed-off-by: HyukjinKwon <[email protected]>
@nchammas nchammas deleted the SPARK-30510-spark-sql-options branch February 10, 2020 01:27
@gatorsmile
Copy link
Member

If possible, we can add the version info in each SQLConf and add it into this doc too?

cc @beliefer Are you willing to do this?

@beliefer
Copy link
Contributor

@gatorsmile Thanks for your call. I will take a look.

srowen pushed a commit that referenced this pull request Mar 7, 2020
### What changes were proposed in this pull request?

This PR makes the following refinements to the workflow for building docs:
* Install Python and Ruby consistently using pyenv and rbenv across both the docs README and the release Dockerfile.
* Pin the Python and Ruby versions we use.
* Pin all direct Python and Ruby dependency versions.
* Eliminate any use of `sudo pip`, which the Python community discourages, or `sudo gem`.

### Why are the changes needed?

This PR should increase the consistency and reproducibility of the doc-building process by managing Python and Ruby in a more consistent way, and by eliminating unused or outdated code.

Here's a possible example of an issue building the docs that would be addressed by the changes in this PR: #27459 (comment)

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manual tests:
* I was able to build the Docker image successfully, minus the final part about `RUN useradd`.
    * I am unable to run `do-release-docker.sh` because I am not a committer and don't have the required GPG key.
* I built the docs locally and viewed them in the browser.

I think I need a committer to more fully test out these changes.

Closes #27534 from nchammas/SPARK-30731-building-docs.

Authored-by: Nicholas Chammas <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
…tions

### What changes were proposed in this pull request?

This PR adds a doc builder for Spark SQL's configuration options.

Here's what the new Spark SQL config docs look like ([configuration.html.zip](https://github.com/apache/spark/files/4172109/configuration.html.zip)):

![Screen Shot 2020-02-07 at 12 13 23 PM](https://user-images.githubusercontent.com/1039369/74050007-425b5480-49a3-11ea-818c-42700c54d1fb.png)

Compare this to the [current docs](http://spark.apache.org/docs/3.0.0-preview2/configuration.html#spark-sql):

![Screen Shot 2020-02-04 at 4 55 10 PM](https://user-images.githubusercontent.com/1039369/73790828-24a5a980-476f-11ea-998c-12cd613883e8.png)

### Why are the changes needed?

There is no visibility into the various Spark SQL configs on [the config docs page](http://spark.apache.org/docs/3.0.0-preview2/configuration.html#spark-sql).

### Does this PR introduce any user-facing change?

No, apart from new documentation.

### How was this patch tested?

I tested this manually by building the docs and reviewing them in my browser.

Closes apache#27459 from nchammas/SPARK-30510-spark-sql-options.

Authored-by: Nicholas Chammas <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
### What changes were proposed in this pull request?

This PR makes the following refinements to the workflow for building docs:
* Install Python and Ruby consistently using pyenv and rbenv across both the docs README and the release Dockerfile.
* Pin the Python and Ruby versions we use.
* Pin all direct Python and Ruby dependency versions.
* Eliminate any use of `sudo pip`, which the Python community discourages, or `sudo gem`.

### Why are the changes needed?

This PR should increase the consistency and reproducibility of the doc-building process by managing Python and Ruby in a more consistent way, and by eliminating unused or outdated code.

Here's a possible example of an issue building the docs that would be addressed by the changes in this PR: apache#27459 (comment)

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manual tests:
* I was able to build the Docker image successfully, minus the final part about `RUN useradd`.
    * I am unable to run `do-release-docker.sh` because I am not a committer and don't have the required GPG key.
* I built the docs locally and viewed them in the browser.

I think I need a committer to more fully test out these changes.

Closes apache#27534 from nchammas/SPARK-30731-building-docs.

Authored-by: Nicholas Chammas <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants