Conversation

@dnskr (Contributor) commented Apr 26, 2023

Why are the changes needed?

The PR is needed to configure the default properties used by Apache Spark as the query engine.

The PR also changes the values.yaml file structure:

# APIs for connectivity and interoperation between supported clients and Kyuubi server
api:
  # Thrift Binary protocol (HiveServer2 compatible)
  thriftBinary:
    ...

# Kyuubi server configuration
server:
  replicas: 2
  ...

# Query engines
engine:
  # Apache Spark default configuration
  spark:
    ...
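
For illustration, a sketch of how values under the engine.spark block might be rendered into spark-defaults.conf by a chart template; the ConfigMap name and the image.repository/image.tag keys are assumptions for this sketch, not necessarily part of the PR:

# Hypothetical ConfigMap template rendering engine.spark values into spark-defaults.conf
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-spark-defaults
data:
  spark-defaults.conf: |
    # image.repository and image.tag are assumed value keys for this sketch
    spark.kubernetes.container.image={{ .Values.engine.spark.image.repository }}:{{ .Values.engine.spark.image.tag }}
    spark.kubernetes.container.image.pullPolicy={{ .Values.engine.spark.image.pullPolicy }}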

How was this patch tested?

  • Add test cases that check the changes thoroughly, including negative and positive cases where possible

  • Add screenshots for manual tests if appropriate

  • Run tests locally before making a pull request

@codecov-commenter

Codecov Report

Merging #4776 (be227cd) into master (b7012aa) will decrease coverage by 0.03%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##             master    #4776      +/-   ##
============================================
- Coverage     57.99%   57.96%   -0.03%     
  Complexity       13       13              
============================================
  Files           581      581              
  Lines         32431    32431              
  Branches       4309     4309              
============================================
- Hits          18807    18799       -8     
- Misses        11820    11827       +7     
- Partials       1804     1805       +1     

see 6 files with indirect coverage changes


@pan3793 (Member) commented Apr 27, 2023

Seems we had such an idea about the structure of values.yaml, but decided to reject it.

In practice, if Spark uses HDFS as storage and HMS as the metastore, the user typically should provide hive-site.xml, core-site.xml, hdfs-site.xml, etc. under HADOOP_CONF_DIR, which would be shared by both the Kyuubi server and the Spark engine (other engines may require it too).
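
For example, a minimal sketch of how such files could be shared with the server pod via a ConfigMap; the ConfigMap name, mount path, and value keys here are illustrative assumptions, not part of this chart:

# Hypothetical values.yaml entry
hadoopConf:
  configMap: hadoop-conf   # ConfigMap holding hive-site.xml, core-site.xml, hdfs-site.xml
  dir: /opt/hadoop/conf    # mount point, exported as HADOOP_CONF_DIR

# Hypothetical pod spec fragment rendered by the chart
env:
  - name: HADOOP_CONF_DIR
    value: /opt/hadoop/conf
volumeMounts:
  - name: hadoop-conf
    mountPath: /opt/hadoop/conf
volumes:
  - name: hadoop-conf
    configMap:
      name: hadoop-conf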

The review comments below are attached to these diff lines:

spark.kubernetes.container.image.pullPolicy={{ .Values.engine.spark.image.pullPolicy }}
spark.kubernetes.container.image.pullSecrets={{ range .Values.imagePullSecrets }}{{ print .name "," }}{{ end }}

### Driver resources
@pan3793 (Member) commented Apr 27, 2023

IMO this is kind of over-engineered.

One advantage of Kyuubi is that it almost transparently supports all Spark features, so users who are familiar with Spark should find it easy to understand how Kyuubi works and how to configure the Spark engine.

Member:

Seems we can put everything from spark-defaults.conf into a sparkDefaults block in values.yaml, using Spark's native configuration format.
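
A minimal sketch of what that could look like, assuming a single sparkDefaults text block in values.yaml that the chart writes verbatim into spark-defaults.conf (the keys and property values are illustrative):

# Hypothetical values.yaml fragment
sparkDefaults: |
  spark.kubernetes.container.image=apache/spark:3.4.1
  spark.kubernetes.container.image.pullPolicy=IfNotPresent
  spark.executor.instances=2
  spark.executor.memory=2g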

@dnskr (Contributor, Author):

You are right, it is an over-engineered implementation, and it might confuse users about what to configure and where.
I would like to find a balance between convenient basic configuration and flexibility, but obviously this is not the best implementation.

@dnskr (Contributor, Author) commented May 10, 2023

Thanks for the comments!
These are experimental changes and they are not fully working, so I created the PR as a draft. Apologies for the confusion and for the delayed response.

Seems we had such an idea about the structure of values.yaml, but decided to reject it.

Right, we discussed it here. I'll continue with the flat structure in a separate PR. As I mentioned above, this is more of an experimental PR to track my attempts and demo a different approach.

In practice, if Spark uses HDFS as storage and HMS as the metastore, the user typically should provide hive-site.xml, core-site.xml, hdfs-site.xml, etc. under HADOOP_CONF_DIR, which would be shared by both the Kyuubi server and the Spark engine (other engines may require it too).

Got it! I'll add these files as well. Am I right that there is no default HADOOP_CONF_DIR path in the Kyuubi server or the Kyuubi Docker image? If not, could you please suggest how to set it in the chart (add an env variable, a property, etc.)?

@dnskr (Contributor, Author) commented Jan 9, 2024

Closed in favor of #5934

@dnskr closed this Jan 9, 2024
