@WweiL commented Jun 8, 2024

What changes were proposed in this pull request?

  1. In Spark Connect, when a streaming query name is not specified, its query.name should return None. Without this patch, it returns an empty string.
  2. In classic Spark, a streaming query's name cannot be set to an empty string. This check was missing in Spark Connect, so this patch adds it.
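
The two behaviors above can be sketched in plain Python, without a Spark session. This is only an illustration: `validate_query_name` and `name_from_response` are hypothetical helper names, the error message is made up, and a plain `ValueError` stands in for whatever exception type Spark actually raises. The condition itself is copied from the diff line quoted in the review below.

```python
from typing import Optional


def validate_query_name(query_name: object) -> str:
    """Hypothetical helper: reject a missing, non-str, or blank query name.

    The condition is the one quoted in this PR's diff; the exception type
    and message are assumptions made for this sketch.
    """
    if not query_name or type(query_name) != str or len(query_name.strip()) == 0:
        raise ValueError(f"queryName must be a non-empty string, got {query_name!r}")
    return query_name


def name_from_response(raw_name: str) -> Optional[str]:
    """Hypothetical helper: report an unset (empty) name as None."""
    return raw_name if raw_name else None
```

With these helpers, `name_from_response("")` yields `None` instead of an empty string, and `validate_query_name("")` or `validate_query_name("   ")` raises instead of silently accepting the name.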

Why are the changes needed?

Edge case handling.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added unit test.

Was this patch authored or co-authored using generative AI tooling?

No.

    partitionBy.__doc__ = PySparkDataStreamWriter.partitionBy.__doc__

    def queryName(self, queryName: str) -> "DataStreamWriter":
        if not queryName or type(queryName) != str or len(queryName.strip()) == 0:

Member

should probably use not isinstance(queryName, str)

Member

Ah, i see. it's consistent with spark classic. okie lgtm
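
For context on the two checks discussed in this thread: `type(x) != str` and `not isinstance(x, str)` only disagree for subclasses of `str`. A minimal sketch, where `TaggedName` is a hypothetical subclass invented for illustration and not anything from Spark:

```python
class TaggedName(str):
    """Hypothetical str subclass, used only to show where the checks differ."""


name = TaggedName("my_query")

# The exact-type comparison treats a str subclass as "not a str".
exact_type_rejects = type(name) != str

# isinstance accepts subclasses, so the same value passes.
isinstance_rejects = not isinstance(name, str)
```

Here `exact_type_rejects` is `True` while `isinstance_rejects` is `False`, so switching to `isinstance` would accept values that the exact-type check rejects. Keeping the exact-type check preserves parity with classic Spark, which is the reason given above for leaving it as is.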

@HyukjinKwon
Member

Merged to master.
