Skip to content

Conversation

zhengruifeng
Copy link
Contributor

What changes were proposed in this pull request?

Add argument trim for functionstrim/ltrim/rtrim

Why are the changes needed?

this argument is missing in PySpark: we can specify the it in scala side but cannot do it in python.

Does this PR introduce any user-facing change?

yes, new argument supported

How was this patch tested?

added doctests

Was this patch authored or co-authored using generative AI tooling?

no

return _invoke_function_over_columns("ltrim", col)
def ltrim(col: "ColumnOrName", trim: Optional["ColumnOrName"] = None) -> Column:
if trim is not None:
return _invoke_function_over_columns("ltrim", trim, col)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is kind of confusing on the arg order here:
1, connect: _invoke_function_over_columns("ltrim", trim, col) -> follow the order in constructor

def this(trimStr: Expression, srcStr: Expression) = this(srcStr, Option(trimStr))

2, classic: _invoke_function_over_columns("ltrim", col, trim) follows the signature in scala functions

  def ltrim(e: Column, trimString: String): Column
  def ltrim(e: Column, trim: Column): Column

Copy link
Contributor Author

@zhengruifeng zhengruifeng Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably we can unify the function invocations of both connect and classic in some way

@zhengruifeng
Copy link
Contributor Author

thanks, merged to master

@zhengruifeng zhengruifeng deleted the func_trim_str branch October 8, 2024 06:55
* @group string_funcs
* @since 4.0.0
*/
def ltrim(e: Column, trim: Column): Column = Column.fn("ltrim", trim, e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is trim the expression for the trimString? Looks a bit weird to call this parameter trim. It's also not consistent with the function doc of expression StringTrim, which names this parameter as trimStr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortanately, the argument names were already not consistent between python and scala, in many functions.

in this PR I use trim to be consistent with python side btrim

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def btrim(str: "ColumnOrName", trim: Optional["ColumnOrName"] = None) -> Column:

himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
…rim`

### What changes were proposed in this pull request?
Add argument `trim` for functions`trim/ltrim/rtrim`

### Why are the changes needed?
this argument is missing in PySpark: we can specify the it in scala side but cannot do it in python.

### Does this PR introduce _any_ user-facing change?
yes, new argument supported

### How was this patch tested?
added doctests

### Was this patch authored or co-authored using generative AI tooling?
no

Closes apache#48363 from zhengruifeng/func_trim_str.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants