Skip to content

Conversation

@jinxing64
Copy link

What changes were proposed in this pull request?

This is related to #19330.
As subtask of SPARK-18134, this PR proposes to add a functionality in MapType to sort MapData, in which all entries are already sorted.

How was this patch tested?

Tests added.

@jinxing64
Copy link
Author

@maropu
I want to split #19330 to two parts:

  1. Approach to compare two Maps with themselves already sorted. (This PR)
  2. Approach to sort the entries of a single Map -- SortMaps (A following PR)

Please take a look when you have time.

}

private[this] class OrderedWrapper {
var isOrdered: Boolean = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer not to make this mutable if we can. That can be a source of some pretty weird errors if we move from an unordered to an ordered map. Why do you need this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for quick reply :)

Actually I'm not pretty sure about this.
If we do it like below

case class MapType(
  keyType: DataType,
  valueType: DataType,
  valueContainsNull: Boolean,
  ordered: Boolean)

The ordered will be spread to lots places in the code (especially in the ...match ... case ... ) and users can will also see it. But I think ordered is a pretty internal parameter/characteristic and only used when sorting map. So I try to make it private and lazy created.

@SparkQA
Copy link

SparkQA commented Oct 13, 2018

Test build #97339 has finished for PR 22712 at commit f8dd1a6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

case m: MapType if m.isOrdered && order.direction == Ascending =>
m.interpretedOrdering.asInstanceOf[Ordering[Any]].compare(left, right)
case m: MapType if m.isOrdered && order.direction == Descending =>
m.interpretedOrdering.asInstanceOf[Ordering[Any]].reverse.compare(left, right)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to care about this ordering direction? We just need comparable maps?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is not necessary, but just to make the logic complete.
#9718 did the same thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean we can't remove this? If not necessary, better to remove it off.

@maropu
Copy link
Member

maropu commented Oct 15, 2018

BTW, how does hive implement comparable maps?

@jinxing64
Copy link
Author

jinxing64 commented Oct 16, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants