Skip to content

Conversation

@squito
Copy link
Contributor

@squito squito commented Oct 12, 2018

JVMs don't you allocate arrays of length exactly Int.MaxValue, so leave
a little extra room. This is necessary when reading blocks >2GB off
the network (for remote reads or for cache replication).

Unit tests via jenkins, ran a test with blocks over 2gb on a cluster

JVMs don't you allocate arrays of length exactly Int.MaxValue, so leave
a little extra room.  This is necessary when reading blocks >2GB off
the network (for remote reads or for cache replication).
@SparkQA
Copy link

SparkQA commented Oct 12, 2018

Test build #97290 has finished for PR 22705 at commit cb07bad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// Though in theory you should be able to index into an array of size Int.MaxValue, in practice
// jvms don't let you go up to limit. It seems you may only need - 2, but we leave a little
// extra room.
val maxArraySize = Int.MaxValue - 512
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the max is safely more like Int.MaxValue - 5 from things like https://www.quora.com/What-is-the-maximum-size-of-the-array-in-Java You could probably push it further, but whatever.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already had seen the similar problem in other places. Can we use this value here, too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great suggestion, thanks, just updated

@SparkQA
Copy link

SparkQA commented Oct 15, 2018

Test build #97399 has finished for PR 22705 at commit c811476.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito squito changed the title [SPARK-25704][CORE][WIP] Allocate a bit less than Int.MaxValue [SPARK-25704][CORE] Allocate a bit less than Int.MaxValue Oct 15, 2018
@squito
Copy link
Contributor Author

squito commented Oct 16, 2018

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented Oct 16, 2018

Test build #97423 has finished for PR 22705 at commit c811476.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member

kiszk commented Oct 16, 2018

retest this please

@SparkQA
Copy link

SparkQA commented Oct 16, 2018

Test build #97465 has finished for PR 22705 at commit c811476.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member

kiszk commented Oct 17, 2018

LGTM

asfgit pushed a commit that referenced this pull request Oct 19, 2018
JVMs don't you allocate arrays of length exactly Int.MaxValue, so leave
a little extra room.  This is necessary when reading blocks >2GB off
the network (for remote reads or for cache replication).

Unit tests via jenkins, ran a test with blocks over 2gb on a cluster

Closes #22705 from squito/SPARK-25704.

Authored-by: Imran Rashid <[email protected]>
Signed-off-by: Imran Rashid <[email protected]>
@squito
Copy link
Contributor Author

squito commented Oct 19, 2018

merged to master / 2.4.

Something went wrong w/ the merge script when putting it on 2.4 (I think I accidentally had a dirty working dir at the time), but it was actually a clean cherry pick, so I pushed it manually here:
1001d23

@asfgit asfgit closed this in 43717de Oct 19, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
JVMs don't you allocate arrays of length exactly Int.MaxValue, so leave
a little extra room.  This is necessary when reading blocks >2GB off
the network (for remote reads or for cache replication).

Unit tests via jenkins, ran a test with blocks over 2gb on a cluster

Closes apache#22705 from squito/SPARK-25704.

Authored-by: Imran Rashid <[email protected]>
Signed-off-by: Imran Rashid <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants