-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-5282][mllib]: RowMatrix easily gets int overflow in the memory size warning #4069
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #25657 has started for PR 4069 at commit
|
|
Test build #25657 has finished for PR 4069 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's minor, but I think it might be simpler to report megabytes in the warning message, since the reported size is at least 800MB. Then this can be simply val memMB = (cols.toLong * cols) / 125000; that multiplication can't overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely I can do it. I was worrying some one may ask for the accurate number in the PR~~ :)
Another thing is, /125000 might should be changed to >> 15?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>> 17 ? a mebibyte is 2^20 bytes so that would make sense. I don't know what the overall stance on mega vs mebibyte is in the nomenclature here but think everything is megabytes, in which case / 125000 is right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, 17...I'll send update tomorrow morning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this 1000/1024 thing is why they send actual bytes in the beginning... :)
|
Test build #25700 has started for PR 4069 at commit
|
|
Test build #25700 has finished for PR 4069 at commit
|
|
Test PASSed. |
|
@srowen Would you mind to take another look? Thanks |
|
LGTM |
|
Thanks. |
… size warning JIRA: https://issues.apache.org/jira/browse/SPARK-5282 fix the possible int overflow in the memory computation warning Author: Yuhao Yang <[email protected]> Closes #4069 from hhbyyh/addscStop and squashes the following commits: e54e5c8 [Yuhao Yang] change to MB based number 7afac23 [Yuhao Yang] 5282: fix int overflow in the warning (cherry picked from commit 4432568) Signed-off-by: Xiangrui Meng <[email protected]>
|
Merged into master and branch-1.2. Thanks! |
JIRA: https://issues.apache.org/jira/browse/SPARK-5282
fix the possible int overflow in the memory computation warning