-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-25476][SPARK-25510][TEST] Refactor AggregateBenchmark and add a new trait to better support Dataset and DataFrame API #22484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
12 commits
Select commit
Hold shift + click to select a range
649f296
Refactor AggregateBenchmark
wangyum e0c6d09
Merge remote-tracking branch 'upstream/master' into SPARK-25476
wangyum 42230b6
Add RunBenchmarkWithCodegen
wangyum 2d778a4
Rename RunBenchmarkWithCodegen to SqlBasedBenchmark
wangyum 536d4e9
Add withSQLConf to SqlBasedBenchmark
wangyum f783099
true.toString -> "true" and false.toString -> "false"
wangyum d5fecdc
Fix indentation
wangyum 23f1bd8
Merge remote-tracking branch 'upstream/master' into SPARK-25476
wangyum 0fae54d
use SQLHelper
wangyum 45add92
Update result (#12)
dongjoon-hyun 3439992
Fix string variable
wangyum 6c46ad5
Address comments
wangyum File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,143 @@ | ||
| ================================================================================================ | ||
| aggregate without grouping | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| agg w/o group: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| agg w/o group wholestage off 65374 / 70665 32.1 31.2 1.0X | ||
| agg w/o group wholestage on 1178 / 1209 1779.8 0.6 55.5X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| stat functions | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| stddev: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| stddev wholestage off 8667 / 8851 12.1 82.7 1.0X | ||
| stddev wholestage on 1266 / 1273 82.8 12.1 6.8X | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| kurtosis: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| kurtosis wholestage off 41218 / 41231 2.5 393.1 1.0X | ||
| kurtosis wholestage on 1347 / 1357 77.8 12.8 30.6X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| aggregate with linear keys | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| Aggregate w keys: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| codegen = F 9309 / 9389 9.0 111.0 1.0X | ||
| codegen = T hashmap = F 4417 / 4435 19.0 52.7 2.1X | ||
| codegen = T hashmap = T 1289 / 1298 65.1 15.4 7.2X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| aggregate with randomized keys | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| Aggregate w keys: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| codegen = F 11424 / 11426 7.3 136.2 1.0X | ||
| codegen = T hashmap = F 6441 / 6496 13.0 76.8 1.8X | ||
| codegen = T hashmap = T 2333 / 2344 36.0 27.8 4.9X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| aggregate with string key | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| Aggregate w string key: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| codegen = F 4751 / 4890 4.4 226.5 1.0X | ||
| codegen = T hashmap = F 3146 / 3182 6.7 150.0 1.5X | ||
| codegen = T hashmap = T 2211 / 2261 9.5 105.4 2.1X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| aggregate with decimal key | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| Aggregate w decimal key: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| codegen = F 3029 / 3062 6.9 144.4 1.0X | ||
| codegen = T hashmap = F 1534 / 1569 13.7 73.2 2.0X | ||
| codegen = T hashmap = T 575 / 578 36.5 27.4 5.3X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| aggregate with multiple key types | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| Aggregate w multiple keys: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| codegen = F 7506 / 7521 2.8 357.9 1.0X | ||
| codegen = T hashmap = F 4791 / 4808 4.4 228.5 1.6X | ||
| codegen = T hashmap = T 3553 / 3585 5.9 169.4 2.1X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| max function bytecode size of wholestagecodegen | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| max function bytecode size: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| codegen = F 608 / 656 1.1 927.1 1.0X | ||
| codegen = T hugeMethodLimit = 10000 402 / 419 1.6 613.5 1.5X | ||
| codegen = T hugeMethodLimit = 1500 616 / 619 1.1 939.9 1.0X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| cube | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| cube: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| cube wholestage off 3229 / 3237 1.6 615.9 1.0X | ||
| cube wholestage on 1285 / 1306 4.1 245.2 2.5X | ||
|
|
||
|
|
||
| ================================================================================================ | ||
| hash and BytesToBytesMap | ||
| ================================================================================================ | ||
|
|
||
| OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64 | ||
| Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz | ||
| BytesToBytesMap: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------ | ||
| UnsafeRowhash 328 / 330 64.0 15.6 1.0X | ||
| murmur3 hash 167 / 167 125.4 8.0 2.0X | ||
| fast hash 84 / 85 249.0 4.0 3.9X | ||
| arrayEqual 192 / 192 109.3 9.1 1.7X | ||
| Java HashMap (Long) 144 / 147 145.9 6.9 2.3X | ||
| Java HashMap (two ints) 147 / 153 142.3 7.0 2.2X | ||
| Java HashMap (UnsafeRow) 785 / 788 26.7 37.4 0.4X | ||
| LongToUnsafeRowMap (opt=false) 456 / 457 46.0 21.8 0.7X | ||
| LongToUnsafeRowMap (opt=true) 125 / 125 168.3 5.9 2.6X | ||
| BytesToBytesMap (off Heap) 885 / 885 23.7 42.2 0.4X | ||
| BytesToBytesMap (on Heap) 860 / 864 24.4 41.0 0.4X | ||
| Aggregate HashMap 56 / 56 373.9 2.7 5.8X | ||
|
|
||
|
|
||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davies Do you know how to generate there benchmark:
spark/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/AggregateBenchmark.scala
Lines 70 to 78 in 3c3eebc