@BenziAhamed
Contributor

@BenziAhamed BenziAhamed commented Mar 31, 2021

When intersecting two sets, it is beneficial to iterate over the smaller of the two and check membership in the other. This reduces running time dramatically when the current set is significantly larger than the set being intersected against.

The following is a small comparison of the current intersection vs the proposed version. As input size increases, the current intersection takes significantly longer.

Note that this change only reduces time for cases where the sizes of the inputs differ significantly.

name                           time     std        iterations  speedup
----------------------------------------------------------------------
stdlib intersection [10->10]   0.001 ms ±  51.37 %     947716
fast intersection [10->10]     0.001 ms ± 316.51 %     935652  0.9x
stdlib intersection [61->10]   0.001 ms ±  49.55 %     861024
fast intersection [61->10]     0.001 ms ±  75.61 %    1000000  1.1x
stdlib intersection [625->10]  0.010 ms ±  21.54 %     132798
fast intersection [625->10]    0.001 ms ±  63.79 %    1000000  7.5x
stdlib intersection [6357->10] 0.151 ms ±  12.51 %       8811
fast intersection [6357->10]   0.001 ms ±  49.66 %    1000000  113x
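
The idea described above can be sketched roughly as follows. This is only an illustration, not the code from this pull request: `fastIntersection` is a hypothetical free function written against the public `Set` API, and the sizes 6357 and 10 are borrowed from the last row of the comparison.

```swift
/// Hypothetical helper (not the stdlib implementation): intersects two sets
/// by iterating over the smaller one and probing the larger one.
func fastIntersection<Element: Hashable>(
  _ a: Set<Element>,
  _ b: Set<Element>
) -> Set<Element> {
  // Iterate over whichever set has fewer elements; each membership check
  // on the other set is a hash lookup.
  let (smaller, larger) = a.count <= b.count ? (a, b) : (b, a)
  var result = Set<Element>()
  for element in smaller where larger.contains(element) {
    result.insert(element)
  }
  return result
}

// Usage: a large receiver intersected with a small argument.
let large = Set(0..<6357)
let small = Set(0..<10)
let common = fastIntersection(large, small)
```

With this shape, the loop runs once per element of the smaller set, so the cost should track the smaller input's size rather than the larger one's.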

@BenziAhamed BenziAhamed changed the title [WIP][stdlib] Set.intersection iterate over smaller set [stdlib] Set.intersection iterate over smaller set Mar 31, 2021
@BenziAhamed BenziAhamed marked this pull request as ready for review March 31, 2021 07:53
@lorentey
Member

lorentey commented Apr 1, 2021

@swift-ci test

@lorentey
Member

lorentey commented Apr 1, 2021

@swift-ci benchmark

Member

@lorentey lorentey left a comment

Thank you! This looks like a very nice win to me.

I have a long-standing PR to optimize these high-level operations using temporary bit sets -- this reminds me it's time we dusted that off and landed it. (It needs some not-entirely-trivial work to be able to call into a newly available type from an inlinable method.)

@LucianoPAlmeida
Contributor

@swift-ci Please test Windows Platform

@swift-ci
Contributor

swift-ci commented Apr 2, 2021

Performance: -O

Regression                                OLD    NEW   DELTA  RATIO
NSStringConversion.UTF8                   886    994  +12.2%  0.89x (?)
DictionaryKeysContainsNative               22     24   +9.1%  0.92x (?)

Improvement                               OLD    NEW   DELTA  RATIO
Breadcrumbs.MutatedUTF16ToIdx.ASCII         4      3  -25.0%  1.33x (?)
NSStringConversion.MutableCopy.Rebridge   812    748   -7.9%  1.09x (?)
String.data.Medium                        114    106   -7.0%  1.08x (?)
StringBuilderWithLongSubstring           1570   1460   -7.0%  1.08x (?)

Code size: -O

Performance: -Osize

Regression                                    OLD    NEW   DELTA  RATIO
SetIntersectionBox0                           127    187  +47.2%  0.68x
NSError                                       151    217  +43.7%  0.70x (?)
SetIntersectionBox25                          246    305  +24.0%  0.81x
Set.intersection.Seq.Box0                     362    422  +16.6%  0.86x
Set.intersection.Seq.Box25                    488    546  +11.9%  0.89x (?)
ObjectiveCBridgeFromNSArrayAnyObjectForced   4180   4620  +10.5%  0.90x (?)

Improvement                                   OLD    NEW   DELTA  RATIO
UTF8Decode_InitFromBytes_ascii                306    279   -8.8%  1.10x (?)
String.data.LargeUnicode                      110    101   -8.2%  1.09x (?)

Code size: -Osize

Performance: -Onone

Regression              OLD    NEW   DELTA  RATIO
SetIntersect           1310   1740  +32.8%  0.75x
SetIntersectionInt0     131    174  +32.8%  0.75x
SetIntersectionInt25    311    355  +14.1%  0.88x
SetIntersectionBox0     456    514  +12.7%  0.89x (?)
SetIntersectionInt50    468    509   +8.8%  0.92x (?)

Improvement             OLD    NEW   DELTA  RATIO
StringToDataMedium     6450   5800  -10.1%  1.11x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
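
For readers checking the tables by hand: the DELTA and RATIO columns appear consistent with DELTA = (NEW - OLD) / OLD and RATIO = OLD / NEW. The sketch below recomputes one row under that assumption. The function names `delta` and `ratio` are hypothetical and not part of the benchmark tooling; the OLD and NEW values are taken from the `SetIntersectionBox0` row of the -Osize table.

```swift
import Foundation

/// Hypothetical helper: percentage change from OLD to NEW
/// (positive means the benchmark got slower).
func delta(old: Double, new: Double) -> Double {
  (new - old) / old * 100
}

/// Hypothetical helper: OLD / NEW, so values below 1.0 indicate a regression.
func ratio(old: Double, new: Double) -> Double {
  old / new
}

// SetIntersectionBox0 at -Osize: OLD 127, NEW 187.
let d = delta(old: 127, new: 187)
let r = ratio(old: 127, new: 187)
print(String(format: "%+.1f%% %.2fx", d, r))  // prints "+47.2% 0.68x"
```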

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 8-Core Intel Xeon E5
  Processor Speed: 3 GHz
  Number of Processors: 1
  Total Number of Cores: 8
  L2 Cache (per Core): 256 KB
  L3 Cache: 25 MB
  Memory: 16 GB

@swift-ci
Contributor

swift-ci commented Apr 2, 2021

Build failed
Swift Test OS X Platform
Git Sha - b1f77b2

@LucianoPAlmeida
Contributor

@swift-ci test

@swift-ci
Contributor

swift-ci commented Apr 8, 2021

Build failed
Swift Test Linux Platform
Git Sha - 2b10440

@xwu
Collaborator

xwu commented Apr 9, 2021

@swift-ci test Linux platform

@xwu
Collaborator

xwu commented Apr 9, 2021

@swift-ci benchmark

@swift-ci
Contributor

swift-ci commented Apr 9, 2021

Performance: -O

Improvement                      OLD    NEW   DELTA  RATIO
AngryPhonebook.Armenian.Small    861    801   -7.0%  1.07x (?)

Code size: -O

Performance: -Osize

Regression            OLD    NEW   DELTA  RATIO
String.data.Medium    105    114   +8.6%  0.92x (?)

Improvement           OLD    NEW   DELTA  RATIO
FlattenListFlatMap   6827   6075  -11.0%  1.12x (?)

Code size: -Osize

Performance: -Onone

Improvement            OLD    NEW   DELTA  RATIO
SetIntersectionBox0    510    473   -7.3%  1.08x (?)

Code size: -swiftlibs

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@swift-ci
Contributor

swift-ci commented Apr 9, 2021

Build failed
Swift Test Linux Platform
Git Sha - 2b10440

@LucianoPAlmeida
Contributor

@swift-ci Please test Linux platform

@swift-ci
Contributor

swift-ci commented Apr 9, 2021

Build failed
Swift Test Linux Platform
Git Sha - 2b10440

@LucianoPAlmeida
Contributor

@swift-ci Please test Linux platform

@xwu xwu merged commit 4e0c6f9 into swiftlang:main Apr 11, 2021
@BenziAhamed BenziAhamed deleted the patch-1 branch April 12, 2021 04:21