[stdlib] Set.intersection iterate over smaller set #36678

BenziAhamed · 2021-03-31T07:48:50Z

When intersecting two sets, it is beneficial to iterate over the smaller of the two and check membership in the other. This reduces runtime dramatically when the current set is significantly larger than the set being intersected against.

The following is a small comparison of the current intersection vs the proposed version. As input size increases, the current intersection takes significantly longer.

Note that this change only reduces time for cases where the sizes of the inputs differ significantly.

name                           time     std        iterations  speedup
----------------------------------------------------------------------
stdlib intersection [10->10]   0.001 ms ±  51.37 %     947716
fast intersection [10->10]     0.001 ms ± 316.51 %     935652  0.9x
stdlib intersection [61->10]   0.001 ms ±  49.55 %     861024
fast intersection [61->10]     0.001 ms ±  75.61 %    1000000  1.1x
stdlib intersection [625->10]  0.010 ms ±  21.54 %     132798
fast intersection [625->10]    0.001 ms ±  63.79 %    1000000  7.5x
stdlib intersection [6357->10] 0.151 ms ±  12.51 %       8811
fast intersection [6357->10]   0.001 ms ±  49.66 %    1000000  113x
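
As a rough illustration of the idea described above, here is a minimal sketch written as a free function. The name `fastIntersection` and the sample sets are hypothetical and chosen only for this example; this is not the code changed in `stdlib/public/core/Set.swift`.

```swift
// Hypothetical helper for illustration only; not the stdlib implementation.
func fastIntersection<Element: Hashable>(
  _ a: Set<Element>, _ b: Set<Element>
) -> Set<Element> {
  // Pick the smaller set to iterate over and the larger one to probe,
  // so the loop runs min(a.count, b.count) times.
  let (smaller, larger) = a.count <= b.count ? (a, b) : (b, a)
  var result = Set<Element>()
  for element in smaller where larger.contains(element) {
    result.insert(element)
  }
  return result
}

let big = Set(0..<6357)
let small: Set<Int> = [5, 10, 7000]
// Only the 3 elements of `small` are visited, regardless of argument order.
let common = fastIntersection(big, small)
print(common.sorted())
```

One thing to keep in mind with this sketch: the result is built from whichever set happens to be smaller, which could matter for element types where equal values are still distinguishable.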

stdlib/public/core/Set.swift

lorentey · 2021-04-01T23:19:51Z

@swift-ci test

lorentey · 2021-04-01T23:19:58Z

@swift-ci benchmark

lorentey

Thank you! This looks like a very nice win to me.

I have a long-standing PR to optimize these high-level operations using temporary bit sets -- this reminds me it's time we dusted that off and landed it. (It needs some not-entirely-trivial work to be able to call into a newly available type from an inlinable method.)

LucianoPAlmeida · 2021-04-02T00:26:38Z

@swift-ci Please test Windows Platform

swift-ci · 2021-04-02T02:41:20Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
NSStringConversion.UTF8	886	994	+12.2%	0.89x (?)
DictionaryKeysContainsNative	22	24	+9.1%	0.92x (?)

Improvement	OLD	NEW	DELTA	RATIO
Breadcrumbs.MutatedUTF16ToIdx.ASCII	4	3	-25.0%	1.33x (?)
NSStringConversion.MutableCopy.Rebridge	812	748	-7.9%	1.09x (?)
String.data.Medium	114	106	-7.0%	1.08x (?)
StringBuilderWithLongSubstring	1570	1460	-7.0%	1.08x (?)

Code size: -O

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
SetIntersectionBox0	127	187	+47.2%	0.68x
NSError	151	217	+43.7%	0.70x (?)
SetIntersectionBox25	246	305	+24.0%	0.81x
Set.intersection.Seq.Box0	362	422	+16.6%	0.86x
Set.intersection.Seq.Box25	488	546	+11.9%	0.89x (?)
ObjectiveCBridgeFromNSArrayAnyObjectForced	4180	4620	+10.5%	0.90x (?)

Improvement	OLD	NEW	DELTA	RATIO
UTF8Decode_InitFromBytes_ascii	306	279	-8.8%	1.10x (?)
String.data.LargeUnicode	110	101	-8.2%	1.09x (?)

Code size: -Osize

Performance: -Onone

Regression	OLD	NEW	DELTA	RATIO
SetIntersect	1310	1740	+32.8%	0.75x
SetIntersectionInt0	131	174	+32.8%	0.75x
SetIntersectionInt25	311	355	+14.1%	0.88x
SetIntersectionBox0	456	514	+12.7%	0.89x (?)
SetIntersectionInt50	468	509	+8.8%	0.92x (?)

Improvement	OLD	NEW	DELTA	RATIO
StringToDataMedium	6450	5800	-10.1%	1.11x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
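
The DELTA and RATIO columns in the tables above appear consistent with DELTA = (NEW - OLD) / OLD and RATIO = OLD / NEW. The sketch below checks that against the `SetIntersectionBox0` row at `-Osize`; the helper name `deltaAndRatio` is hypothetical, and the formulas are inferred from the reported rows rather than taken from the benchmark tooling.

```swift
// Hypothetical helper: formulas inferred from the table rows,
// not taken from the benchmark scripts themselves.
func deltaAndRatio(old: Double, new: Double) -> (deltaPercent: Double, ratio: Double) {
  let deltaPercent = (new - old) / old * 100  // positive means NEW is slower
  let ratio = old / new                       // below 1.0 means NEW is slower
  return (deltaPercent, ratio)
}

// SetIntersectionBox0 at -Osize: OLD 127, NEW 187
let row = deltaAndRatio(old: 127, new: 187)
let roundedDelta = (row.deltaPercent * 10).rounded() / 10
let roundedRatio = (row.ratio * 100).rounded() / 100
print(roundedDelta, roundedRatio)  // matches the +47.2% and 0.68x in the table
```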

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 8-Core Intel Xeon E5
  Processor Speed: 3 GHz
  Number of Processors: 1
  Total Number of Cores: 8
  L2 Cache (per Core): 256 KB
  L3 Cache: 25 MB
  Memory: 16 GB

swift-ci · 2021-04-02T03:33:56Z

Build failed
Swift Test OS X Platform
Git Sha - b1f77b2

LucianoPAlmeida · 2021-04-08T13:52:22Z

@swift-ci test

swift-ci · 2021-04-08T15:21:35Z

Build failed
Swift Test Linux Platform
Git Sha - 2b10440

xwu · 2021-04-09T03:25:49Z

@swift-ci test Linux platform

xwu · 2021-04-09T03:27:27Z

@swift-ci benchmark

swift-ci · 2021-04-09T04:34:00Z

Performance: -O

Improvement	OLD	NEW	DELTA	RATIO
AngryPhonebook.Armenian.Small	861	801	-7.0%	1.07x (?)

Code size: -O

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
String.data.Medium	105	114	+8.6%	0.92x (?)

Improvement	OLD	NEW	DELTA	RATIO
FlattenListFlatMap	6827	6075	-11.0%	1.12x (?)

Code size: -Osize

Performance: -Onone

Improvement	OLD	NEW	DELTA	RATIO
SetIntersectionBox0	510	473	-7.3%	1.08x (?)

Code size: -swiftlibs

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

swift-ci · 2021-04-09T04:49:38Z

Build failed
Swift Test Linux Platform
Git Sha - 2b10440

LucianoPAlmeida · 2021-04-09T12:16:08Z

@swift-ci Please test Linux platform

swift-ci · 2021-04-09T13:51:58Z

Build failed
Swift Test Linux Platform
Git Sha - 2b10440

LucianoPAlmeida · 2021-04-10T23:01:52Z

@swift-ci Please test Linux platform

BenziAhamed changed the title ~~[WIP][stdlib] Set.intersection iterate over smaller set~~ [stdlib] Set.intersection iterate over smaller set Mar 31, 2021

BenziAhamed marked this pull request as ready for review March 31, 2021 07:53

LucianoPAlmeida reviewed Mar 31, 2021

LucianoPAlmeida requested review from lorentey and natecook1000 March 31, 2021 12:55

lorentey approved these changes Apr 1, 2021

xwu reviewed Apr 2, 2021

Review comments - variable names, implicit swap

2b10440

xwu merged commit 4e0c6f9 into swiftlang:main Apr 11, 2021

BenziAhamed deleted the patch-1 branch April 12, 2021 04:21

lorentey mentioned this pull request Nov 2, 2021

[stdlib] Optimize high-level Set operations #40012

Merged
