Implement optional strings-only shallow copies for perf #1188
This PR:
- Converts the `StrictMode` type into an extended shallow copying option:
  - `"with_symbols"` is an alias for the current `strict: false` behavior
  - `"full"` is an alias for the current `strict: true` behavior
  - `"strings_only"` is a new option that only copies string properties for objects
- Updates `shallowCopy` to implement the `"strings_only"` option using `Object.keys(base).forEach(key => copy[key] = base[key])`
Copying via `Object.keys()` appears to be similar enough in perf at small object sizes (<1000 keys), but drastically faster at large object sizes (>1020 keys), at least in v8. This is due to v8 implementing a "fast properties" mode that caps out at 1020 properties, at which point it switches into "dictionary mode".
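As a minimal sketch of the two copy paths being compared (the real `shallowCopy` also handles arrays, Maps, Sets, prototypes, and the stricter descriptor-based copying, all omitted here):

```js
// Minimal sketch of the copy strategies compared in this PR; the real
// shallowCopy also handles arrays, Maps, Sets, prototypes, and the stricter
// descriptor-based copying, which are omitted here.
function sketchShallowCopy(base, mode) {
	if (mode === "strings_only") {
		// Copy only string-keyed enumerable own properties.
		const copy = {}
		Object.keys(base).forEach(key => {
			copy[key] = base[key]
		})
		return copy
	}
	// "with_symbols" (the current strict: false behavior): spread copies
	// enumerable string *and* symbol keys.
	return {...base}
}
```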
## Background

In looking at different immutable update libraries, I saw that Mutative implements shallow copying objects like this:
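(The embedded snippet isn't reproduced here; as a rough illustration of the keys-based pattern, not a quote from Mutative's source, it's along these lines:)

```js
// Rough illustration of a keys-based shallow copy (NOT quoted from Mutative's
// source; just the general shape of the approach).
function keysBasedShallowCopy(original) {
	// Preserve the prototype, then copy string-keyed enumerable own props.
	const copy = Object.create(Object.getPrototypeOf(original))
	Object.keys(original).forEach(key => {
		copy[key] = original[key]
	})
	return copy
}
```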
I was curious, and tried this out in the existing Immer benchmarks suite. I didn't see much difference. However, I have an RTK Query stress test benchmark that loads 1000+ components at once, and in that `shallowCopy` is often a major bottleneck. In some of my experiments, switching to `Object.keys()` cut the shallow copy time significantly. However, the results across the main Immer benchmarks suite, my RTKQ app benchmark, low-level microbenchmarks for object copying techniques, and a standalone "RTKQ update scenario" script have varied. I'm still not entirely sure of what I'm seeing, but I've learned some related pieces of info.

## Large Object / RTKQ Benchmarks
I'd added a "large objects" scenario to the benchmarks suite, which created an object with 1000 properties. Using `keys().forEach()` didn't seem to improve that. Same for the "RTKQ updates" benchmark scenario, which ran 100 pairs of updates. However, I later created a standalone `rtkq-testing.mjs` script (included in this PR) that went from 100 to 3000 updates, and there I saw major differences. Some examples:

### Baseline Spread
Here, each "item" is really a pair of `pending`/`resolved` pseudo-actions, where `pending` adds a new key to the `state.api.queries` object, and `resolved` updates that value.

Note that the time per update keeps increasing. It's not linear - if it were, we'd see (13.1 * 10 = 131ms) for 1000 items, but instead we see ~1490ms. It's clear that the larger the objects get, the longer the updates take.
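For context, each update pair is roughly the shape shown below (names and structure are illustrative, not copied from `rtkq-testing.mjs`):

```js
// Hypothetical shape of one pending/resolved update pair, mimicking what the
// rtkq-testing.mjs script does (names and structure are illustrative).
import {produce} from "immer"

let state = {api: {queries: {}}}

function runUpdatePair(cacheKey, data) {
	// "pending": adds a new key to state.api.queries
	state = produce(state, draft => {
		draft.api.queries[cacheKey] = {status: "pending"}
	})
	// "resolved": updates that same entry with the result
	state = produce(state, draft => {
		const entry = draft.api.queries[cacheKey]
		entry.status = "fulfilled"
		entry.data = data
	})
}

// Each pair grows state.api.queries by one key, so every shallow copy of that
// object is a little bigger than the last one.
for (let i = 0; i < 3000; i++) {
	runUpdatePair(`getThing(${i})`, {id: i})
}
```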
I tried a whole bunch of variations on object updates, but I'll jump ahead and show the one that looks the best:
### Always Obj.keys().forEach(), No Symbols
Note that the time-per-item still increases, but more slowly, and the overall time gets cut dramatically.
All the other variations were run through the same benchmark.
## Larger Object Benchmarks
Based on that, I tried modifying the "largeObjects" benchmark, playing around with various object sizes. In the process, I started to see varying behavior for that one benchmark scenario as I bumped the size past ~1000 properties. At 1000, `{...base}` was faster and `Object.keys()` was slower. At 1500, they flipped. I experimented and narrowed it down to something like 1012 or 1020 as the breakpoint.

Based on this I concluded that I had to be hitting some kind of built-in limit in v8.
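A rough standalone microbenchmark in the spirit of these experiments (illustrative; not the PR's actual benchmark code) looks like this, with sizes chosen to straddle the breakpoint described above:

```js
// Rough standalone microbenchmark comparing {...base} vs Object.keys() copies
// around the ~1000-1020 property breakpoint (illustrative only).
import {performance} from "node:perf_hooks"

function makeObject(size) {
	const obj = {}
	for (let i = 0; i < size; i++) obj[`key${i}`] = i
	return obj
}

const strategies = {
	spread: base => ({...base}),
	keysForEach: base => {
		const copy = {}
		Object.keys(base).forEach(key => {
			copy[key] = base[key]
		})
		return copy
	}
}

for (const size of [500, 1000, 1020, 1021, 1500]) {
	const base = makeObject(size)
	for (const [label, copyFn] of Object.entries(strategies)) {
		const start = performance.now()
		for (let i = 0; i < 10_000; i++) copyFn(base)
		console.log(`${size} props, ${label}: ${(performance.now() - start).toFixed(1)}ms`)
	}
}
```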
## v8 Fast Properties
I had run across mentions of v8's "fast properties" implementation before, where it optimizes property access for certain kinds of objects:
However, those gave no details on what the real triggers were for fast properties, or any specific size limits.
This evening I tried asking around online, and I quickly got some key pointers to the relevant v8 behavior and sources:
- `v8/src/objects/property-details.h`

That snippet is:
and if you do the math, `(1 << 10) - 4` is 1020.

This was confirmed by one of the tests, which also specifically hardcodes 1020 as the expected max properties before an object converts from "fast property" mode into "dictionary" mode:

- `v8/test/dictionary-prototypes.js`

That matches the "around 1000" breakpoint I was seeing.
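There's also a way to poke at this directly (not part of the PR): Node exposes v8's internal `%HasFastProperties()` check behind the `--allow-natives-syntax` flag. Exactly where the flip shows up can depend on how the object is constructed and on the v8 version, so treat this as a probe rather than a guarantee:

```js
// Run with: node --allow-natives-syntax fast-props-probe.js
// %HasFastProperties is a v8-internal intrinsic, only parseable with that
// flag. This prints whether the object is still in "fast properties" mode as
// keys are added around the 1020 mark; exact behavior can vary by v8 version
// and by how the object was constructed.
const obj = {}
for (let i = 1; i <= 1100; i++) {
	obj["key" + i] = i
	if (i >= 1015 && i <= 1025) {
		console.log(i, %HasFastProperties(obj))
	}
}
```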
I also was pointed to these SO entries, which have some fantastic writeups from v8 devs explaining the nuances of the different modes:
## Variations I've Tried
Without copy-pasting all of them:
- a `for (var _key in obj) size++` counter loop
- `Object.keys()` just by itself
- `keys.forEach()`
- `Object.getOwnPropertySymbols()`. Note that we have a couple tests that assert we do copy symbols by default. Leaving out `Object.getOwnPropertySymbols()` is faster, but fails those tests.
- `Object.keys()` for the initial size check, and then reusing it for the copy
- `Reflect.ownKeys()`
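Rough sketches of what a few of these variations look like (illustrative only, not the exact code from those experiments):

```js
// Rough sketches of a few of the copy variations listed above (illustrative).
function copyKeysOnly(base) {
	// Only string-keyed enumerable own properties.
	const copy = {}
	Object.keys(base).forEach(key => {
		copy[key] = base[key]
	})
	return copy
}

function copyKeysPlusSymbols(base) {
	// Same as above, but also copies enumerable symbol-keyed properties,
	// which is what the existing "copies symbols" tests expect.
	const copy = copyKeysOnly(base)
	Object.getOwnPropertySymbols(base).forEach(sym => {
		if (Object.prototype.propertyIsEnumerable.call(base, sym)) {
			copy[sym] = base[sym]
		}
	})
	return copy
}

function copyReflectOwnKeys(base) {
	// Reflect.ownKeys returns string and symbol keys, including
	// non-enumerable ones, so this picks up more than Object.keys does.
	const copy = {}
	Reflect.ownKeys(base).forEach(key => {
		copy[key] = base[key]
	})
	return copy
}
```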
## Is This An Improvement?

I'm actually pretty confused at this point :)
- Only one of the existing benchmark scenarios (`update-largeObject2`) has over 1020 properties in an object. That example does seem to get faster with this change.
- The RTKQ stress-test app benchmark had pointed at `reducer()` and `shallowCopy()` specifically as bottlenecks... but I'm re-running tests with that app tonight with this change on top of both 10.2 and "Rewrite finalization system to use a callback approach instead of tree traversal" (#1183), and I'm not seeing the big obvious changes I was before. Some, possibly, but not as much.
- The standalone `rtkq-testing.mjs` script that tries to replicate the same update sequence always shows a drastic improvement with `Object.keys().forEach()`.

## This PR
For this PR, I updated the `StrictMode` type to be more of an extended enum. Ultimately it's really adding a `"strings_only"` option, which triggers the `Object.keys().forEach()` path. Otherwise, it defaults to `{...base}`, which includes symbols. The other enum options are essentially aliases for existing behavior.

My thought is that we could ship this in a 10.3 minor, as it should leave the existing behavior unchanged, and users can opt in to the keys/strings-only approach if desired. Then, if this works well, we could flip the defaults in v11. That also raises the question of whether Immer should default to copying symbols for correctness, or just strings for performance. We could also remove the boolean options in v11 and just have the enum options.
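For concreteness, opting in would presumably look something like this (assuming the existing `setUseStrictShallowCopy` entry point is what accepts the new enum values):

```js
// Hypothetical usage sketch of opting into the new option via Immer's
// existing setUseStrictShallowCopy setting (assumed to accept this PR's
// enum values).
import {produce, setUseStrictShallowCopy} from "immer"

// Default behavior stays the same ({...base}, symbols included).
// Opt in to the faster strings-only copies:
setUseStrictShallowCopy("strings_only")

const base = {a: 1, b: 2}
const next = produce(base, draft => {
	draft.a = 42
})
console.log(next) // {a: 42, b: 2}
```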
We're already looking at switching to "just strings" for loose iteration in v11, so there's an argument that "ship faster loose behavior by default, while retaining the ability to opt into correctness" could be a theme for v11.
Obviously very happy to discuss details on whether we want to ship this, and if so, how :)