Description
Hi all! This is a relatively niche feature idea, but it will likely be useful for a lot of lower-level state management libraries. I'm a little out of my expertise with the C++ implementation here, so please excuse me if I'm missing something that is already possible using existing APIs.
Context
Deep cloning objects in JavaScript is notoriously hard, and even harder if you want it to be performant. Luckily we now have the Structured Clone Algorithm as a native option.
The native V8 implementation was exposed in the Node.js runtime in node@8.0.0 as the Serialization API. Discussion happened in #6300.
Problem
Using the V8 serialization API, Node.js apps can run a structured clone like so:
    const v8 = require('v8');
    const structuredClone = (o) => v8.deserialize(v8.serialize(o));
However, for large objects this is actually still fairly slow! I assume this is because calling v8.serialize/deserialize shuttles data back and forth over the JS/C++ boundary twice. In my use case, I've found that a rather naive recursive clone function can actually out-perform it! Not ideal.
Instead, I've discovered that taking the rather roundabout method of leveraging MessageChannels can give me the native performance gains I'm expecting:
    const { MessageChannel } = require('worker_threads');

    function structuredClone (o) {
      const { port1, port2 } = new MessageChannel();
      return new Promise((resolve) => {
        // The value arrives on port2 as a structured clone of `o`.
        port2.on('message', resolve);
        port2.on('close', port2.close);
        port1.postMessage(o);
        port1.close();
      });
    }

    // Inside an async function (or an ES module with top-level await):
    const clone = await structuredClone({ foo: 'bar' });
MessageChannel also uses the structured clone algorithm to pass data from one port to the next. However, it runs faster than v8.serialize/deserialize for this use case since it doesn't unnecessarily send data back and forth in order to clone the object.
Note: I have validated the performance differences in my current project, but neglected to take screenshots of the flame graphs! If there is interest in exploring this proposal, I'm happy to come up with a contrived perf test comparing v8.serialize/deserialize, a simple recursive clone function, and MessageChannel.
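As a starting point for such a perf test, here is a minimal sketch that times only the two synchronous approaches (the MessageChannel variant is left out because it is async). The payload shape, the helper names (naiveClone, makePayload, time) and the run count are all assumptions made for illustration, and the numbers will vary with machine, Node.js version, and object shape, so none are quoted here.

```javascript
const v8 = require('v8');
const { performance } = require('perf_hooks');

// Clone via the V8 serialization API, as in the snippet above.
const serializeClone = (o) => v8.deserialize(v8.serialize(o));

// A deliberately naive recursive clone (hypothetical helper): it only
// handles plain objects, arrays, and primitives, and ignores cycles.
function naiveClone(o) {
  if (o === null || typeof o !== 'object') return o;
  if (Array.isArray(o)) return o.map((item) => naiveClone(item));
  const out = {};
  for (const key of Object.keys(o)) out[key] = naiveClone(o[key]);
  return out;
}

// Hypothetical test payload: a tree of plain objects with `width`
// children per node and small leaf objects at the bottom.
function makePayload(width, depth) {
  if (depth === 0) return { id: 1, label: 'leaf', tags: ['a', 'b', 'c'] };
  const node = {};
  for (let i = 0; i < width; i++) {
    node['child' + i] = makePayload(width, depth - 1);
  }
  return node;
}

// Run `fn` once to warm up, then report the mean time over `runs` calls.
function time(label, fn, payload, runs = 20) {
  fn(payload);
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn(payload);
  const ms = (performance.now() - start) / runs;
  console.log(`${label}: ${ms.toFixed(2)} ms per clone`);
  return ms;
}

const payload = makePayload(8, 4); // 8^4 = 4096 leaf objects
time('v8.serialize/deserialize', serializeClone, payload);
time('naive recursive clone', naiveClone, payload);
```

A micro-benchmark like this is only a rough signal; flame graphs from a real workload would still be the more convincing evidence.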
However, using MessageChannel is also not ideal:
- It requires creating a new message channel for each clone, or maintaining a shared message channel with some method of discerning between cloned responses. This adds runtime overhead and code complexity.
- It forces us to use an async API for cloning objects. Often fine, but not ideal for some implementations.
- It requires a good degree of boilerplate just to access a native algorithm that the project already intends to expose.
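To make the first point concrete, here is a rough sketch of what the shared-channel variant might look like. Each request is tagged with an id so that the response can be matched to the right promise. The names sharedChannelClone, pending and nextId are hypothetical, and the ref/unref handling is one possible way to keep an idle port from holding the process open.

```javascript
const { MessageChannel } = require('worker_threads');

// One channel shared by every clone request.
const { port1, port2 } = new MessageChannel();
const pending = new Map(); // request id -> resolve callback
let nextId = 0;

port2.on('message', ({ id, value }) => {
  const resolve = pending.get(id);
  pending.delete(id);
  resolve(value);
  // Nothing in flight: let the process exit if it has nothing else to do.
  if (pending.size === 0) port2.unref();
});
// Attaching the listener refs the port, so undo that while idle.
port2.unref();

function sharedChannelClone(value) {
  return new Promise((resolve, reject) => {
    const id = nextId++;
    pending.set(id, resolve);
    port2.ref(); // keep the event loop alive until the response arrives
    try {
      port1.postMessage({ id, value });
    } catch (err) {
      // postMessage throws synchronously for values that cannot be cloned.
      pending.delete(id);
      if (pending.size === 0) port2.unref();
      reject(err);
    }
  });
}

sharedChannelClone({ foo: 'bar' }).then((clone) => {
  console.log(clone);
});
```

Even in this small form, the bookkeeping (ids, a pending map, ref counting) is exactly the kind of overhead and complexity described above.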
Proposed Solution
Relatively simply, we can choose to expose a sync and async API for V8's structured clone:
    const v8 = require('v8');
    const syncClone = v8.structuredCloneSync({ foo: 'bar' });
    const asyncClone = await v8.structuredClone({ foo: 'bar' });
This should out-perform both v8.serialize/deserialize and MessageChannel since it avoids the overhead of excess data shuttling and MessagePort creation, while also enabling a fully synchronous API.
Alternatives
- Continue to use MessageChannel and publish as a user-land module, forgoing a synchronous, performant API
- ???
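One further user-land option, not part of the proposal itself, might be a synchronous clone built on receiveMessageOnPort from worker_threads, which reads a queued message from a port without waiting for the event loop. The sketch below assumes a Node.js version that ships receiveMessageOnPort; cloneSync is a hypothetical name, and I have not measured how this compares with the other approaches.

```javascript
const { MessageChannel, receiveMessageOnPort } = require('worker_threads');

// A single channel reused for every clone. No 'message' listener is
// attached; unref() makes sure the ports never keep the process alive.
const { port1, port2 } = new MessageChannel();
port1.unref();
port2.unref();

// Hypothetical helper: post the value on one port and synchronously
// read the structured clone back from the other.
function cloneSync(value) {
  port1.postMessage(value);
  const received = receiveMessageOnPort(port2);
  if (received === undefined) {
    throw new Error('expected a queued message on port2');
  }
  return received.message;
}

const original = { foo: 'bar', when: new Date(0) };
const copy = cloneSync(original);
console.log(copy !== original, copy.when instanceof Date);
```

This still pays for a MessagePort round trip, so it addresses the async and boilerplate concerns more than the raw performance one.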