Discuss: Expose Structured Clone on v8 Module #34355

@amiller-gh

Hi all! Relatively niche feature idea, but will likely be useful for a lot of lower level state management libraries. I'm a little out of my expertise with the C++ implementation here, so please excuse me if I'm missing something that is already possible using existing APIs.

Context

Deep cloning objects in JavaScript is notoriously hard, and even harder if you want it to be performant. Luckily we now have the Structured Clone Algorithm as a native option.

The native V8 implementation was exposed in the Node.js runtime in node@8.0.0 as the Serialization API. Discussion happened in #6300.

Problem

Using the V8 serialization API, Node.js apps can run a structured clone like so:

const v8 = require('v8');
const structuredClone = (o) => v8.deserialize(v8.serialize(o));

However, for large objects this is actually still fairly slow! I assume this is because calling v8.serialize/deserialize shuttles data back and forth over the JS/C++ boundary twice. In my use case, I've found that a rather naive recursive clone function can actually out-perform it! Not ideal.
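For reference, a "naive recursive clone" of the kind mentioned above might look like the sketch below. The name `naiveClone` is hypothetical, and it assumes the input contains only primitives, arrays, and plain objects with no cycles. Unlike structured clone, it does not preserve `Date`, `Map`, `Set`, or typed arrays.

```javascript
// Hypothetical helper, not part of any Node.js API.
// Assumes acyclic data made of primitives, arrays, and plain objects.
function naiveClone(value) {
  // Primitives (and null) are returned as-is.
  if (value === null || typeof value !== 'object') return value;
  // Arrays: clone each element into a new array.
  if (Array.isArray(value)) return value.map((item) => naiveClone(item));
  // Plain objects: copy own enumerable string keys recursively.
  const out = {};
  for (const key of Object.keys(value)) {
    out[key] = naiveClone(value[key]);
  }
  return out;
}
```

Because it skips type checks for anything beyond arrays and plain objects, a function like this does less work per node than the structured clone algorithm, which may explain part of the difference observed.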

Instead, I've discovered that taking the rather roundabout method of leveraging MessageChannels can give me the native performance gains I'm expecting:

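const { MessageChannel } = require('worker_threads');

function structuredClone(o) {
  const { port1, port2 } = new MessageChannel();
  return new Promise((resolve) => {
    // The receiving port hands back the structured-cloned copy.
    port2.once('message', (value) => {
      port2.close(); // closing either port shuts down the channel
      resolve(value);
    });
    port1.postMessage(o);
  });
}

// `await` needs an async context in a CommonJS module.
(async () => {
  const clone = await structuredClone({ foo: 'bar' });
})();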

MessageChannel also uses the structured clone algorithm to pass data from one port to the next. However, it runs faster than v8.serialize/deserialize for this use case since it doesn't unnecessarily send data back and forth in order to clone the object.

Note: I have validated the performance differences in my current project, but neglected to take screenshots of the flame graphs! If there is interest in exploring this proposal I'm happy to come up with a contrived perf test comparing v8.serialize/deserialize, a simple recursive clone function, and MessageChannel.
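A contrived perf test along those lines could start from something like the sketch below. Everything here is an assumption for illustration: the helper names (`makePayload`, `v8Clone`, `recursiveClone`, `channelClone`, `bench`), the payload shape, and the iteration counts. No results are implied. Real numbers will depend on object shape, Node.js version, and warm-up.

```javascript
const v8 = require('v8');
const { MessageChannel } = require('worker_threads');

// Build a nested plain-object tree with width^depth leaves (hypothetical payload).
function makePayload(width, depth) {
  if (depth === 0) return { id: Math.random(), label: 'leaf' };
  const node = {};
  for (let i = 0; i < width; i++) node['k' + i] = makePayload(width, depth - 1);
  return node;
}

// Candidate 1: serialize/deserialize round trip.
const v8Clone = (o) => v8.deserialize(v8.serialize(o));

// Candidate 2: naive recursive clone (plain objects, arrays, primitives only).
function recursiveClone(value) {
  if (value === null || typeof value !== 'object') return value;
  if (Array.isArray(value)) return value.map((item) => recursiveClone(item));
  const out = {};
  for (const key of Object.keys(value)) out[key] = recursiveClone(value[key]);
  return out;
}

// Candidate 3: one MessageChannel per clone.
function channelClone(o) {
  const { port1, port2 } = new MessageChannel();
  return new Promise((resolve) => {
    port2.once('message', (value) => {
      port2.close();
      resolve(value);
    });
    port1.postMessage(o);
  });
}

// Time `iterations` sequential clones; awaiting a sync result adds only a microtask hop.
async function bench(label, clone, payload, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) await clone(payload);
  const totalMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { label, msPerClone: totalMs / iterations };
}
```

To try it, build a payload once (for example `makePayload(8, 4)`) and log the result of `await bench(...)` for each of the three candidates with the same iteration count.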

However, using MessageChannel is also not ideal:

  1. It requires creating a new message channel for each clone, or maintaining a shared message channel with some method of discerning between cloned responses. This adds runtime overhead and code complexity.
  2. It forces us to use an async API for cloning objects. Often fine, but not ideal for some implementations.
  3. It requires a good degree of boilerplate just to access a native algorithm that the project already intends to expose.
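To make the first point concrete, here is a rough sketch of the shared-channel variant, where each request carries an id so replies can be matched to callers. The names `sharedChannelClone`, `pending`, and `nextId` are hypothetical, and the `ref()`/`unref()` handling is one possible way to keep an idle channel from holding the process open.

```javascript
const { MessageChannel } = require('worker_threads');

// One long-lived channel shared by every clone request.
const { port1, port2 } = new MessageChannel();
const pending = new Map(); // request id -> resolve callback
let nextId = 0;

port2.on('message', ({ id, value }) => {
  const resolve = pending.get(id);
  pending.delete(id);
  // Nothing in flight: let the process exit naturally.
  if (pending.size === 0) port2.unref();
  resolve(value);
});
// Attaching a 'message' listener refs the port; undo that while idle.
port2.unref();

function sharedChannelClone(value) {
  return new Promise((resolve, reject) => {
    const id = nextId++;
    pending.set(id, resolve);
    port2.ref(); // keep the event loop alive until the reply arrives
    try {
      // The whole envelope is structured-cloned, including `value`.
      port1.postMessage({ id, value });
    } catch (err) {
      // e.g. a DataCloneError when `value` contains a function
      pending.delete(id);
      if (pending.size === 0) port2.unref();
      reject(err);
    }
  });
}
```

This avoids creating a channel per clone, at the cost of the bookkeeping described in the list above, and it is still asynchronous.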

Proposed Solution

A relatively simple fix would be to expose both a sync and an async API for V8's structured clone:

const v8 = require('v8');
const syncClone = v8.structuredCloneSync({ foo: 'bar' });
const asyncClone = await v8.structuredClone({ foo: 'bar' });

This should out-perform both v8.serialize/deserialize and MessageChannel since it avoids the overhead of excess data shuttling and MessagePort creation, while also enabling a fully synchronous API.
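To illustrate the shape of the synchronous call, the sketch below is a userland stand-in, not the proposed native implementation. It assumes a Node.js version where `worker_threads` exports `receiveMessageOnPort` (v12.3.0 or later), and the function name `structuredCloneSync` is hypothetical. It still creates a channel per clone, so it does not remove the MessagePort overhead the proposal aims to avoid.

```javascript
const { MessageChannel, receiveMessageOnPort } = require('worker_threads');

// Hypothetical userland approximation of the proposed v8.structuredCloneSync().
function structuredCloneSync(value) {
  const { port1, port2 } = new MessageChannel();
  // postMessage() serializes `value` with the structured clone algorithm.
  port1.postMessage(value);
  // Synchronously pull the queued message off the other port.
  const received = receiveMessageOnPort(port2);
  port1.close();
  return received.message;
}
```

Usage would mirror the proposed API, for example `const copy = structuredCloneSync({ foo: 'bar' });`.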

Alternatives

  • Continue to use MessageChannel and publish as a user-land module, forgoing a synchronous, performant API
  • ???
