
Can't parse JSON response exceeding 512 MB #4295

@the-sun-will-rise-tomorrow

Bug Description

When using fetch and .json() to fetch and parse a response whose body length exceeds 512 MB, Node throws "Cannot create a string longer than 0x1fffffe8 characters" (ERR_STRING_TOO_LONG).

Note that this happens even when no individual string inside the JSON value is long - the limit applies to the size of the entire response body.
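For reference, the 0x1fffffe8 in the error message appears to be V8's maximum string length on 64-bit builds, which works out to just under 512 MiB worth of one-byte characters:

```javascript
// 0x1fffffe8 is the string-length cap reported by the error above.
const maxLen = 0x1fffffe8;
console.log(maxLen);                       // 536870888 characters
console.log(Math.round(maxLen / 2 ** 20)); // 512 (MiB, for one-byte characters)
```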

Reproducible By

  1. Create a 1GB JSON file:

    $ jq -nc '[range(1024 * 1024) | [[range(102) | "abcdefghij"] | add]] | add' > 1gb.json
  2. Start an HTTP server serving the file:

    $ npx http-server
  3. Try to fetch the file:

    fetch("http://127.0.0.1:8080/1gb.json")
        .then((response) => response.json())
        .then((data) => {
            console.log(data.length);
        })
        .catch((error) => {
            console.error(error);
        });

Expected Behavior

Ideally, the JSON response should be parsed as it is received, without first being collected into a single buffer or string, thus avoiding Node's string-length limit altogether.
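To illustrate the consumption loop such streaming would hook into, here is a sketch assuming Node 18+ globals (`Response`, `TextDecoder`); `readAllText` is a hypothetical helper, and since it still concatenates into one string, it does not itself lift the limit — a real fix would feed each chunk into a streaming JSON parser instead:

```javascript
// Read a body chunk-by-chunk instead of via response.json().
async function readAllText(response) {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for await (const chunk of response.body) {
    // { stream: true } correctly handles multi-byte sequences split across chunks.
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode(); // flush any trailing partial sequence
}

// Usage with a small in-memory Response:
readAllText(new Response('[{"id":1},{"id":2}]'))
  .then((text) => console.log(JSON.parse(text).length)); // 2
```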

Logs & Screenshots

The above steps fail with:

Error: Cannot create a string longer than 0x1fffffe8 characters
    at TextDecoder.decode (node:internal/encoding:447:16)
    at utf8DecodeBytes (node:internal/deps/undici/undici:4863:34)
    at parseJSONFromBytes (node:internal/deps/undici/undici:5738:25)
    at successSteps (node:internal/deps/undici/undici:5719:27)
    at fullyReadBody (node:internal/deps/undici/undici:4609:9)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async consumeBody (node:internal/deps/undici/undici:5728:7) {
  code: 'ERR_STRING_TOO_LONG'
}

Environment

Node v22.14.0 on NixOS (Linux)

Additional context

The error happens because the response is first collected into a buffer, then decoded into a UTF-8 string, and finally parsed from that single string with JSON.parse:

/**
 * @see https://infra.spec.whatwg.org/#parse-json-bytes-to-a-javascript-value
 * @param {Uint8Array} bytes
 */
function parseJSONFromBytes (bytes) {
  return JSON.parse(utf8DecodeBytes(bytes))
}

Ideally, we would be using a streaming JSON parser, which can decode and parse the response body as it is being received. This would not only avoid the string-length limitation but also improve performance for large bodies: parsing can overlap with network I/O instead of running as a separate step after the download finishes.

Labels: bug