Headers are unnecessarily encoded to `ByteString` if there is no `Transfer-Encoding: chunked` header #42579

### Version

v17.5.0

### Platform

Windows 10

### Subsystem

http

### What steps will reproduce the bug?

Run this `http` server:
```js
import http from "http";

const name = `Rock & roll 音楽 («🎵🎶»).txt`; // Also used as the response body

const defaultCdHeader = getContentDisposition(name);
// console.log(defaultCdHeader === Buffer.from(`inline; filename="${name}"`).toString("binary")); // true

const host = "localhost";
const port = 8000;
const origin = `http://${host}:${port}`;
const server = http.createServer(requestListener);
server.listen(port, host, () => {
    console.log("Server is running on:     " + origin + "       (open the web page)");
    console.log("Downloads the file:       " + origin + "/?dl=1");
    console.log("Remove Transfer-Encoding: " + origin + "/?te=0" + " (open the web page)");
    console.log("Downloads w/o T-E header: " + origin + "/?dl=1&te=0");
});

function requestListener(req, res) {
    let cdHeader = defaultCdHeader;
    const {dl, te} = Object.fromEntries(new URL(origin + req.url).searchParams.entries());
    if (dl === "1") {
        cdHeader = getContentDisposition(name, {type: "attachment"});
    }

    res.setHeader("Content-Disposition", cdHeader);
    // The header must be a `ByteString`. For example, this is not allowed:
    // res.setHeader("Header-X", name); // TypeError [ERR_INVALID_CHAR]: Invalid character in header content ["Header-X"]

    if (te === "0") { // Note: "Transfer-Encoding: chunked" is set by default.
        res.removeHeader("Transfer-Encoding"); // Any other Transfer-Encoding value has the same effect as removing the header
        const byteCount = new TextEncoder().encode(name).length;
        res.setHeader("Content-Length", byteCount);
    }
    res.setHeader("Content-Type", "text/html; charset=utf-8");
    res.writeHead(200);
    res.end(name);
}

// Note: For the purposes of this issue you can ignore the following function. It just returns the C-D header as a `ByteString`.
/**
 * Simple implementation for getting the "Content-Disposition" header from a filename
 * @example
 * // By default, it produces the same result as this (with all double quotes replaced by underscores):
 * Buffer.from(`inline; filename="${name}"`).toString("binary")
 *
 * @param {string} name
 * @param {Object} opts
 * @param {"inline"|"attachment"} [opts.type="inline"]
 * @param {Boolean} [opts.encoded=false]
 * @param {Boolean} [opts.filename=true]
 * @param {String} [opts.replacer="_"]
 * @return {string}
 */
function getContentDisposition(name, opts = {}) {
    const {
        type, encoded, filename, replacer
    } = Object.assign({type: "inline", encoded: false, filename: true, replacer: "_"}, opts);
    const fixName = (name) => name.replaceAll(`"`, replacer); // The most trivial fix, since the filename is quoted. Prevents incorrect header parsing, for example with the name `";"` (3 chars inside the ` quotes).
    const encodeMap = new Map([["'", "%27"], ["(", "%28"], [")", "%29"], ["*", "%2A"]]); // These must be escaped for old browsers (for example, a Chromium from 2013)
    const getEncodedName = (name) => encodeURIComponent(name).replaceAll(/['()*]/g, (val) => encodeMap.get(val));
    const encodedStr = encoded ? `filename*=UTF-8''${getEncodedName(name)}` : "";
    const filenameStr = filename ? `filename="${fixName(name)}"` : "";
    const header = [type, filenameStr, encodedStr].filter(x => x).join("; ");
    return Buffer.from(header).toString("binary");
}
```

1. Click on the link http://localhost:8000/?dl=1 to download a file named `Rock & roll 音楽 («🎵🎶»).txt`.
2. Click on the link http://localhost:8000/?dl=1&te=0 to download a file named `Rock & roll 音楽 («🎵🎶»).txt` (in this case the "Transfer-Encoding" header is removed).

### How often does it reproduce? Is there a required condition?

Always.

### What is the expected behavior?

Both files are downloaded with the name `Rock & roll 音楽 («🎵🎶»).txt`.

### What do you see instead?

The first file has the correct name: `Rock & roll 音楽 («🎵🎶»).txt`.
The second one has the wrong name: `Rock & roll é_³æ¥½ (Â«ð__µð__¶Â»).txt`

### Additional information

## TL;DR

If the `"Transfer-Encoding: chunked"` header is present (exactly `chunked`), `setHeader` works properly: it sets the input header (a `ByteString`) as is.
(_Note: `"Transfer-Encoding: chunked"` is set by default._)

In any other case it additionally (_unnecessarily_) encodes the header to `ByteString`.
So the header is encoded twice, which is wrong.

## Additional info

Most HTTP headers contain only ASCII characters. But when you need to put a string that contains non-ASCII* character(s) into a header (for example, `"Content-Disposition"` or any custom header), you can't just pass it as is to [`setHeader`](https://nodejs.org/api/http.html#requestsetheadername-value).
For example:
```js
// TypeError [ERR_INVALID_CHAR]: Invalid character in header content ["Content-Disposition"]
res.setHeader("Content-Disposition", `attachment; filename="Rock & roll 音楽 («🎵🎶»).txt"`);
```

An HTTP header is a [Binary String](https://developer.mozilla.org/en-US/docs/Web/API/DOMString/Binary) ([`ByteString`](https://webidl.spec.whatwg.org/#idl-ByteString)): `UTF-8` bytes stored in a `String` object.*

There is no problem with headers that contain only ASCII characters, since the ASCII charset is a subset of both the `UTF-8` and `Latin 1` encodings, so `toByteString(ASCIIString) === ASCIIString`.

To get a `ByteString` from a [`USVString`](https://webidl.spec.whatwg.org/#idl-USVString) you just need to take the `UTF-8` bytes of the input string and then represent them in the `Latin 1` ([`ISO-8859-1`](https://en.wikipedia.org/wiki/ISO/IEC_8859-1)) encoding.

For example, in [Node.js](https://nodejs.org/api/buffer.html#buffers-and-character-encodings):
```js
function toByteString(string) {
    return Buffer.from(string).toString("binary"); // or "latin1"
}
```
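
As a rough sketch of the round trip, the conversion can be reversed by reading the `ByteString` as `Latin 1` bytes and decoding them as `UTF-8`. The helper name `fromByteString` and the sample filename are hypothetical, introduced only for this illustration:

```js
// Same helper as above: UTF-8 bytes represented as Latin 1 characters
function toByteString(string) {
    return Buffer.from(string, "utf8").toString("latin1");
}

// Hypothetical inverse helper (not part of Node.js): Latin 1 characters back to UTF-8 text
function fromByteString(byteString) {
    return Buffer.from(byteString, "latin1").toString("utf8");
}

const original = "Rock & roll 音楽.txt";
const encoded = toByteString(original);

// One character per UTF-8 byte
console.log(encoded.length === Buffer.byteLength(original, "utf8")); // true
// Decoding restores the original string
console.log(fromByteString(encoded) === original); // true
// ASCII-only strings are left untouched
console.log(toByteString("plain-ascii.txt") === "plain-ascii.txt"); // true
```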

---

\*To be honest, here is the entire quote from the [`ByteString`](https://webidl.spec.whatwg.org/#idl-ByteString) definition:
> Such sequences might be interpreted as UTF-8 encoded strings [[RFC3629]](https://webidl.spec.whatwg.org/#biblio-rfc3629) or strings in _some other 8-bit-per-code-unit encoding, **although this is not required**._

As far as I can see, a browser can also detect whether the string is "just" [`8859-1`](https://en.wikipedia.org/wiki/ISO/IEC_8859-1#Code_page_layout) rather than `UTF-8` bytes encoded in `8859-1`.

So, both `"Content-Disposition"` headers are ~valid~ "valid"**:
```js
res.setHeader("Content-Disposition", `attachment; filename="¡«£»÷ÿ.png"`); // Correct ONLY for certain browser languages
res.setHeader("Content-Disposition", `attachment; filename="${toByteString("¡«£»÷ÿ.png")}"`); // Always correct
```
The result in ~both~ "both"** cases is a file named `"¡«£»÷ÿ.png"`**, even though `"¡«£»÷ÿ.png" !== toByteString("¡«£»÷ÿ.png")`.

**UPDATE:**
\*\*Using non-UTF-8 bytes ("some other 8-bit-per-code-unit encoding") in a `ByteString` is **browser/OS language dependent!**

For example, in Firefox with a non-EN language, using `"¡«£»÷ÿ.png"` _as is_ (without `toByteString()`) results in the filename `������.png` instead of `¡«£»÷ÿ.png`.
In Chrome it will be `Ў«Ј»чя.png` for Cyrillic.

So I think that using `8859-1` in the "usual way" should be strongly discouraged.
Headers should always be a `ByteString` _with only UTF-8 bytes_ represented as `8859-1` (`Latin 1`).
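
To make that rule checkable, here is a minimal sketch of a validator that tests whether a header value is a `ByteString` holding well-formed `UTF-8` bytes. The name `isUtf8ByteString` is hypothetical; it relies on `TextDecoder` in `fatal` mode, which assumes a Node.js build with ICU (the default). Note that a short `Latin 1` string can happen to be valid `UTF-8` by accident, so a `true` result is only a heuristic:

```js
// Hypothetical helper: true if `value` contains only code units 0..255
// and those code units, read as bytes, form well-formed UTF-8.
function isUtf8ByteString(value) {
    // Code units above 0xFF cannot be part of a ByteString at all
    if (!/^[\u0000-\u00ff]*$/.test(value)) {
        return false;
    }
    const bytes = Buffer.from(value, "latin1");
    try {
        // In fatal mode decode() throws a TypeError on malformed UTF-8
        new TextDecoder("utf-8", { fatal: true }).decode(bytes);
        return true;
    } catch {
        return false;
    }
}

const toByteString = (string) => Buffer.from(string, "utf8").toString("latin1");

console.log(isUtf8ByteString("¡«£»÷ÿ.png"));               // false: raw Latin 1 bytes
console.log(isUtf8ByteString(toByteString("¡«£»÷ÿ.png"))); // true: UTF-8 bytes as Latin 1
console.log(isUtf8ByteString("音楽.txt"));                   // false: not a ByteString
```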


## Problem

The problem is that I can't correctly set a header that is a `ByteString` (`UTF-8` bytes in `Latin 1`) when the original string contains non-ASCII characters, the way other servers do.

The problem appears only when the `Transfer-Encoding: chunked` header (which is present by default) is removed (or changed).
In this case `setHeader` encodes the header to a Binary String.

That is unnecessary, since it's already a `ByteString`.

It's not possible to pass a `USVString` to `setHeader`, since in that case it throws the `TypeError [ERR_INVALID_CHAR]: Invalid character in header content` error.

So the header is encoded to `"binary"` **twice**, and browsers download the file with the wrong filename:
`Rock & roll é_³æ¥½ (Â«ð__µð__¶Â»).txt` instead of `Rock & roll 音楽 («🎵🎶»).txt`.
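
The effect can be simulated without a server. This is only a sketch: it assumes the extra step behaves like a second "`UTF-8` bytes read as `Latin 1`" pass, which is what the observed filename suggests, and it does not show what Node.js does internally. The sample string and the variable names are illustrative:

```js
const toByteString = (string) => Buffer.from(string, "utf8").toString("latin1");
const fromByteString = (byteString) => Buffer.from(byteString, "latin1").toString("utf8");

const name = "音楽 «»";
const once = toByteString(name);  // what the application passes to setHeader
const twice = toByteString(once); // assumed result of the unwanted second encoding

// A client that decodes the received bytes as UTF-8 undoes only one layer,
// so it ends up with the ByteString itself instead of the real name.
const seenByClient = fromByteString(twice);
console.log(seenByClient === once);                 // true
console.log(fromByteString(seenByClient) === name); // true (a second decode is needed)
console.log(JSON.stringify(seenByClient));          // the mojibake form of the name
```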

You can open the demo server with the `Transfer-Encoding: chunked` header disabled (http://localhost:8000/?te=0) and check it in the browser console:

```js
function bSrt2Str(bString) {
    return new TextDecoder().decode(binaryStringToArrayBuffer(bString));
}
function binaryStringToArrayBuffer(binaryString) {
    const u8Array = new Uint8Array(binaryString.length);
    for (let i = 0; i < binaryString.length; i++) {
        u8Array[i] = binaryString.charCodeAt(i);
    }
    return u8Array;
}
let cd = (await fetch("")).headers.get("Content-Disposition");
console.log(cd);
console.log(bSrt2Str(cd));
console.log(bSrt2Str(bSrt2Str(cd)));
```

![image](https://user-images.githubusercontent.com/16310547/161396329-70cda5ad-1d44-4d19-9d18-73abc8a460a3.png)

The header is encoded twice!

---

## Examples

Many forums (XenForo and vBulletin, for example) encode headers this way for attached files.

Real-life examples:
- https://xenforo.com/community/attachments/rock-roll-音楽-«🎵🎶»-png.266784/
- https://xenforo.com/community/attachments/¡«£»ÿ-png.266785/

_Oh, wait, these require an account. If you don't have an [account](https://xenforo.com/community/) (or don't want to create one), just use my demo server.
Anyway, just look at the screenshots below._

In the browser console you can verify that the headers are `ByteString`s:
```js
function bSrt2Str(bString) {
    return new TextDecoder().decode(binaryStringToArrayBuffer(bString));
}
function binaryStringToArrayBuffer(binaryString) {
    const u8Array = new Uint8Array(binaryString.length);
    for (let i = 0; i < binaryString.length; i++) {
        u8Array[i] = binaryString.charCodeAt(i);
    }
    return u8Array;
}
let cd = (await fetch("")).headers.get("Content-Disposition");
console.log(cd);
console.log(bSrt2Str(cd));
```

![image](https://user-images.githubusercontent.com/16310547/161395415-ab27258c-f84d-4efa-8440-645f21c99bc3.png)

![image](https://user-images.githubusercontent.com/16310547/161395284-d319217b-b378-4db2-ba37-b80438c252ab.png)

---

As a bonus, here is an example of a Java server built with `ServerSocket` that also works properly:

<details>
<summary>Main.java</summary>
  
```java
package com.company;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        try (ServerSocket serverSocket = new ServerSocket(8000)) {
            System.out.println("Server running on http://127.0.0.1:8000");
            while (true) {
                Socket socket = serverSocket.accept();
                try (BufferedReader input = new BufferedReader(new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
                     // Write UTF-8 explicitly so the output does not depend on the platform default charset
                     PrintWriter output = new PrintWriter(new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8))) {
                    while (input.ready()) { // Print the request line and headers
                        System.out.println(input.readLine());
                    }
                    String name = "Rock & roll 音楽 («\uD83C\uDFB5\uD83C\uDFB6»).txt";
                    System.out.println(name);

                    output.println("HTTP/1.1 200 OK");
                    output.println("Content-Type: text/html; charset=utf-8");
                    output.println("Content-Disposition: attachment; filename=\"" + name + "\"");
                    output.println();
                    output.println(name);
                    output.flush();
                }
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
```  
</details>


