Illegal characters in URL do not raise an exception #41976

@GuillaumeBlanchet

Description

Version

v17.5.0

Platform

Linux deskt 5.13.0-28-generic #31~20.04.1-Ubuntu SMP Wed Jan 19 14:08:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Subsystem

No response

What steps will reproduce the bug?

Open a terminal and enter the following commands (the first is optional if you already have Node.js installed on your system):
$ docker run -i -t node:17.5 bash
:/# node -i
> new URL('http://sub_domain.domain.tld');

How often does it reproduce? Is there a required condition?

Happens all the time

What is the expected behavior?

I would expect to have an output like this:

Uncaught TypeError [ERR_INVALID_URL]: Invalid URL
    at __node_internal_captureLargerStackTrace (node:internal/errors)
    at new NodeError (node:internal/errors)
    at onParseError (node:internal/url)
    at new URL (node:internal/url) {
  input: 'http://sub_domain.domain.tld',
  code: 'ERR_INVALID_URL'
}

This is because all ASCII characters other than A-Z, a-z, 0-9, and U+002D (-) are illegal in domain names; see the RFC on LDH (letters, digits, and hyphens) names.
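
To make the LDH rule concrete, here is a minimal sketch of a per-label check. `isLdhHostname` and `LDH_LABEL` are hypothetical names used for illustration only; they are not part of Node.js. The sketch also assumes the usual length limits of 63 characters per label and 253 per name.

```javascript
// Hypothetical helper, not a Node.js API: checks a hostname against the
// LDH rule (letters, digits, hyphen), one label at a time.
// A label must start and end with a letter or digit and be 1-63 chars long.
const LDH_LABEL = /^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/;

function isLdhHostname(hostname) {
  // Assumed overall limit of 253 characters for the whole name.
  if (hostname.length === 0 || hostname.length > 253) return false;
  return hostname.split('.').every((label) => LDH_LABEL.test(label));
}

console.log(isLdhHostname('sub-domain.domain.tld')); // true
console.log(isLdhHostname('sub_domain.domain.tld')); // false
```

Under this check, the hostname from the reproduction steps is rejected because of the underscore.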

What do you see instead?

URL {
  href: 'http://sub_domain.domain.tld/',
  origin: 'http://sub_domain.domain.tld',
  protocol: 'http:',
  username: '',
  password: '',
  host: 'sub_domain.domain.tld',
  hostname: 'sub_domain.domain.tld',
  port: '',
  pathname: '/',
  search: '',
  searchParams: URLSearchParams {},
  hash: ''
}

Additional information

> new URL('http://sub+domain.domain.tld');
> new URL('http://sub&domain.domain.tld');
> new URL('http://sub?domain.domain.tld');

and so on produce the same behavior, violating the LDH rule for domain names. Please consider always instantiating ICU with the UIDNA_USE_STD3_RULES option at this line:

if (mode == IDNA_STRICT) {
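
Until the parser itself enforces this, a user-land check is one possible workaround. The sketch below is only an illustration: `parseUrlStrict` is a hypothetical wrapper, not a Node.js API, and the error shape merely imitates the `ERR_INVALID_URL` output shown above. It assumes that empty hosts and bracketed IPv6 literals should be passed through unchanged.

```javascript
// Hypothetical wrapper, not a Node.js API: parse with the WHATWG URL class,
// then reject hostnames that break the LDH rule.
const LDH_LABEL = /^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/;

function parseUrlStrict(input) {
  const url = new URL(input); // still throws for input the parser already rejects
  const host = url.hostname;

  // Assumption: empty hosts (e.g. mailto:) and bracketed IPv6 literals are
  // not domain names, so they are returned without an LDH check.
  if (host === '' || host.startsWith('[')) return url;

  // Ignore a single trailing dot (fully qualified form) before splitting.
  const labels = host.replace(/\.$/, '').split('.');
  if (!labels.every((label) => LDH_LABEL.test(label))) {
    const err = new TypeError('Invalid URL');
    err.code = 'ERR_INVALID_URL'; // mimics Node's error code; set by hand here
    err.input = input;
    throw err;
  }
  return url;
}

try {
  parseUrlStrict('http://sub_domain.domain.tld');
} catch (err) {
  console.log(err.code); // ERR_INVALID_URL
}
```

Checking after parsing keeps the rest of the URL handling identical to `new URL`, at the cost of validating the hostname a second time in JavaScript.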

As a side note, the proper way to instantiate the ICU lib to support IDNA2008 is the following:

UErrorCode status = U_ZERO_ERROR;
uint32_t options =                      // CheckHyphens = false; handled later
    UIDNA_CHECK_BIDI |                  // CheckBidi = true
    UIDNA_CHECK_CONTEXTJ |              // CheckJoiners = true
    UIDNA_CHECK_CONTEXTO |              // CONTEXTO rules (IDNA2008)
    UIDNA_NONTRANSITIONAL_TO_ASCII |    // Nontransitional_Processing
    UIDNA_NONTRANSITIONAL_TO_UNICODE |
    UIDNA_USE_STD3_RULES;               // UseSTD3ASCIIRules = true
UIDNA* uidna = uidna_openUTS46(options, &status);

Labels

whatwg-url: Issues and PRs related to the WHATWG URL implementation.
wrong repo: Issues that should be opened in another repository.
