Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions tools/doc/html.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -205,6 +205,7 @@ function linkJsTypeDocs(text) {
}

const isJSFlavorSnippet = (node) => node.lang === 'cjs' || node.lang === 'mjs';
const isPreSnippet = (node) => node.lang === 'pre';

// Preprocess headers, stability blockquotes, and YAML blocks.
export function preprocessElements({ filename }) {
Expand Down Expand Up @@ -265,6 +266,8 @@ export function preprocessElements({ filename }) {
// Isolated JS snippet, no need to add the checkbox.
node.value = `<pre>${highlighted} ${copyButton}</pre>`;
}
} else if (isPreSnippet(node)) {
node.value = `<pre>${node.value}</pre>`;
} else {
node.value = `<pre>${highlighted} ${copyButton}</pre>`;
}
Expand Down
9 changes: 9 additions & 0 deletions tools/doc/markdown.mjs
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
import { visit } from 'unist-util-visit';

export const referenceToLocalMdFile = /^(?![+a-z]+:)([^#?]+)\.md(#.+)?$/i;
export const referenceToLocalMdFileInPre = /<a href="([^#"]+)\.md(#[^"]+)?">/g;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an odd way of using the capture groups also as a way to only run the replace if such capture groups match... This regex seems a bit expensive, IMO...

Since you're using AST (visit), wouldn't eventually the href elements also be looped? I wonder if a simple check of if the node lang tag is "a" when it is iterating over it and then checking if the href contains a .md; There's also the issue that this current regex will also match external links ending with .md 🤷 and that's definitely a no-go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an odd way of using the capture groups also as a way to only run the replace if such capture groups match

I don't follow this comment (I'm not a JavaScript developer, so it's likely I'm missing something). Is there a better way to do this? I was using the constant right above as a guide.

Since you're using AST (visit), wouldn't eventually the href elements also be looped? I wonder if a simple check of if the node lang tag is "a" when it is iterating over it and then checking if the href contains a .md

It looks like mdast can visit HTML nodes (https://github.com/syntax-tree/mdast?tab=readme-ov-file#html), of which there are roughly 4500 (mostly comments), but not the anchor tag specifically. I only see eight occurrences of HTML containing <a href, four of which are the ones I'm targeting, so it seems manageable to rewrite those. I like that approach better, since I won't have to introduce a new language tag or change that <pre> tag (though it does mean that this won't work if someone uses such a link in a code fence).

There's also the issue that this current regex will also match external links ending with .md 🤷 and that's definitely a no-go.

Yeah, good catch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like mdast can visit HTML nodes (syntax-tree/mdast#html), of which there are roughly 4500 (mostly comments), but not the anchor tag specifically. I only see eight occurrences of HTML containing <a href, four of which are the ones I'm targeting, so it seems manageable to rewrite those. I like that approach better, since I won't have to introduce a new language tag or change that

 tag (though it does mean that this won't work if someone uses such a link in a code fence).

That's genuinely odd (the fact it is not going through anchor elements 🤷)

I don't follow this comment (I'm not a JavaScript developer, so it's likely I'm missing something). Is there a better way to do this? I was using the constant right above as a guide.

Regular Expressions are not a JavaScript thing. My point here is that you're running that regex through all the node.values with .replaceAll, which of course will do nothing if there are no matches, but I'm not sure this is the best of achieving this (performance-wise at least 🤔)

It worries me this whole tooling has no unit testing at all, and such regexe's could create unattended effects, is not like you've manually tested every link... (or have you?) the Regex is pretty clear, but I'd still probably add an if statement to verify if the string matches the existence of such anchor links and then running the replacement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point here is that you're running that regex through all the node.values with .replaceAll, which of course will do nothing if there are no matches, but I'm not sure this is the best of achieving this (performance-wise at least 🤔)

Gotcha. The current implementation only runs over a single node, since that's the only one that matches pre html, and as a result I saw a 2‰ increase in the time taken to generate the docs. With the new implementation, it will run over ~4500 elements. I haven't implemented or tested that yet to measure the performance impact.

is not like you've manually tested every link... (or have you?)

I have. I put the results in the description for #52883 (though, that also includes the changes from f125d7c). I guess I forgot to include that context here. Once I do the new implementation, I'll be sure to post the results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I wonder what would be more perfomatic:

  • Multiple iterations on small nodes
  • One big string search/replacement on the whole HTML

Feel free to do some testing here, but otherwise, I'm fine with this change :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @crawford any updates here? 👀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No updates, but I'll have time to work on this in two weeks. I'm fine if you want to close this and let me reopen with my update.


export function replaceLinks({ filename, linksMapper }) {
return (tree) => {
Expand All @@ -14,6 +15,14 @@ export function replaceLinks({ filename, linksMapper }) {
);
}
});
visit(tree, 'code', (node) => {
if (node.meta === 'html') {
node.value = node.value.replace(
referenceToLocalMdFileInPre,
(_, path, fragment) => `<a href="${path}.html${fragment || ''}">`,
);
}
});
visit(tree, 'definition', (node) => {
const htmlUrl = fileHtmlUrls && fileHtmlUrls[node.identifier];

Expand Down
1 change: 1 addition & 0 deletions tools/lint-md/lint-md.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -23409,6 +23409,7 @@ const plugins = [
"markdown",
"mjs",
"powershell",
"pre",
"r",
"text",
"ts",
Expand Down