Improve performance by removing SHA

Markdown conversion could be noticeably faster.

When I profile my blog build, about 60% of the time spent in markdown2.py is spent running SHA hashes.

Could you explain the logic of `_hash_html_block_sub` and `_hash_text` generally? Why are we even running SHA inside a markdown converter?

It looks like this is some sort of escape mechanism, maybe? We generate a key, replace the HTML with the key (so it doesn't look like HTML to some other stage of the parser that should ignore it), do some processing on the outer HTML, and finally replace all the keys with the original HTML?
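To make sure I'm reading it right, here is a minimal sketch of the mechanism as I understand it. Everything in it (`SALT`, `html_blocks`, `hide_html`, `restore_html`, the `<div>` regex) is a hypothetical stand-in, not the actual markdown2.py internals:

```python
import re
from hashlib import sha256

# Hypothetical stand-ins; not the real markdown2.py names or values.
SALT = b"example-salt"
html_blocks = {}  # key -> original HTML


def hash_text(s: str) -> str:
    # Same shape as _hash_text: salted SHA-256, last 32 hex digits.
    return "md5-" + sha256(SALT + s.encode("utf-8")).hexdigest()[32:]


def hide_html(text: str) -> str:
    """Swap each <div>...</div> block for an opaque key."""
    def sub(match: re.Match) -> str:
        key = hash_text(match.group(0))
        html_blocks[key] = match.group(0)
        return key
    return re.sub(r"<div>.*?</div>", sub, text, flags=re.S)


def restore_html(text: str) -> str:
    """Put the original HTML back in place of each key."""
    for key, html in html_blocks.items():
        text = text.replace(key, html)
    return text


source = "*hi* <div>*raw*</div>"
hidden = hide_html(source)
# Stand-in for the markdown pass; it never sees the raw HTML.
processed = hidden.replace("*hi*", "<em>hi</em>")
result = restore_html(processed)
```

If that is roughly what happens, the key only needs to be unique and unlikely to appear in the document; it does not obviously need to be a cryptographic hash.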

If so, wouldn't a randomly generated string serve just as well as a hash?

For example,

```
from hashlib import sha256

def _hash_text(s: str) -> str:
    return 'md5-' + sha256(SECRET_SALT + s.encode("utf-8")).hexdigest()[32:]
```

could be replaced by the much faster

```
import random

hex_digits = "0123456789abcdef"
def _hash_text(s: str) -> str:
    # Ignores s entirely: every call returns a fresh random 32-digit key.
    return 'md5-' + ''.join(random.choice(hex_digits) for _ in range(32))
```

for a quick fix.
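I have not benchmarked the two variants in isolation, so here is a rough sketch for measuring them with `timeit`. The salt value, the function names, and the payload sizes are assumptions chosen for illustration; the relative cost probably depends on how large the hashed blocks are, so it is worth trying sizes that match real input.

```python
import random
import timeit
from hashlib import sha256

SECRET_SALT = b"example-salt"  # assumption: placeholder for the real salt
HEX_DIGITS = "0123456789abcdef"


def hash_text_sha(s: str) -> str:
    """Current approach: salted SHA-256, last 32 hex digits."""
    return "md5-" + sha256(SECRET_SALT + s.encode("utf-8")).hexdigest()[32:]


def hash_text_random(s: str) -> str:
    """Proposed approach: 32 random hex digits, input ignored."""
    return "md5-" + "".join(random.choice(HEX_DIGITS) for _ in range(32))


def bench(func, payload: str, number: int = 200) -> float:
    """Return the seconds taken by `number` calls of func(payload)."""
    return timeit.timeit(lambda: func(payload), number=number)


if __name__ == "__main__":
    for size in (100, 10_000, 100_000):
        payload = "x" * size
        print(
            f"{size:>7} chars: "
            f"sha={bench(hash_text_sha, payload):.4f}s "
            f"random={bench(hash_text_random, payload):.4f}s"
        )
```

One behavioral difference to keep in mind: the SHA version returns the same key for the same input, while the random version returns a different key on every call. If anything relies on identical blocks mapping to identical keys, the random version would change that.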

(By my estimate, this would make markdown conversion up to about 2.5x faster.)
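The 2.5x figure comes from the 60% profile share. It is an upper bound that assumes the replacement costs nothing and that the rest of the conversion is unchanged. A quick sketch of the arithmetic (`max_speedup` is just an illustrative helper):

```python
def max_speedup(fraction_removed: float) -> float:
    """Upper bound on speedup if a fraction of the runtime is eliminated."""
    if not 0.0 <= fraction_removed < 1.0:
        raise ValueError("fraction_removed must be in [0, 1)")
    return 1.0 / (1.0 - fraction_removed)


# 60% of the time is spent hashing, so at most 1 / 0.4 of the original speed.
speedup = round(max_speedup(0.60), 2)
print(speedup)
```

Since the replacement still does some work per call, the real gain would be somewhat lower than this bound.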

Improve performance by removing SHA #618
