The performance could be faster.
When I profile my blog build, about 60% of the time spent in markdown2.py is spent running SHA hashes.
Could you explain the logic of _hash_html_block_sub and _hash_text generally? Why are we even running SHA inside a markdown converter?
It looks like this is some sort of escape mechanism, maybe? We generate a key, replace the HTML with the key (so it doesn't look like HTML to some other stage of the parser that should ignore it), do some processing on the surrounding text, and finally replace all the keys with the original HTML?
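To make sure I'm describing the same thing, here is a minimal sketch of that stash-and-restore pattern as I understand it. It is not markdown2's actual code: protect_html_blocks, restore_html_blocks, the stash dict and the simplistic div regex are all hypothetical names made up for illustration, and the key here is just a random token.

```python
import re
import secrets

def protect_html_blocks(text: str, stash: dict) -> str:
    """Swap each <div>...</div> block for an opaque key and remember the original."""
    def _sub(match: re.Match) -> str:
        # Hypothetical key format: "md5-" plus 32 random hex digits.
        key = "md5-" + secrets.token_hex(16)
        stash[key] = match.group(0)
        return key
    # Deliberately naive pattern; real HTML block detection is more involved.
    return re.sub(r"<div\b.*?</div>", _sub, text, flags=re.DOTALL)

def restore_html_blocks(text: str, stash: dict) -> str:
    """Put the original HTML back once the rest of the processing is done."""
    for key, html in stash.items():
        text = text.replace(key, html)
    return text

if __name__ == "__main__":
    stash = {}
    source = "Some *markdown* here.\n<div>*not markdown*</div>\nMore text."
    protected = protect_html_blocks(source, stash)
    # ... other processing stages would run on `protected` here ...
    print(restore_html_blocks(protected, stash) == source)
```

If this is what the hashing is for, the key only has to be unique and unlikely to appear in the document. It does not obviously need to be derived from the content.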
That could be served just as well by generating a random string rather than a hash, if so?
Ex.
def _hash_text(s: str) -> str:
    return 'md5-' + sha256(SECRET_SALT + s.encode("utf-8")).hexdigest()[32:]
could be replaced by the much faster
import random

hex_digits = "0123456789abcdef"
def _hash_text(s: str) -> str:
    return 'md5-' + ''.join(random.choice(hex_digits) for _ in range(32))
for a quick fix.
(Estimate says this will make markdown conversion 2.5X faster)
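A rough way to check that estimate in isolation is to time both key generators on the same input. This is only a sketch: the SECRET_SALT value, the function names hash_key, random_key and compare, and the sample input are placeholders I made up, and the result will depend heavily on how large the hashed blocks are.

```python
import random
import timeit
from hashlib import sha256

SECRET_SALT = b"placeholder-salt"  # assumption: stand-in for the real salt
HEX_DIGITS = "0123456789abcdef"

def hash_key(s: str) -> str:
    """Content-derived key, shaped like the current _hash_text."""
    return "md5-" + sha256(SECRET_SALT + s.encode("utf-8")).hexdigest()[32:]

def random_key(s: str) -> str:
    """Random key with the same shape; the input is ignored."""
    return "md5-" + "".join(random.choice(HEX_DIGITS) for _ in range(32))

def compare(sample: str, number: int = 10_000) -> tuple:
    """Return (seconds for hash_key, seconds for random_key) over `number` calls."""
    t_hash = timeit.timeit(lambda: hash_key(sample), number=number)
    t_rand = timeit.timeit(lambda: random_key(sample), number=number)
    return t_hash, t_rand

if __name__ == "__main__":
    sample = "<div>" + "x" * 2000 + "</div>"  # placeholder block size
    t_hash, t_rand = compare(sample)
    print(f"hash_key:   {t_hash:.4f}s")
    print(f"random_key: {t_rand:.4f}s")
```

One behavioral difference to keep in mind: the hashed key is deterministic, so identical text always maps to the same key, while the random version produces a new key on every call. If any code relies on recomputing the key from the text, the random version would break it.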