My understanding of the HLL algorithm (which may be flawed, in which case please correct me and close this issue) is that for any fixed set of input values, the accuracy of any cardinality estimate from an HLL built from those values should increase as the "m" value (the number of registers) used in the HLL increases.
I.e.: if you build two HLL instances with different log2m settings, and add the exact same set of (raw) values to both, then the HLL with the larger log2m should give you more accurate results than the HLL with the smaller log2m setting.
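For concreteness, here is a minimal sketch of the kind of comparison I mean. This is not my actual test case; it assumes the java-hll API (`net.agkn.hll.HLL`), a fixed regwidth of 5, and values hashed with Guava's MurmurHash3 as in that project's README, and the log2m values 11 and 14 are arbitrary. Since the relative standard error of an HLL estimate is roughly 1.04/sqrt(m) with m = 2^log2m, I'd expect the log2m=14 instance to land closer to the true count than the log2m=11 one:

```java
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hashing;
import net.agkn.hll.HLL;

public class Log2mComparison {
    public static void main(String[] args) {
        // MurmurHash3 as suggested in the java-hll README; the seed is arbitrary.
        final HashFunction hash = Hashing.murmur3_128(123456);

        // Two HLLs that differ only in log2m (register width fixed at 5).
        final HLL smaller = new HLL(11 /* log2m */, 5 /* regwidth */);
        final HLL larger = new HLL(14 /* log2m */, 5 /* regwidth */);

        // Add the exact same set of raw (pre-hashed) values to both.
        final long trueCardinality = 1_000_000L;
        for (long i = 0; i < trueCardinality; i++) {
            final long raw = hash.hashLong(i).asLong();
            smaller.addRaw(raw);
            larger.addRaw(raw);
        }

        // Expectation: the log2m=14 estimate is closer to the true count,
        // since the relative standard error shrinks as ~1.04/sqrt(2^log2m).
        System.out.printf("true=%d  log2m=11 -> %d  log2m=14 -> %d%n",
                trueCardinality, smaller.cardinality(), larger.cardinality());
    }
}
```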
In my testing, however, I'm frequently encountering situations where the "smaller" HLL instance produces the more accurate cardinality estimate, which I can't explain.
I've created a reproducible test case that demonstrates the problem, which I will post as a separate comment.