Alloc uses per-thread context.
AllocLHeap uses per-heap context.
We should probably use only one way. Using per-heap context consistently is attractive, since we can make allocation contexts smaller by removing alloc_bytes_loh.
However, consider the GetAllocatedBytesForCurrentThread API: does it actually see allocations done via AllocLHeap? Should it?
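
To make the concern concrete, here is a toy C++ model of the two accounting paths. It is a sketch, not the runtime's actual code: the types AllocContext, Heap and Thread and the functions alloc_per_thread, alloc_per_heap, allocated_bytes_for_thread and run_example are hypothetical names invented for illustration. Only the alloc_bytes_loh field name comes from the discussion above. The model assumes that a per-thread query sums only the calling thread's own context, which is exactly the behavior the question above asks us to verify.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model only: names mirror the discussion, not the runtime's real layout.
struct AllocContext {
    std::uint64_t alloc_bytes = 0;      // bytes charged for small-object allocations
    std::uint64_t alloc_bytes_loh = 0;  // bytes charged for large-object allocations
};

struct Heap {
    AllocContext heap_context;  // hypothetical per-heap context, shared by all threads
};

struct Thread {
    AllocContext thread_context;  // per-thread context
};

// Models an Alloc-style path: the calling thread's context is charged.
void alloc_per_thread(Thread& t, std::size_t size, bool large) {
    if (large) {
        t.thread_context.alloc_bytes_loh += size;
    } else {
        t.thread_context.alloc_bytes += size;
    }
}

// Models an AllocLHeap-style path: the heap's context is charged instead.
void alloc_per_heap(Heap& h, std::size_t size) {
    h.heap_context.alloc_bytes_loh += size;
}

// Models a per-thread query (assumption: it reads only the thread's context).
std::uint64_t allocated_bytes_for_thread(const Thread& t) {
    return t.thread_context.alloc_bytes + t.thread_context.alloc_bytes_loh;
}

// Performs one allocation through each path and returns the per-thread total.
std::uint64_t run_example() {
    Heap heap;
    Thread thread;
    alloc_per_thread(thread, 64, /*large=*/false);
    alloc_per_heap(heap, 100000);
    // The large allocation was recorded, but only on the heap's context.
    assert(heap.heap_context.alloc_bytes_loh == 100000);
    return allocated_bytes_for_thread(thread);
}
```

In this model the 100000-byte allocation never shows up in the per-thread total. If the real API behaves the same way, moving everything to per-heap contexts would make the per-thread number meaningless unless the bytes are also attributed to the calling thread somewhere else.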