-
Notifications
You must be signed in to change notification settings - Fork 186
Description
I was curious if Set.fromDistinctAscList could be written to fuse with the input list, so I gave it a shot. Currently it looks like:
containers/containers/src/Data/Set/Internal.hs
Lines 1175 to 1190 in 67752b2
| fromDistinctAscList :: [a] -> Set a | |
| fromDistinctAscList [] = Tip | |
| fromDistinctAscList (x0 : xs0) = go (1::Int) (Bin 1 x0 Tip Tip) xs0 | |
| where | |
| go !_ t [] = t | |
| go s l (x : xs) = case create s xs of | |
| (r :*: ys) -> let !t' = link x l r | |
| in go (s `shiftL` 1) t' ys | |
| create !_ [] = (Tip :*: []) | |
| create s xs@(x : xs') | |
| | s == 1 = (Bin 1 x Tip Tip :*: xs') | |
| | otherwise = case create (s `shiftR` 1) xs of | |
| res@(_ :*: []) -> res | |
| (l :*: (y:ys)) -> case create (s `shiftR` 1) ys of | |
| (r :*: zs) -> (link y l r :*: zs) |
And here's what I got:
data SetPart a
= PartL !Int !(Set a)
| PartLM !Int !(Set a) !a
fromDistinctAscList :: [a] -> Set a
fromDistinctAscList = mergeParts . List.foldl' f []
where
f (PartL h l : parts) !x = PartLM h l x : parts
f parts0 x0 = mergeInto 0 (Bin 1 x0 Tip Tip) parts0
where
mergeInto h !r (PartLM h' l x : parts)
| h+1 == h' = mergeInto h' (link x l r) parts
mergeInto h l parts = PartL (h+1) l : parts
mergeParts = List.foldl' f' Tip where
f' r (PartL _ l) = merge l r
f' r (PartLM _ l x) = link x l r
{-# INLINE fromDistinctAscList #-}The idea is that we keep a stack of partially constructed sets as we go along the list, and merge them whenever we get the chance.
We can do a similar thing for Map too.
Now how does it compare to the original definition? Let's benchmark.
-- in benchmarks/Set.hs
, bench "fromDistinctAscList" $ whnf S.fromDistinctAscList elems -- elems = [1..2^12]
, bench "fromDistinctAscList2" $ whnf (\n -> S.fromDistinctAscList [1..n]) (2^12 :: Int) -- To test with fusionWith GHC 9.2.5:
Current:
fromDistinctAscList: OK (0.15s)
37.9 μs ± 3.1 μs, 159 KB allocated, 3.1 KB copied, 7.0 MB peak memory
fromDistinctAscList2: OK (0.12s)
58.7 μs ± 5.8 μs, 448 KB allocated, 12 KB copied, 7.0 MB peak memory
New:
fromDistinctAscList: OK (0.16s)
39.8 μs ± 3.1 μs, 263 KB allocated, 5.2 KB copied, 7.0 MB peak memory
fromDistinctAscList2: OK (0.15s)
34.6 μs ± 2.9 μs, 327 KB allocated, 8.4 KB copied, 7.0 MB peak memory
It's a lot better in the second case because it doesn't construct the list. In the first case, the time doesn't change but it does allocate more, so it's not a clear win.
So, what do you think about this definition? Is it worth changing?
I would guess fromDistinctAscList [a..b] is a common usage and would benefit from this change.
As an aside, I want to try the same thing with fromList, but this seemed simpler to try first.