
Conversation

NHDaly (Collaborator) commented Feb 14, 2022

Implements a MultiThreadedCache{K,V} and adds a stress test.

This cache has no locks on access, and only has contention on a cache
miss. It only ever holds a shared lock for a constant-time duration,
never while executing user code. A Task requesting a key that is
already being computed on another Task will block while that computation
is being performed. By taking advantage of the append-only properties of
a cache, the cache can be duplicated per-Thread, to avoid locking in the
common case.
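The fast-path/slow-path split described above could be sketched roughly like this (a minimal, hypothetical sketch with illustrative names, not the package's actual API; it also ignores task migration across threads, which a real implementation must handle):

```julia
# Hypothetical sketch: each thread keeps its own copy of the append-only
# cache, so the common-case lookup takes no lock at all. The shared lock
# guards only constant-time Dict operations, never user code.
mutable struct SketchCache{K,V}
    base_cache::Dict{K,V}             # shared source of truth; guarded by `lock`
    thread_caches::Vector{Dict{K,V}}  # one per thread; read/written lock-free
    lock::ReentrantLock
end

SketchCache{K,V}() where {K,V} = SketchCache{K,V}(
    Dict{K,V}(),
    [Dict{K,V}() for _ in 1:Threads.nthreads()],
    ReentrantLock(),
)

function Base.get!(f, cache::SketchCache{K,V}, key::K) where {K,V}
    tcache = cache.thread_caches[Threads.threadid()]
    # Fast path: no lock; this per-thread Dict is only touched by this thread.
    haskey(tcache, key) && return tcache[key]
    # Slow path: consult the shared cache under the lock (held only briefly).
    v = lock(cache.lock) do
        get(cache.base_cache, key, nothing)
    end
    if v === nothing
        v = f()  # compute outside the lock
        v = lock(cache.lock) do
            # Another task may have raced us to this key; keep the first value.
            get!(cache.base_cache, key, v)
        end
    end
    tcache[key] = v  # backfill the per-thread copy for future lock-free hits
    return v
end
```

Because the cache is append-only, a stale per-thread copy can only *miss*, never return a wrong value, which is what makes the lock-free fast path safe.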

This PR adds:

  • implementation
  • tests
  • CI configuration

NHDaly (Collaborator, Author) commented Feb 14, 2022

PR Review Request: @tveldhui, @vilterp, @Sacha0, @comnik ❤️

NHDaly and others added 4 commits February 14, 2022 12:53
Add constructor that provides pre-computed values to the base_cache
Rethrow the exception onto the Future for all blocked tasks, delete the
future, and rethrow on the current task.

The cache remains usable afterwards and nothing is recorded for the key
with the error.
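The error path described in that commit might be sketched like this (hypothetical code, not the package's actual implementation; the Future is modeled as a buffered Channel, and all names are illustrative): the computing task hands the exception to every blocked waiter and then rethrows on its own task, so nothing is recorded for the failing key and the cache stays usable.

```julia
# Hypothetical sketch of the exception-forwarding protocol.
# The computing task publishes either the value or the exception;
# waiters re-publish what they took so every blocked task sees it.
function compute_for_waiters!(f, chan::Channel)
    try
        v = f()
        put!(chan, v)   # success: publish the value for blocked waiters
        return v
    catch e
        put!(chan, e)   # failure: hand waiters the exception instead
        rethrow()       # ...and rethrow on the current task
    end
end

function wait_for_value(chan::Channel)
    x = take!(chan)
    put!(chan, x)                     # re-publish for any other waiters
    x isa Exception && throw(x)       # waiters rethrow the same exception
    return x
end
```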
The results are a bit lukewarm (glad we measured it), but look alright.

Here are a few runs, each with a different JULIA_NUM_THREADS:
```julia
┌ Info: benchmark results
│   Threads.nthreads() = 1
│   time_serial = 0.013336288
│   time_parallel = 0.09071632
└   time_baseline = 0.115363526
```
```julia
┌ Info: benchmark results
│   Threads.nthreads() = 2
│   time_serial = 0.011262138
│   time_parallel = 0.097021203
└   time_baseline = 0.139031655
```
```julia
┌ Info: benchmark results
│   Threads.nthreads() = 20
│   time_serial = 0.011997677
│   time_parallel = 0.658225544
└   time_baseline = 1.032283809
```
```julia
┌ Info: benchmark results
│   Threads.nthreads() = 100
│   time_serial = 0.013902211
│   time_parallel = 1.999772731
└   time_baseline = 9.314424419
```

So it definitely does not scale as well as a single-threaded codebase
would, but it also definitely scales better than the baseline, which is
a Mutex around a Dict.
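For context, the baseline in these numbers is roughly a single lock held around a Dict for the entire get! call, including while the value is computed. A sketch (illustrative names, not the actual benchmark code):

```julia
# Hypothetical sketch of the baseline: one lock around one Dict,
# held even while the user's compute function runs.
const baseline_lock = ReentrantLock()
const baseline_dict = Dict{Int,Int}()

function baseline_get!(f, key)
    lock(baseline_lock) do
        get!(f, baseline_dict, key)  # lock held while f() runs, too
    end
end

# Rough shape of the parallel timing above: n tasks hammering the cache.
function time_parallel_baseline(n)
    @elapsed begin
        @sync for i in 1:n
            Threads.@spawn baseline_get!(() -> i^2, i % 100)
        end
    end
end
```

Holding the lock across the compute function is what serializes the baseline at high thread counts, which matches the 100-thread numbers above.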

Maybe there will be some places to reduce contention in the future :)
Gracefully handle exceptions thrown during `get!()` functions
NHDaly (Collaborator, Author) commented Feb 15, 2022

Also, I'm open to other package naming suggestions if anyone has them! :)

vilterp (Collaborator) left a comment

LGTM with a couple questions

NHDaly and others added 7 commits February 15, 2022 12:11
Add benchmark test measuring parallel scaling.
Fix lazy construction of Dicts, per guidance from Julia Base
...... In retrospect, this does make this whole structure start to look
a lot like a Dict + a Read/Write lock, and I wonder how their
performance would compare......
NHDaly (Collaborator, Author) commented Feb 17, 2022

After all the latest changes, especially #6 for the concurrency fixes, I think this package is good to go.

It still scales a good bit better than a Dict + Mutex, so I think there's still value in it!

If we want to give another approach a shot in the future, like a concurrent hash table, I'm super supportive. But hopefully this is useful in the interim.
