-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Don't call dbuf_evict_one from dbuf_evict_notify for kswapd #17561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like a concept of "reduce the chance of this deadlock" -- it there is a deadlock, it must be fixed reliably. Also exposing internal kitchen of ARC to a whole public for that it not a clean way ether. What I think should be investigated, is dbuf_evict_one()
calls from random dbuf_evict_notify()
contexts, since it is not strictly required and is only a performance optimization.
98bdc89
to
0b3c168
Compare
That makes sense to me. Ideally, I wanted the check higher up in in arc_buf_destroy to not block if the hashlock is locked, but at that point, we've already changed state in other method calls in the stack and clean up becomes a bit tricky...
Can you help me understand this a bit more? From what I can tell, dbuf_evict_notify() will wake up the eviction thread and call dbuf_evict_one() if the dbuf cache is at the high water mark. We also saw this eviction thread stuck trying to obtain the lock too - so just trying to find a way to unstuck these threads to not run into this in the future
|
Yes, the dbuf eviction thread may stuck until the hash lock is released, but it shouldn't block anything else, since as I see it is carefully written to not hold any locks at that point. And dbuf cache size should be small enough to not matter that much if it is not evicting for some time. |
Other options that we considered: Would appreciate thoughts on this: db_mtx lock is held when we check if the hash lock is owned. Is it possible between the time frame " of checking the hash lock and getting the hash_lock in arc_buf_destroy " for a new arc_read to get the hash_lock. If it can, then that is the open deadlock window. If removing calls to dbuf_evict from dbuf_rele_and_unlock (and any other similar places) is the option you are proposing - it sounds like a reasonable solution - we can explore that - at least it will prevent kswapd from getting stuck and hard hanging the system. |
The same ARC buffer can be used by several dbufs, and the same hash lock can be used by several ARC buffers. So the idea of checking and later taking makes little sense to me, since does not ensure anything.
That is not what I have said. Read again. What I am saying is that there are two ways to evict dbuf cache: the dedicated thread and any other helpers. The first one is required, the second are only a performance optimization, which you can skip if called from kswapd. |
0b3c168
to
8078c5c
Compare
@amotin pushed a new change based on your comments and updated the PR description - let me know what you think, thanks |
@khoang98 This is kind of what I've meant, except |
8078c5c
to
c248c07
Compare
@amotin updated to limit this change for Linux only |
c248c07
to
845722f
Compare
@amotin moved the check to dbuf.h to avoid the definition checking in c file |
845722f
to
8ff19b8
Compare
@amotin made the check more generic with OS-specific implementations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think this is the right way to do it.
4831fad
to
bec99b7
Compare
Avoid calling dbuf_evict_one() from memory reclaim contexts (e.g. Linux kswapd, FreeBSD pagedaemon). This prevents deadlock caused by reclaim threads waiting for the dbuf hash lock in the call sequence: dbuf_evict_one -> dbuf_destroy -> arc_buf_destroy Signed-off-by: Kaitlin Hoang <[email protected]>
bec99b7
to
dde05bd
Compare
Motivation and Context
In a high memory pressure environment using Lustre 2.15 + ZFS 2.1.7, we see a state where arc_read is holding the hash lock and waiting for a page of memory while kswapd is trying to evict a dbuf waiting on the same hash lock.
This change reduces the chance of this deadlock happening by checking if the hash lock is taken before trying to evict.This proposed change removes the call to dbuf_evict_one() specifically when it's running in the kswapd context. This change aims to reduce unnecessary work and hash lock contention during high memory pressure situations, while still maintaining the performance benefit of dbuf eviction in normal operational contexts.
Thread 1
Thread 2
Description
This change should help alleviate the blocking issue we're seeing with kswapd, as it prevents additional cache eviction work from being performed when the system is already under memory pressure.
How Has This Been Tested?
Tested running various IOR tests with Lustre file system and did not see any drop in performance.
Types of changes
Checklist:
Signed-off-by
.