mm: Unconditional per-VMA locks and cleanups#8040
Open
kernel-patches-daemon-bpf-rc[bot] wants to merge 5 commits into
Open
mm: Unconditional per-VMA locks and cleanups#8040kernel-patches-daemon-bpf-rc[bot] wants to merge 5 commits into
kernel-patches-daemon-bpf-rc[bot] wants to merge 5 commits into
Conversation
Author
|
Upstream branch: 2e8ad1f |
dcffd40 to
8b7f742
Compare
The per-VMA locks have been around for several years. They've had some bugs worked out of them and have seen quite wide use. However, they are still only available when architectures explicitly enable them. Remove the conditional compilation around the per-VMA locks, making them available on all architectures and configs. The approach up to now seemed to be to add ARCH_SUPPORTS_PER_VMA_LOCK when the architecture started using per-VMA locks in the fault handler. But, contrary to the naming, the Kconfig option does not really indicate whether the architecture supports per-VMA locks or not. It is more of a marker for whether the architecture is likely to benefit from per-VMA locks. To me, the most important thing side-effect of universal availability is letting per-VMA locks be used in SMP=n configs. This lets us use per-VMA locking in all x86 code without fallbacks. Overall, this just generally makes the kernel simpler. Just look at the diffstat. It also opens the door to users that want to use the per-VMA locks in common code. Doing *that* brings additional simplifications. The downside of this is adding some fields to vm_area_struct and mm_struct. There are likely ways to optimize this, especially for things like SMP=n configs. For now, do the simplest thing: use the same implementation everywhere. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: linux-mm@kvack.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Arve Hjønnevåg <arve@android.com> Cc: Todd Kjos <tkjos@android.com> Cc: Christian Brauner <christian@brauner.io> Cc: Carlos Llamas <cmllamas@google.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Ahern <dsahern@kernel.org> Cc: netdev@vger.kernel.org
tl;dr: lock_vma_under_rcu() is already a trylock. No need to do both it and mmap_read_trylock(). Long Version: == Background == Historically, binder used an mmap_read_trylock() in its shrinker code. This ensures that reclaim is not blocked on an mmap_lock. Commit 95bc2d4 ("binder: use per-vma lock in page reclaiming") added support for the per-VMA lock, but left mmap_read_trylock() as a fallback. This was presumably because the per-VMA locking can fail for several reasons and most (all?) lock_vma_under_rcu() callers have a fallback to mmap_read_trylock(). == Problem == The fallback is not worth the complexity here. lock_vma_under_rcu() is essentially already a non-blocking trylock. The main reason it fails is also the reason mmap_read_trylock() fails: something is holding mmap_write_lock(). The only remedy for a collision with mmap_write_lock() is to wait, which this code can not do. So the "fallback" after lock_vma_under_rcu() failure is not really a fallback: it is really likely to just be retrying in vain. That retry in an of itself isn't horrible. But it adds complexity. == Solution == Now that per-VMA locks are universally available, lock_vma_under_rcu() will not persistently fail. Rely on it alone and simplify the code. Full disclosure: I originally tried to do this with lock_vma_under_rcu_wait(), but it did not fit well with the mmap_lock trylock semantics. Claude caught this in a review and suggested the approach in this path. It seemed sane to me. So, Suggesed-by: Claude, I guess. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Lorenzo Stoakes <ljs@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: linux-mm@kvack.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Arve Hjønnevåg <arve@android.com> Cc: Todd Kjos <tkjos@android.com> Cc: Christian Brauner <christian@brauner.io> Cc: Carlos Llamas <cmllamas@google.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Ahern <dsahern@kernel.org> Cc: netdev@vger.kernel.org
== Background ==
There are basically two parallel ways to look up a VMA: the
traditional way, which is protected by mmap_lock, and the RCU-based
per-VMA lock way which is based on RCU and refcounts.
== Problem ==
The mmap_lock one is more straightforward to use but it has a big
disadvantage in that it can not be mixed with page faults since those
can take mmap_lock for read, which can deadlock when mixed with page
faults. For example:
mmap_read_lock(mm);
// Another thread does mmap_write_lock().
// New mmap_lock readers are blocked.
vma = vma_lookup(mm, address);
// This deadlocks on mmap_read_lock() if it faults:
copy_from_user(address);
mmap_read_unlock(mm);
The RCU one can be mixed with faults, but it is not available in all
configs, so all RCU users need to be able to fall back to the
traditional way.
== Solution ==
Add a variant of the RCU-based lookup that waits for writers. This is
basically the same as the existing RCU-based lookup, but it also takes
mmap_lock for read and waits for writers to finish before returning
the VMA. This has some advantages:
1. Callers do not need to have a fallback path for when they
collide with writers.
2. It can be used in contexts where page faults can happen because
it can take the mmap_lock for read but never *holds* it.
3. Its fast path does not require taking mmap_lock for read.
Basically, when applied correctly, this approach results in faster
*and* simpler code.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: linux-mm@kvack.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Carlos Llamas <cmllamas@google.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Ahern <dsahern@kernel.org>
Cc: netdev@vger.kernel.org
Previously, the per-VMA locking could fail in the face of writers which necessitate a fallback to mmap_lock. The new vma_start_read_unlocked() will wait for writers instead of failing. Use the new helper. Wait for writers. Remove the fallback to mmap_lock. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Lorenzo Stoakes <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: linux-mm@kvack.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Arve Hjønnevåg <arve@android.com> Cc: Todd Kjos <tkjos@android.com> Cc: Christian Brauner <christian@brauner.io> Cc: Carlos Llamas <cmllamas@google.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Ahern <dsahern@kernel.org> Cc: netdev@vger.kernel.org
Author
|
Upstream branch: 30dee2c |
Previously, the per-VMA locking could fail in the face of writers which necessitates a fallback to mmap_lock. The new lock_vma_under_rcu_wait() will wait for writers instead of failing. Use the new helper. Wait for writers. Remove the fallback to mmap_lock. This really is a nice cleanup. It removes the need to pass the lock state back and forth to find_tcp_vma(). Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Lorenzo Stoakes <ljs@kernel.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: linux-mm@kvack.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Arve Hjønnevåg <arve@android.com> Cc: Todd Kjos <tkjos@android.com> Cc: Christian Brauner <christian@brauner.io> Cc: Carlos Llamas <cmllamas@google.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Ahern <dsahern@kernel.org> Cc: netdev@vger.kernel.org
2d1b7fd to
1000f78
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pull request for series with
subject: mm: Unconditional per-VMA locks and cleanups
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1109563