Skip to content

Conversation

@puranjaymohan
Copy link
Contributor

This set allows arena kfuncs to be called from non-sleepable contexts.

@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 2 times, most recently from 5125528 to 1e2d874 Compare October 30, 2025 01:28
The arm64 JIT supports BPF_ST with BPF_PROBE_MEM32 (arena) by using the
tmp2 register to hold the dst + arena_vm_base value and using tmp2 as the
new dst register. But this is broken because in case is_lsi_offset()
returns false the tmp2 will be clobbered by emit_a64_mov_i(1, tmp2, off,
ctx); and hence the emitted store instruction will be of the form:

	strb    w10, [x11, x11]

Fix this by using the third temporary register to hold the dst +
arena_vm_base.

Fixes: 339af57 ("bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions.")
Signed-off-by: Puranjay Mohan <[email protected]>
vm_area_map_pages() may allocate memory while inserting pages into bpf
arena's vm_area. In order to make bpf_arena_alloc_pages() kfunc
non-sleepable change bpf arena to populate pages without
allocating memory:
- at arena creation time populate all page table levels except
  the last level
- when new pages need to be inserted call apply_to_page_range() again
  with apply_range_set_cb() which will only set_pte_at() those pages and
  will not allocate memory.
- when freeing pages call apply_to_existing_page_range with
  apply_range_clear_cb() to clear the pte for the page to be removed. This
  doesn't free intermediate page table levels.

Signed-off-by: Puranjay Mohan <[email protected]>
To make arena_alloc_pages() safe to be called from any context, the
kvcalloc() has to be replaced with kmalloc_nolock() so as it doesn't sleep
or take any locks.

Replace __free_page() with free_pages_nolock() that can be called while
holding raw_spin_lock or from IRQ and NMI for any page type (not only
those that came from alloc_pages_nolock)

Signed-off-by: Puranjay Mohan <[email protected]>
Make arena_free_pages() any context safe by deferring the main logic to
a workqueue if it is called from a non-sleepable context.

specialize_kfunc() is used to replace the sleepable arena_free_pages()
with bpf_arena_free_pages_non_sleepable() when the verifier detects the
call is from a non-sleepable context.

In the non-sleepable case, arena_free_pages() queues the address and the
page count to be freed to a lock-less list of struct arena_free_spans
and raises an irq_work. The irq_work handler calls schedules_work() as
it is safe to be called from irq context.  arena_free_worker() (the work
queue handler) iterates these spans and clears ptes, flushes tlb, zaps
pages, and calls __free_page().

Signed-off-by: Puranjay Mohan <[email protected]>
As arena kfuncs can now be called from non-sleepable contexts, test this
by adding non-sleepable copies of tests in verifier_arena, this is done
by using a socket program instead of syscall.

Augment the arena_list selftest to also run in non-sleepable context by
taking rcu_read_lock.

Signed-off-by: Puranjay Mohan <[email protected]>
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 7 times, most recently from de0745f to efe6edf Compare November 6, 2025 01:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant