bpf: Introduce global percpu data#12395

Open
kernel-patches-daemon-bpf[bot] wants to merge 9 commits into bpf-next_base from series/1107932=>bpf-next

@kernel-patches-daemon-bpf

Pull request for series with
subject: bpf: Introduce global percpu data
version: 5
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932

@kernel-patches-daemon-bpf

Upstream branch: 50dff00
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932
version: 5

@kernel-patches-review-bot

commit af562e49f6bb5fc55f01e80ac7a89327ad0f3bb3
Author: Leon Hwang <leon.hwang@linux.dev>

bpftool: Generate skeleton for global percpu data

This patch teaches bpftool to emit skeleton code for global percpu
variables, adding a nested per-object percpu structure and handling the
new .percpu data section in the generated open/load/destroy paths.

> diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
> index 6ae7262ebe0c..e0c5c643fe9e 100644
> --- a/tools/bpf/bpftool/gen.c
> +++ b/tools/bpf/bpftool/gen.c

[ ... ]

> @@ -254,6 +254,11 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
>  	return NULL;
>  }
>
> +static bool bpf_map_is_percpu_data(const struct bpf_map *map)
> +{
> +	return bpf_map__is_internal(map) && bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY;
> +}
> +

[ ... ]

> @@ -847,7 +851,7 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h
>  	bpf_object__for_each_map(map, obj) {
>  		const char *mmap_flags;
>
> -		if (!is_mmapable_map(map, ident, sizeof(ident)))
> +		if (!is_mmapable_map(map, ident, sizeof(ident)) || bpf_map_is_percpu_data(map))
>  			continue;

By skipping the .percpu map here during light skeleton generation in
gen_trace(), is the user-space buffer for .percpu left writable?

If a user modifies skel->percpu after the light skeleton is loaded, will
that silently succeed without actually updating the kernel per-cpu map?

The regular skeleton path appears protected, since libbpf mprotect()s the
percpu mmaped buffer read-only at load in bpf_object__populate_internal_map().
For the light skeleton, gen_trace() skips the .percpu map, so its initial
value buffer is never finalized with skel_finalize_map_data() and never
mprotect()'d to PROT_READ, leaving it writable.

Does the generated light skeleton need an explicit mprotect(..., PROT_READ)
on the .percpu initial value buffer to enforce the read-only constraint
described in the commit message?

This was raised on v4 (https://lore.kernel.org/bpf/20260414212650.4F89BC19425@smtp.kernel.org/)
where the response was that the mprotect(..., PROT_READ) added in patch #4
should be moved into this patch so it isn't missed. The v5 code here looks
unchanged from v4, and the mprotect does not appear to have been moved in.
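
The read-only constraint in question can be illustrated without libbpf. The following is a minimal user-space sketch, assuming Linux, of what mprotect(..., PROT_READ) enforces on an mmap()ed buffer. The helper names write_faults(), alloc_initial_value(), and seal_initial_value() are hypothetical and come from neither libbpf, bpftool, nor the patch; the anonymous page only stands in for a skeleton's .percpu initial-value buffer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to store to *buf in a forked child and report
 * whether the child was killed by SIGSEGV (1), exited normally (0), or the
 * probe itself failed (-1). Forking keeps the fault out of the caller.
 */
int write_faults(volatile int *buf)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		*buf = 42;	/* faults if the page is PROT_READ */
		_exit(0);
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
		return 1;
	return WIFEXITED(status) ? 0 : -1;
}

/*
 * Hypothetical stand-in for a skeleton's .percpu initial-value buffer:
 * one anonymous, writable page. Returns NULL on failure.
 */
int *alloc_initial_value(void)
{
	void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE),
		       PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Mark the buffer read-only, as one would once the map is populated. */
int seal_initial_value(int *buf)
{
	return mprotect(buf, (size_t)sysconf(_SC_PAGESIZE), PROT_READ);
}
```

Before seal_initial_value() is called, write_faults() on the buffer returns 0, meaning the store silently succeeds. After sealing, it returns 1 while reads still work. A buffer that is never sealed stays in the first state, which is the behavior the questions above are about.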


AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: bpftool: Generate skeleton for global percpu data
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/27147120330

@kernel-patches-review-bot

commit 5b7279c1a85c0a8f7dd35d670cf03f0a0256b5ad
Author: Leon Hwang <leon.hwang@linux.dev>

bpf: Introduce global percpu data

This patch adds global percpu data support, letting BPF programs define
percpu variables via SEC(".percpu") backed by a single-entry
BPF_MAP_TYPE_PERCPU_ARRAY.  It wires up map_direct_value_addr/meta for
percpu arrays and patches ld_imm64 of a percpu map value into a percpu
address mov at JIT fixup time.

Links:
[1] https://github.com/anakryiko/retsnoop
[2] https://github.com/bpfsnoop/bpfsnoop

> diff --git a/kernel/bpf/const_fold.c b/kernel/bpf/const_fold.c
> index b2a19acadb91..5787246bef30 100644
> --- a/kernel/bpf/const_fold.c
> +++ b/kernel/bpf/const_fold.c
> @@ -182,7 +182,6 @@ static void const_reg_xfer(struct bpf_verifier_env *env, struct const_arg_info *
>  		u64 val = 0;
>
>  		if (!bpf_map_is_rdonly(map) || !map->ops->map_direct_value_addr ||
> -		    map->map_type == BPF_MAP_TYPE_INSN_ARRAY ||
>  		    off < 0 || off + size > map->value_size ||
>  		    bpf_map_direct_read(map, off, size, &val, is_ldsx)) {
>  			*dst = unknown;

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 2690d063a240..eecd9ab82e91 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5556,6 +5556,8 @@ int bpf_map_direct_read(struct bpf_map *map, int off, int size, u64 *val,
>  	u64 addr;
>  	int err;
>
> +	if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY || map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
> +		return -EINVAL;
>  	err = map->ops->map_direct_value_addr(map, &addr, off);

Here bpf_map_direct_read() now refuses both insn arrays and percpu arrays,
which makes sense since a percpu value address cannot be constant-folded
at verification time.

The const_fold.c hunk above was adjusted to cope with this: it dropped its
own BPF_MAP_TYPE_INSN_ARRAY pre-check and instead lets a bpf_map_direct_read()
failure fall through to "*dst = unknown".

Is the equivalent read path in check_mem_access() missing the same treatment?

kernel/bpf/verifier.c:check_mem_access() {
    ...
    /*
     * If map is read-only, track its contents as scalars,
     * unless it is an insn array (see the special case below)
     */
    if (tnum_is_const(reg->var_off) &&
        bpf_map_is_rdonly(map) &&
        map->ops->map_direct_value_addr &&
        map->map_type != BPF_MAP_TYPE_INSN_ARRAY) {
        int map_off = off + reg->var_off.value;
        u64 val = 0;

        err = bpf_map_direct_read(map, map_off, size, &val, is_ldsx);
        if (err)
            return err;
    ...
}

Since percpu_array_map_ops now sets map_direct_value_addr, a frozen percpu
array created with BPF_F_RDONLY_PROG satisfies this condition (it is rdonly,
has map_direct_value_addr, and is not an insn array).  It enters the branch,
calls bpf_map_direct_read(), gets the new -EINVAL, and returns it.

Does that mean a plain LDX from such a read-only percpu value is now rejected
at load time with a bare -EINVAL and no verbose() message, rather than being
tracked as an unknown scalar the way the const_fold.c path handles it (and the
way insn arrays are explicitly handled in the else-if below)?

Should this branch also exclude BPF_MAP_TYPE_PERCPU_ARRAY, so the load falls
through to mark_reg_unknown() instead of failing?
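
The difference between the two outcomes can be sketched as a toy model. This is not kernel code: the toy_* types and functions are hypothetical, the model assumes the register offset is already constant (the tnum_is_const() condition), and it only mirrors the conditions quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy map types; the names echo the kernel enum but are not kernel code. */
enum toy_map_type {
	TOY_MAP_TYPE_ARRAY,
	TOY_MAP_TYPE_INSN_ARRAY,
	TOY_MAP_TYPE_PERCPU_ARRAY,
};

struct toy_map {
	enum toy_map_type type;
	bool rdonly;			/* frozen and BPF_F_RDONLY_PROG */
	bool has_direct_value_addr;	/* ops->map_direct_value_addr is set */
};

enum toy_load_result {
	TOY_LOAD_CONST_SCALAR,		/* contents tracked as a known scalar */
	TOY_LOAD_UNKNOWN_SCALAR,	/* mark_reg_unknown()-style fallback */
	TOY_LOAD_REJECTED,		/* -EINVAL returned to the loader */
};

/* Models bpf_map_direct_read() after the patch: refuses these two types. */
bool toy_direct_read_ok(const struct toy_map *map)
{
	return map->type != TOY_MAP_TYPE_INSN_ARRAY &&
	       map->type != TOY_MAP_TYPE_PERCPU_ARRAY;
}

/* Models the branch as quoted: only insn arrays are excluded up front. */
enum toy_load_result toy_load_as_posted(const struct toy_map *map)
{
	if (map->rdonly && map->has_direct_value_addr &&
	    map->type != TOY_MAP_TYPE_INSN_ARRAY)
		return toy_direct_read_ok(map) ? TOY_LOAD_CONST_SCALAR
					       : TOY_LOAD_REJECTED;
	return TOY_LOAD_UNKNOWN_SCALAR;
}

/* Models the suggested change: percpu arrays are excluded as well. */
enum toy_load_result toy_load_as_suggested(const struct toy_map *map)
{
	if (map->rdonly && map->has_direct_value_addr &&
	    map->type != TOY_MAP_TYPE_INSN_ARRAY &&
	    map->type != TOY_MAP_TYPE_PERCPU_ARRAY)
		return toy_direct_read_ok(map) ? TOY_LOAD_CONST_SCALAR
					       : TOY_LOAD_REJECTED;
	return TOY_LOAD_UNKNOWN_SCALAR;
}
```

For a read-only percpu array with map_direct_value_addr set, toy_load_as_posted() yields TOY_LOAD_REJECTED and toy_load_as_suggested() yields TOY_LOAD_UNKNOWN_SCALAR. A read-only plain array is tracked as a constant scalar in both variants.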



AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: bpf: Introduce global percpu data
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/27147120330

@kernel-patches-daemon-bpf

Upstream branch: b9452b5
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932
version: 5

@kernel-patches-daemon-bpf

Upstream branch: dd0f968
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932
version: 5

@kernel-patches-daemon-bpf

Upstream branch: f1a660b
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932
version: 5

@kernel-patches-daemon-bpf

Upstream branch: 68f4e48
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932
version: 5

@kernel-patches-daemon-bpf

Upstream branch: c15261b
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932
version: 5

@kernel-patches-daemon-bpf

Upstream branch: 140fa23
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932
version: 5

@kernel-patches-daemon-bpf

Upstream branch: 2e8ad1f
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932
version: 5

Asphaltt added 5 commits June 10, 2026 16:26
There are many adjacent blank lines in the verifier that have accumulated
over time.

Drop them for cleanup.

No functional changes intended.

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>

Introduce global percpu data, inspired by the commit
6316f78 ("Merge branch 'support-global-data'"). It enables the
definition of global percpu variables in BPF, similar to the
include/linux/percpu-defs.h::DEFINE_PER_CPU() macro.

For example, a BPF program can define a global percpu variable like:

int data SEC(".percpu");

With this patch, tools like retsnoop [1] and bpfsnoop [2] can simplify
their BPF code for handling LBRs. The code can be updated from

static struct perf_branch_entry lbrs[1][MAX_LBR_ENTRIES] SEC(".data.lbrs");

to

static struct perf_branch_entry lbrs[MAX_LBR_ENTRIES] SEC(".percpu.lbrs");

This eliminates the need to retrieve the CPU ID using the
bpf_get_smp_processor_id() helper.

Additionally, reusing a global percpu data map makes it simpler to share
information between tail callers and callees, or between freplace callers
and callees, than reusing percpu_array maps.

Links:
[1] https://github.com/anakryiko/retsnoop
[2] https://github.com/bpfsnoop/bpfsnoop

Assisted-by: Codex:gpt-5.5-xhigh
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>

libbpf needs a reliable way to distinguish kernels that can support
global percpu data from those that cannot.

Add a dedicated feature probe, so libbpf can make capability decisions
early and fail predictably when global percpu data is unavailable.

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>

Add support for global percpu data in libbpf by adding a new ".percpu"
section, similar to ".data". It enables efficient handling of percpu
global variables in bpf programs.

When generating the loader for a lightweight skeleton, update the
percpu_array map used for global percpu data with BPF_F_ALL_CPUS, so that
a single value slot updates the values on all CPUs.

Unlike global data, the mmaped data for global percpu data is marked
read-only after the percpu_array map is populated. After loading the prog,
users can still read the initialized percpu data. To update the percpu
data after loading, they have to update the percpu_array map using key=0
instead.

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
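
To make the post-load update path concrete, here is a minimal sketch of how a user-space value buffer for a conventional per-CPU map update is usually laid out: one slot per possible CPU, each slot rounded up to 8 bytes. The helper names percpu_stride(), percpu_buf_size(), and percpu_buf_fill() are hypothetical and are not libbpf APIs; the sketch only builds the buffer and makes no BPF syscall.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stride of one CPU's slot: value_size rounded up to a multiple of 8. */
size_t percpu_stride(size_t value_size)
{
	return (value_size + 7) & ~(size_t)7;
}

/* Total buffer size when every possible CPU gets its own slot. */
size_t percpu_buf_size(size_t value_size, int ncpus)
{
	return percpu_stride(value_size) * (size_t)ncpus;
}

/*
 * Build a zero-padded buffer that carries the same value for every CPU.
 * Returns NULL on bad input or allocation failure; the caller frees it.
 */
void *percpu_buf_fill(const void *value, size_t value_size, int ncpus)
{
	size_t stride = percpu_stride(value_size);
	char *buf;
	int cpu;

	if (!value || !value_size || ncpus <= 0)
		return NULL;
	buf = calloc((size_t)ncpus, stride);
	if (!buf)
		return NULL;
	for (cpu = 0; cpu < ncpus; cpu++)
		memcpy(buf + (size_t)cpu * stride, value, value_size);
	return buf;
}
```

With libbpf, such a buffer would typically be sized using libbpf_num_possible_cpus() and passed to bpf_map__update_elem() with key 0. The BPF_F_ALL_CPUS path described above avoids this replication by taking a single value slot.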
Enhance bpftool to generate skeletons that properly handle global percpu
variables. The generated skeleton now includes a dedicated structure for
percpu data, allowing users to initialize and access percpu variables more
efficiently.

For global percpu variables, the skeleton now includes a nested
structure, e.g.:

struct test_global_percpu_data {
	struct bpf_object_skeleton *skeleton;
	struct bpf_object *obj;
	struct {
		struct bpf_map *percpu;
	} maps;
	// ...
	struct test_global_percpu_data__percpu {
		int data;
		char run;
		struct {
			char set;
			int i;
			int nums[7];
		} struct_data;
		int nums[7];
	} *percpu;

	// ...
};

  * The "struct test_global_percpu_data__percpu *percpu" points to
    initialized data, which is actually "maps.percpu->mmaped".
  * Before loading the skeleton, updating the
    "struct test_global_percpu_data__percpu *percpu" modifies the initial
    value of the corresponding global percpu variables.
  * After loading the skeleton, "maps.percpu->mmaped" has been marked as
    read-only in libbpf. If users want to update the global percpu
    variables, they have to update the "maps.percpu" map instead.

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>

@kernel-patches-daemon-bpf

Upstream branch: 30dee2c
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1107932
version: 5

Asphaltt added 4 commits June 10, 2026 16:26
On architectures that do not support the percpu insn, such as s390x, these
cases skip the global percpu data tests by checking for FEAT_PERCPU_DATA
support.

The following APIs have been tested for global percpu data:

1. bpf_map__set_initial_value()
2. bpf_map__initial_value()
3. generated percpu struct pointer pointing to internal map's mmaped data
4. bpf_map__lookup_elem() for global percpu data map

At the same time, the case is also tested with 'bpftool gen skeleton -L'.

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>

Add two tests to verify the verifier log
"R%d points to percpu_array map which cannot be used as const string\n".

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>

Add a test to verify global percpu data related xlated insns:

1. ld_imm64: compare xlated one with the one in ELF object file.
2. mov64_percpu_reg: it is added by verifier.

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
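
The two-insn sequence checked by this test can be pictured with a toy user-space model. It assumes that ld_imm64 supplies the variable's location within the percpu data and that mov64_percpu_reg resolves it to the current CPU's copy. All toy_* names, the CPU count, and the unit size are hypothetical; nothing here is kernel or verifier code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NR_CPUS	4	/* hypothetical CPU count */
#define TOY_UNIT_SIZE	64	/* hypothetical size of one CPU's percpu unit */

/* One private copy of the percpu unit per CPU. */
static char toy_percpu_area[TOY_NR_CPUS][TOY_UNIT_SIZE];

/*
 * "off" plays the role of the value loaded by ld_imm64; adding the CPU's
 * base plays the role of mov64_percpu_reg. Returns NULL if cpu or off is
 * out of range.
 */
void *toy_resolve(size_t off, int cpu)
{
	if (cpu < 0 || cpu >= TOY_NR_CPUS || off >= TOY_UNIT_SIZE)
		return NULL;
	return &toy_percpu_area[cpu][off];
}

/* Store an int into the given CPU's copy; returns 0 on success, -1 on error. */
int toy_percpu_write(size_t off, int cpu, int val)
{
	void *p = toy_resolve(off, cpu);

	if (!p || off + sizeof(val) > TOY_UNIT_SIZE)
		return -1;
	memcpy(p, &val, sizeof(val));
	return 0;
}

/* Load an int from the given CPU's copy; returns 0 on success, -1 on error. */
int toy_percpu_read(size_t off, int cpu, int *val)
{
	void *p = toy_resolve(off, cpu);

	if (!p || !val || off + sizeof(*val) > TOY_UNIT_SIZE)
		return -1;
	memcpy(val, p, sizeof(*val));
	return 0;
}
```

The same offset resolves to a different address on each CPU, so a write on CPU 0 is not visible from CPU 1. That is the property that lets a SEC(".percpu") variable be used without an explicit CPU index.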

Add a test to verify that it is OK to iterate over the percpu_array map
used for global percpu data.

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>