
4

eBPF Talk: bpf2bpf & tailcall 报错分析
source link: https://asphaltt.github.io/post/ebpf-talk-35-bpf2bpf-and-tailcall-error/
eBPF Talk: bpf2bpf & tailcall 报错分析
2023-05-23
约 864 字
预计阅读 2 分钟
7 次阅读
终于要将 bpf2bpf
& tailcall
的组合落地到项目中了。
不过,当头一棒的是:tail_calls are not allowed when call stack of previous frames is 256 bytes. Too large
。
幸好学习过 eBPF Talk: tailcall on x86,知道该报错出在哪里:
// ${KERNEL}/kernel/bpf/verifier.c
/* starting from main bpf function walk all instructions of the function
* and recursively walk all callees that given function can call.
* Ignore jump and exit insns.
* Since recursion is prevented by check_cfg() this algorithm
* only needs a local stack of MAX_CALL_FRAMES to remember callsites
*/
static int check_max_stack_depth(struct bpf_verifier_env *env)
{
// ...
process_func:
/* protect against potential stack overflow that might happen when
* bpf2bpf calls get combined with tailcalls. Limit the caller's stack
* depth for such case down to 256 so that the worst case scenario
* would result in 8k stack size (32 which is tailcall limit * 256 =
* 8k).
*
* To get the idea what might happen, see an example:
* func1 -> sub rsp, 128
* subfunc1 -> sub rsp, 256
* tailcall1 -> add rsp, 256
* func2 -> sub rsp, 192 (total stack size = 128 + 192 = 320)
* subfunc2 -> sub rsp, 64
* subfunc22 -> sub rsp, 128
* tailcall2 -> add rsp, 128
* func3 -> sub rsp, 32 (total stack size 128 + 192 + 64 + 32 = 416)
*
* tailcall will unwind the current stack frame but it will not get rid
* of caller's stack as shown on the example above.
*/
if (idx && subprog[idx].has_tail_call && depth >= 256) {
verbose(env,
"tail_calls are not allowed when call stack of previous frames is %d bytes. Too large\n",
depth);
return -EACCES;
}
// ...
}
就能够明白,报错是因为 bpf prog 消耗的栈空间不小于 256 字节。
该怎么解决呢?
解决办法不就是减小栈空间的使用量嘛。于是,我便分析了一下栈空间的消耗:
- 有个
struct tunnel
,消耗 44 字节。 - 有个
struct sflow
,消耗 48 字节。 - 有个
bpf_printk()
,消耗 77 字节。
而栈深度的计算方式是向上取整(32 的倍数),超过 224 字节就取 256 字节。
// ${KERNEL}/kernel/bpf/core.c
#define PROG_NAME_LIST(stack_size) PROG_NAME_ARGS(stack_size),
static u64 (*interpreters_args[])(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5,
const struct bpf_insn *insn) = {
EVAL6(PROG_NAME_LIST, 32, 64, 96, 128, 160, 192)
EVAL6(PROG_NAME_LIST, 224, 256, 288, 320, 352, 384)
EVAL4(PROG_NAME_LIST, 416, 448, 480, 512)
};
#undef PROG_NAME_LIST
void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth)
{
stack_depth = max_t(u32, stack_depth, 1);
insn->off = (s16) insn->imm;
insn->imm = interpreters_args[(round_up(stack_depth, 32) / 32) - 1] -
__bpf_call_base_args;
insn->code = BPF_JMP | BPF_CALL_ARGS;
}
“杀栈凶手” 就是那个 bpf_printk()
,干掉后就不再报这个错了。
Q:
bpf_printk()
为什么消耗的是栈空间?A: 如果
bpf_printk()
的字符串不放在栈里,放在哪里?通过查看 BPF ELF 文件里的汇编、或者bpftool prog dump xlated
,可以确认bpf_printk()
的字符串就是放在栈上:
# bpftool prog dump xlated id ${PROG ID}
1301: (73) *(u8 *)(r10 -12) = r1
1302: (b7) r1 = 175650864
1303: (63) *(u32 *)(r10 -16) = r1
1304: (18) r1 = 0x2578303d72646461
1306: (7b) *(u64 *)(r10 -24) = r1
1307: (18) r1 = 0x6420783830257830
1309: (7b) *(u64 *)(r10 -32) = r1
1310: (18) r1 = 0x3d72646461732079
1312: (7b) *(u64 *)(r10 -40) = r1
1313: (18) r1 = 0x616c7265766f202c
1315: (7b) *(u64 *)(r10 -48) = r1
1316: (7b) *(u64 *)(r10 -56) = r7
1317: (61) r4 = *(u32 *)(r9 +16)
1318: (61) r3 = *(u32 *)(r9 +12)
1319: (bf) r1 = r10
;
1320: (07) r1 += -56
; bpf_printk("acl-test, overlay saddr=0x%08x daddr=0x%08x\n", iph->saddr, iph->daddr);
1321: (b7) r2 = 45
1322: (85) call bpf_trace_printk#-62736
Recommend
About Archive Link
everyday a lot of link has gone away.
archive.link will keep it forever.