In fact, I haven't seen any performance drop, because I'm still trying this out in a small, initial phase. I just want to eliminate any risk before moving to Debian 11; if an issue shows up in a production environment, it may be really hard to debug.
Back to the question: in LAVA, if we use Debian 11, which has cgroup v2 enabled by default, then when docker-test-shell is used, LAVA attaches a custom BPF device program to the container to replace Docker's default one.
Everything looks fine, except that I observed that if I run "adb devices" in the container, trace_pipe is flooded with the following:
```
device poll-7289 [001] d... 103054.767620: bpf_trace_printk: Device access: major = 189, minor = 261
device poll-7289 [001] d... 103055.767851: bpf_trace_printk: Device access: major = 189, minor = 261
device poll-7289 [001] d... 103056.768117: bpf_trace_printk: Device access: major = 189, minor = 261
device poll-7289 [001] d... 103057.768354: bpf_trace_printk: Device access: major = 189, minor = 261
device poll-7289 [001] d... 103058.768590: bpf_trace_printk: Device access: major = 189, minor = 261
device poll-7289 [001] d... 103059.768819: bpf_trace_printk: Device access: major = 189, minor = 261
device poll-7289 [001] d... 103060.769053: bpf_trace_printk: Device access: major = 189, minor = 261
```
This means the BPF function is called frequently (at intervals of about one second in this trace).
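To put a number on the call rate, the trace lines can be parsed and the gaps between consecutive timestamps computed. The following is only a rough sketch: the regular expression is an assumption based on the line format shown above, and the names `parse_trace` and `intervals` are made up for illustration, not part of LAVA or bpftool.

```python
import re

# Assumed line format, taken from the trace_pipe output quoted above:
#   <task>-<pid> [<cpu>] <flags> <timestamp>: bpf_trace_printk: Device access: ...
TRACE_RE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+\S+\s+"
    r"(?P<ts>\d+\.\d+):\s+bpf_trace_printk:\s+"
    r"Device access: major = (?P<major>\d+), minor = (?P<minor>\d+)"
)


def parse_trace(lines):
    """Return (timestamp, major, minor) for every line that matches."""
    events = []
    for line in lines:
        m = TRACE_RE.match(line)
        if m:
            events.append((float(m["ts"]), int(m["major"]), int(m["minor"])))
    return events


def intervals(events):
    """Return the time gaps in seconds between consecutive events."""
    stamps = [e[0] for e in events]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# First three lines of the trace quoted above.
SAMPLE = """\
device poll-7289 [001] d... 103054.767620: bpf_trace_printk: Device access: major = 189, minor = 261
device poll-7289 [001] d... 103055.767851: bpf_trace_printk: Device access: major = 189, minor = 261
device poll-7289 [001] d... 103056.768117: bpf_trace_printk: Device access: major = 189, minor = 261
"""

if __name__ == "__main__":
    gaps = intervals(parse_trace(SAMPLE.splitlines()))
    print("gaps between calls (s):", [round(g, 6) for g in gaps])
```

The same functions could be fed lines read from trace_pipe over a longer run to see whether the rate stays near one call per second.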
On the other hand, if I do the following, the BPF program is unregistered from the Linux kernel, but it looks like all adb devices still work:
```
/sys/fs/cgroup/system.slice/docker-a9354f54a8c6a56932e15b4d577432abf86c897630d5e94da442474e938bf875.scope
78 device multi lava_docker_dev
$ bpftool cgroup detach /sys/fs/cgroup/system.slice/docker-a9354f54a8c6a56932e15b4d577432abf86c897630d5e94da442474e938bf875.scope device id 78
```
So I just want to confirm: have you noticed this behavior, and can you confirm that it is OK? (To be honest, I'm not sure how BPF performs when it is called this frequently, so this is just an enquiry.) Or is there a better way to handle it in LAVA? I need your confirmation to decide whether I should downgrade to cgroup v1 when I migrate. Thanks!
Regards, Larry
On Thu, Sep 29, 2022 at 08:52:59AM +0000, Larry Shen wrote:
[...]
This has been in use at Linaro for a long time now, and we have not seen any issues with it. With cgroups v2, the only way to control device access from a container is BPF, so there is no other way to do it.
The reason there were several calls in the logs is simply that the BPF function gets called every time a process in the container tries to use the device, so that is expected.
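As a rough illustration of why one call per access is expected, here is a simplified Python model of what a cgroup v2 device program decides. It is only a sketch and not LAVA's actual BPF program: the `Rule` and `DevicePolicy` names and the example rule are hypothetical. The only things taken from the thread are that the hook receives a major and minor number and that it runs on each access attempt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One allow rule; None for major or minor means 'any' (hypothetical format)."""
    dev_type: str          # "c" for character devices, "b" for block devices
    major: Optional[int]
    minor: Optional[int]
    access: str            # any combination of "r", "w", "m" (mknod)


class DevicePolicy:
    """Toy stand-in for a device program attached to a container's cgroup."""

    def __init__(self, rules):
        self.rules = list(rules)
        self.calls = 0     # how many times the hook has been invoked

    def check(self, dev_type, major, minor, access):
        # The hook runs once per access attempt, whether allowed or not.
        self.calls += 1
        for rule in self.rules:
            if (rule.dev_type == dev_type
                    and rule.major in (None, major)
                    and rule.minor in (None, minor)
                    and set(access) <= set(rule.access)):
                return True
        return False


if __name__ == "__main__":
    # Hypothetical rule: allow read/write on the device seen in the trace.
    policy = DevicePolicy([Rule("c", 189, 261, "rw")])
    for _ in range(3):     # e.g. a process polling the device three times
        policy.check("c", 189, 261, "rw")
    print("hook invocations:", policy.calls)
```

In this model a process that polls a device once per second triggers one check per second, which matches the pattern in the log; the check itself is a short walk over a small rule list.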
lava-users@lists.lavasoftware.org