[dpdk-dev] [PATCH v1 0/5] add framework to load and execute BPF code

Alejandro Lucero alejandro.lucero at netronome.com
Wed Mar 14 17:43:02 CET 2018


I tried to start a discussion about eBPF support in DPDK at the last DPDK
meeting in Santa Clara:

https://dpdksummit.com/Archive/pdf/2017USA/DPDK%20support%20for%20new%20hardware%20offloads.pdf

On slide 17 I raise some points which, IMHO, are worth discussing before
adding this support.

I can see compatibility with eBPF programs used with the kernel being reason
enough for adding this to DPDK, but if I understand where eBPF inside the
kernel is going (regarding the network stack), those programs are going to
(or could) refer to kernel "code", so maybe this compatibility is simply
impossible to support. That would force a check to reject programs with such
references, and I can see this becoming a mess quickly.

Assuming this issue could be overcome (or is not an issue at all), maybe it
makes sense to execute eBPF programs, but does it make sense to execute
eBPF code? To start with, we are going to execute userspace code in
userspace context, so some (I would say the main) reasons behind eBPF do not
apply. And from a performance point of view, can we ensure eBPF code
execution is going to be at the same level as DPDK? Wouldn't it be a better
idea to translate eBPF programs to another language like ... C?
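
To make the comparison concrete, here is a rough sketch of what the check
performed by the JIT dump quoted below (two bytes read at offsets 0xc and
0xd, compared against 0x608, i.e. the ARP EtherType 0x0806 as seen on a
little-endian host) might look like when written directly in C. The function
name is_arp_frame() and the bounds check are my own assumptions for
illustration; they are not taken from the patch set.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_OFFSET 12      /* follows the two 6-byte MAC addresses */
#define ETHERTYPE_ARP    0x0806  /* value in network byte order */

/*
 * Hypothetical helper, not part of librte_bpf.
 * Returns 1 if the Ethernet frame carries ARP, 0 otherwise.
 * Unlike the JIT output quoted below, it checks the frame length first.
 */
static int is_arp_frame(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;

    if (len < ETHERTYPE_OFFSET + 2)
        return 0;

    /* Assemble the big-endian EtherType byte by byte (endian-neutral). */
    ethertype = (uint16_t)((frame[ETHERTYPE_OFFSET] << 8) |
                           frame[ETHERTYPE_OFFSET + 1]);
    return ethertype == ETHERTYPE_ARP;
}
```

A compiler would be free to inline and vectorize code like this together with
the rest of the datapath, which is the kind of comparison I have in mind.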

Don't get me wrong. I'm not against adding eBPF at all. In fact, from the
point of view of my company, Netronome, we would be happy to have this in
DPDK and to support eBPF offload, as is now possible with the netdev
driver.


On Fri, Mar 9, 2018 at 4:42 PM, Konstantin Ananyev <
konstantin.ananyev at intel.com> wrote:

> BPF is used quite intensively inside Linux (and BSD) kernels
> for various purposes and has proved to be extremely useful.
>
> BPF inside DPDK might also be used in a lot of places
> for a lot of similar things.
> For example:
> - packet filtering/tracing (aka tcpdump)
> - packet classification
> - statistics collection
> - HW/PMD live-system debugging/prototyping - trace HW descriptors,
>   internal PMD SW state, etc.
>  ...
>
> All of that in a dynamic, user-defined and extensible manner.
>
> So this series introduces a new library, librte_bpf.
> librte_bpf provides an API to load and execute BPF bytecode within
> a user-space DPDK app.
> It supports a basic set of features from the eBPF spec.
> It also introduces a basic framework to load/unload BPF-based filters
> on eth devices (right now via SW RX/TX callbacks).
>
> How to try it:
> ===============
>
> 1) run testpmd as usual and start your favorite forwarding case.
> 2) build the BPF program you'd like to load
> (you'll need clang v3.7 or above):
> $ cd test/bpf
> $ clang -O2 -target bpf -c t1.c
>
> 3) load bpf program(s):
> testpmd> bpf-load rx|tx <portid> <queueid> <load-flags> <bpf-prog-filename>
>
> <load-flags>:  [-][J][M]
> J - use JIT generated native code, otherwise BPF interpreter will be used.
> M - assume input parameter is a pointer to rte_mbuf,
>     otherwise assume it is a pointer to first segment's data.
>
> A few examples:
>
> # to load (not JITed) dummy.o at TX queue 0, port 0:
> testpmd> bpf-load tx 0 0 - ./dpdk.org/test/bpf/dummy.o
>
> # to load (and JIT compile) t1.o at RX queue 0, port 1:
> testpmd> bpf-load rx 1 0 J ./dpdk.org/test/bpf/t1.o
>
> # to load and JIT t3.o (note that it expects mbuf as an input):
> testpmd> bpf-load rx 2 0 JM ./dpdk.org/test/bpf/t3.o
>
> If you are curious to check JIT generated native code:
> gdb -p `pgrep testpmd`
> (gdb) disas 0x7fd173c5f000,+76
> Dump of assembler code from 0x7fd173c5f000 to 0x7fd173c5f04c:
>    0x00007fd173c5f000:  mov    %rdi,%rsi
>    0x00007fd173c5f003:  movzwq 0x10(%rsi),%rdi
>    0x00007fd173c5f008:  mov    0x0(%rsi),%rdx
>    0x00007fd173c5f00c:  add    %rdi,%rdx
>    0x00007fd173c5f00f:  movzbq 0xc(%rdx),%rdi
>    0x00007fd173c5f014:  movzbq 0xd(%rdx),%rdx
>    0x00007fd173c5f019:  shl    $0x8,%rdx
>    0x00007fd173c5f01d:  or     %rdi,%rdx
>    0x00007fd173c5f020:  cmp    $0x608,%rdx
>    0x00007fd173c5f027:  jne    0x7fd173c5f044
>    0x00007fd173c5f029:  mov    $0xb712e8,%rdi
>    0x00007fd173c5f030:  mov    0x0(%rdi),%rdi
>    0x00007fd173c5f034:  mov    $0x40,%rdx
>    0x00007fd173c5f03b:  mov    $0x4db2f0,%rax
>    0x00007fd173c5f042:  callq  *%rax
>    0x00007fd173c5f044:  mov    $0x1,%rax
>    0x00007fd173c5f04b:  retq
> End of assembler dump.
>
> 4) observe changed traffic behavior
> Let's say, with the examples above:
>  - dummy.o does literally nothing, so no changes should be seen here,
>    except some possible slowdown.
>  - t1.o should drop all packets that don't match the
>    'dst 1.2.3.4 && udp && dst port 5000' filter.
>  - t3.o should dump ARP packets to stdout.
>
> 5) unload some or all bpf programs:
> testpmd> bpf-unload tx 0 0
>
> 6) continue with step 3) or exit
>
> TODO list:
> ==========
> - meson build
> - UT for it
> - implement proper validate()
> - allow JIT to generate bulk version
> - FreeBSD support
>
> Not currently supported features:
> =================================
> - cBPF
> - tail-pointer call
> - eBPF MAP
> - JIT for non X86_64 targets
> - skb
>
> Konstantin Ananyev (5):
>   bpf: add BPF loading and execution framework
>   bpf: add JIT compilation for x86_64 ISA.
>   bpf: introduce basic RX/TX BPF filters
>   testpmd: new commands to load/unload BPF filters
>   test: add few eBPF samples
>
>  app/test-pmd/bpf_sup.h             |   25 +
>  app/test-pmd/cmdline.c             |  146 ++++
>  config/common_base                 |    5 +
>  config/common_linuxapp             |    1 +
>  lib/Makefile                       |    2 +
>  lib/librte_bpf/Makefile            |   35 +
>  lib/librte_bpf/bpf.c               |   52 ++
>  lib/librte_bpf/bpf_exec.c          |  452 ++++++++++++
>  lib/librte_bpf/bpf_impl.h          |   37 +
>  lib/librte_bpf/bpf_jit_x86.c       | 1329 ++++++++++++++++++++++++++++++++++++
>  lib/librte_bpf/bpf_load.c          |  380 +++++++++++
>  lib/librte_bpf/bpf_pkt.c           |  524 ++++++++++++++
>  lib/librte_bpf/bpf_validate.c      |   55 ++
>  lib/librte_bpf/rte_bpf.h           |  158 +++++
>  lib/librte_bpf/rte_bpf_ethdev.h    |   50 ++
>  lib/librte_bpf/rte_bpf_version.map |   16 +
>  mk/rte.app.mk                      |    2 +
>  test/bpf/dummy.c                   |   20 +
>  test/bpf/mbuf.h                    |  556 +++++++++++++++
>  test/bpf/t1.c                      |   53 ++
>  test/bpf/t2.c                      |   30 +
>  test/bpf/t3.c                      |   36 +
>  22 files changed, 3964 insertions(+)
>  create mode 100644 app/test-pmd/bpf_sup.h
>  create mode 100644 lib/librte_bpf/Makefile
>  create mode 100644 lib/librte_bpf/bpf.c
>  create mode 100644 lib/librte_bpf/bpf_exec.c
>  create mode 100644 lib/librte_bpf/bpf_impl.h
>  create mode 100644 lib/librte_bpf/bpf_jit_x86.c
>  create mode 100644 lib/librte_bpf/bpf_load.c
>  create mode 100644 lib/librte_bpf/bpf_pkt.c
>  create mode 100644 lib/librte_bpf/bpf_validate.c
>  create mode 100644 lib/librte_bpf/rte_bpf.h
>  create mode 100644 lib/librte_bpf/rte_bpf_ethdev.h
>  create mode 100644 lib/librte_bpf/rte_bpf_version.map
>  create mode 100644 test/bpf/dummy.c
>  create mode 100644 test/bpf/mbuf.h
>  create mode 100644 test/bpf/t1.c
>  create mode 100644 test/bpf/t2.c
>  create mode 100644 test/bpf/t3.c
>
> --
> 2.13.6
>
>
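
For reference, the filter described for t1.o in step 4 above
('dst 1.2.3.4 && udp && dst port 5000') could be expressed in plain C roughly
as follows. This is only a sketch under stated assumptions: it operates on a
raw, untagged Ethernet frame, ignores IPv4 fragmentation, and the name
match_udp_dst() is hypothetical. It is not the code in test/bpf/t1.c.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN   14  /* untagged Ethernet header (assumption: no VLAN) */
#define IPV4_MIN_HLEN 20
#define UDP_HDR_LEN   8
#define IPPROTO_UDP_N 17  /* IANA protocol number for UDP */

/*
 * Hypothetical helper: returns 1 if the frame is IPv4/UDP addressed to
 * 1.2.3.4 port 5000 (keep), 0 otherwise (drop).
 */
static int match_udp_dst(const uint8_t *p, size_t len)
{
    const uint8_t *ip, *udp;
    size_t ihl;
    uint16_t dport;

    if (len < ETH_HDR_LEN + IPV4_MIN_HLEN)
        return 0;
    /* EtherType must be IPv4 (0x0800, big-endian on the wire). */
    if (p[12] != 0x08 || p[13] != 0x00)
        return 0;

    ip = p + ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4)
        return 0;
    /* Header length is given in 32-bit words in the low nibble. */
    ihl = (size_t)(ip[0] & 0x0f) * 4;
    if (ihl < IPV4_MIN_HLEN || len < ETH_HDR_LEN + ihl + UDP_HDR_LEN)
        return 0;
    if (ip[9] != IPPROTO_UDP_N)
        return 0;
    /* Destination address occupies bytes 16..19 of the IPv4 header. */
    if (ip[16] != 1 || ip[17] != 2 || ip[18] != 3 || ip[19] != 4)
        return 0;

    udp = ip + ihl;
    /* Destination port is the second 16-bit field of the UDP header. */
    dport = (uint16_t)((udp[2] << 8) | udp[3]);
    return dport == 5000;
}
```

A caller would pass a pointer to the first segment's data and its length,
which corresponds to loading a program without the M flag.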

