[dpdk-dev] thoughts on DPDK after a few days of reading sources

Seth Arnold seth.arnold at canonical.com
Thu Feb 11 04:05:40 CET 2016


Hello,

The Ubuntu distribution is looking at supporting DPDK in the 'main'
component of the archive. As part of this process I spent a few days
reading the DPDK sources to gauge if we can support it or not.

I've taken some notes while reading the sources; I'm sharing them in the
hopes that it's useful: on the one hand my fresh eyes may spot things that
you've overlooked, on the other hand your familiarity with the code means
that you're better suited to judge what I've found.

Most of the code was very good; I am however concerned about the frequent
memory allocations that use simple integer arithmetic when deciding how
much to allocate without checking for integer overflows.

Here's the bug tracking the Main Inclusion Request:
https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1492186

Here's the portions of the notes that I think may be most interesting
to DPDK devs. It's in rough priority order so feel free to stop reading
when you're bored:

===

Affects main codebase and should be investigated quickly:

- shellcheck reports extensive cases of forgotten quotes to prevent word
  splitting or globbing, potentially unused variables, error-prone printf
  formatting. The scripts that are going to be used at runtime should be
  fixed:
  - ./debian/dpdk-init
  - ./debian/dpdk.init
  - ./tools/setup.sh ? (Hard to tell)
- ./drivers/net/cxgbe/cxgbe_ethdev.c eth_cxgbe_dev_init() memory leak in
  out_free_adapter: that doesn't free adapter
- ./drivers/net/virtio/virtio_ethdev.c virtio_set_multiple_queues() calls
  virtio_send_command(), which performs:
  memcpy(vq->virtio_net_hdr_mz->addr, ctrl, sizeof(struct virtio_pmd_ctrl));
  This copies a potentially huge amount of uninitialized data into ->addr
  because the struct virtio_pmd_ctrl ctrl was not zeroed before being
  passed. How much of this data leaves the system? Does this require a
  CVE?
- ./lib/librte_eal/common/rte_malloc.c memory allocation routines don't
  check for integer overflow errors:
  - rte_calloc_socket()
  - rte_calloc()
- ./lib/librte_vhost/vhost_user/virtio-net-user.c user_set_mem_table()
  doesn't perform integer overflow checks before calling calloc()
- ./lib/librte_vhost/vhost_cuse/virtio-net-cdev.c cuse_set_mem_table()
  doesn't perform integer overflow checks before calling calloc()
- ./lib/librte_eal/linuxapp/eal/eal_xen_memory.c rte_xen_dom0_memory_init()
  If vma_addr = xen_get_virtual_area(&vma_len, RTE_PGSIZE_2M); fails,
  vma_len is reset to RTE_PGSIZE_2M instead of seginfo[memseg_idx].size --
  is this a bug?
- ./lib/librte_eal/linuxapp/eal/eal_memory.c create_shared_memory()
  creates the hugetlb file with mode 0666 rather than 0600 -- is this a
  bug? Does this require a CVE?
- ./lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
  rte_cpu_get_features() leaks auxv_fd
- ./lib/librte_eal/common/include/arch/arm/rte_cpuflags_32.h
  rte_cpu_get_features() leaks auxv_fd
- ./lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h
  rte_cpu_get_features() leaks auxv_fd

Affects main codebase:

- Assorted false-positives and style issues reported by cppcheck
- Slightly dangerous convention of memcpy(dest, source, sizeof(source))
  is used extensively; while all the instances I investigated were
  correct, it's still more prone to mistakes under maintenance than
  memcpy(dest, source, sizeof(dest)).
- extensive use of *malloc() wrappers that perform multiplication to
  determine the size to allocate; all the cases I've seen used values that
  should be constrained by system configurations but the habit is
  dangerous compared to use of *calloc() wrappers that handle integer
  overflow safely.
- ./app/test/test_malloc.c test_reordered_free_per_lcore() has incorrect
  calls to is_memory_overlap() -- p2 is 16000 bytes long, not 1000 bytes
  long, and this size difference is not reflected in the calls.
- ./app/test/test_malloc.c test_realloc() leaks ptr1, may leak ptr9 via
  error path
- ./app/test/test_malloc.c test_realloc() the test with error "Unexpected
  - ptr4 != ptr3" doesn't feel like it tests an actual promise from the API
- ./lib/librte_pipeline/rte_pipeline.c rte_pipeline_table_create(),
  rte_pipeline_port_in_create(), rte_pipeline_port_out_create(),
  duplicate the array of function pointers via memcpy() rather than
  copying a pointer to static tables -- this may represent an easy way to
  save memory and improve cache hit ratios as well as potentially allow
  storing the tables in static memory rather than on the heap, reducing
  the value of these structs in potential exploits.
- ixgbe driver in the package is very different from the driver in the
  Linux kernel -- when bugs in one are found, who is in charge of copying
  the fixes between the two code bases?
- ./lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_ethtool.c
  ixgbe_get_strings() takes a buffer to write into but not the buffer
  length; sprintf() calls may overflow the buffer if it isn't large
  enough. It looks like ethtool_get_strings() may use fixed-size buffer
  but the amount of data that gets written into it is based on both static
  and dynamic data. Can this overflow the buffer?
- ./lib/librte_eal/linuxapp/eal/eal.c rte_eal_config_create() uses mode
  0660 rather than 0600 to create /home/username/.<something>_config
  files, which may be too open on distributions with "staff" or "users"
  default user groups rather than user-specific groups.

Affects examples but should still be investigated quickly:

- print_stats() in examples/l2fwd-ivshmem/host/host.c uses variable
  total_vm_packets_dropped without initialization

Affects examples:

- app_config_preproc() from ./examples/ip_pipeline/config_parse.c builds a
  string to execute with system(). While the inputs come from the command
  line arguments, they might have been supplied via a safe mechanism from
  untrusted users, rendering them unsafe here. The snprintf() error return
  is ignored. The access(app->config_file) check isn't as helpful as an
  actual error message if something does go wrong.
- main() from ./examples/kni/main.c ignores kni_alloc() return value
- main() from ./examples/kni/main.c may allow port to grow beyond
  RTE_MAX_ETHPORTS and then aborts rather than capping the upper end of
  the for() loop.
- ./examples/vhost/main.c txmbuf_clean_zcp():
        uint32_t used_idx = vq->last_used_idx & (vq->size - 1);
  Use of & rather than % for modular arithmetic -- are virtuqueue sizes
  guaranteed to be powers of two?
- Multiple places omits errno in the case the file open fails:
  - ./examples/ip_pipeline/config_parse.c app_config_save()
  - ./examples/ip_pipeline/pipeline/pipeline_common_fe.c app_run_file()
  - ./examples/ethtool/ethtool-app/ethapp.c pcmd_eeprom_callback()
  - other locations of this aren't mentioned.
  It's nearly impossible to solve issues without error reporting. Good
  error reporting saves admins time and money.

===

Thanks


More information about the dev mailing list