[dpdk-dev] [PATCH v3 00/68] Memory Hotplug for DPDK
Anatoly Burakov
anatoly.burakov at intel.com
Wed Apr 4 01:21:12 CEST 2018
This patchset introduces dynamic memory allocation for DPDK (aka memory
hotplug). It is based on the RFC submitted in December [1].
Dependencies (to be applied in specified order):
- IPC asynchronous request API patch [2]
- Function to return number of sockets [3]
- EAL IOVA fix [4]
Deprecation notices relevant to this patchset:
- General outline of memory hotplug changes [5]
- EAL NUMA node count changes [6]
The vast majority of changes are in the EAL and malloc, and the external API
disruption is minimal: a new set of APIs is added for contiguous memory
allocation for rte_memzone, and there are a few API additions in rte_memory
due to the switch to memseg_lists as opposed to memsegs. Every other API
change is internal to EAL, and all of the memory allocation/freeing is handled
through rte_malloc, with no externally visible API changes.
Quick outline of all changes done as part of this patchset:
* Malloc heap adjusted to handle holes in address space
* Single memseg list replaced by multiple memseg lists
* VA space for hugepages is preallocated in advance
* Added alloc/free for pages happening as needed on rte_malloc/rte_free
* Added contiguous memory allocation APIs for rte_memzone
* Added convenience API calls to walk over memsegs
* Integrated Pawel Wodkowski's patch for registering/unregistering memory
with VFIO [7]
* Callbacks for registering memory allocations
* Callbacks for allowing/disallowing allocations above specified limit
* Multiprocess support done via DPDK IPC introduced in 18.02
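To illustrate the "walk over memsegs" idea from the outline above, here is a
minimal, self-contained sketch of a callback-driven walk. It is a toy model
only: struct toy_page, toy_page_walk() and the return-value convention are
hypothetical names and assumptions made for this example, not the API added by
the patchset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a single-page memseg; NOT a DPDK structure. */
struct toy_page {
	uintptr_t va;   /* virtual address of the page */
	size_t len;     /* page size in bytes */
	int in_use;     /* 0 marks a hole (page not currently allocated) */
};

/* Callback type: return 0 to continue, >0 to stop early, <0 on error. */
typedef int (*toy_walk_cb)(const struct toy_page *pg, void *arg);

/*
 * Call cb for every allocated page, skipping holes.
 * Returns 0 if the whole table was walked, 1 if the callback stopped the
 * walk early, -1 if the callback reported an error.
 */
static int
toy_page_walk(const struct toy_page *pages, size_t n_pages,
		toy_walk_cb cb, void *arg)
{
	size_t i;

	assert(cb != NULL);

	for (i = 0; i < n_pages; i++) {
		int ret;

		if (!pages[i].in_use)
			continue;
		ret = cb(&pages[i], arg);
		if (ret != 0)
			return ret < 0 ? -1 : 1;
	}
	return 0;
}

/* Example callback: accumulate the total length of allocated pages. */
static int
toy_sum_len(const struct toy_page *pg, void *arg)
{
	size_t *total = arg;

	*total += pg->len;
	return 0;
}
```

The point of the callback form is that callers (drivers, VFIO code, tests) no
longer need to know how pages are stored or where the holes are; they only
supply the per-page action.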
The biggest difference is that a "memseg" now represents a single page (as
opposed to being a big contiguous block of pages). As a consequence, both
memzones and malloc elements are no longer guaranteed to be physically
contiguous, unless the user asks for it at reserve time. To preserve whatever
functionality was dependent on the previous behavior, a legacy memory option
is also provided; however, it is expected (or perhaps vainly hoped) to be a
temporary solution.
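To make the contiguity point concrete: once every memseg is a single page, a
VA-contiguous area is physically contiguous only if each page's physical
(IOVA) address directly follows the previous one. The sketch below checks
that property on a plain array of per-page addresses. It is a toy model under
that assumption; toy_pages_are_contiguous() is a hypothetical helper, not the
check implemented in the patchset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * page_iova[i] is the physical (IOVA) address backing the i-th page of a
 * VA-contiguous area; all pages are assumed to be page_sz bytes long.
 * Returns true if the backing pages are also contiguous in IOVA space.
 */
static bool
toy_pages_are_contiguous(const uint64_t *page_iova, size_t n_pages,
		uint64_t page_sz)
{
	size_t i;

	assert(page_iova != NULL || n_pages == 0);

	for (i = 1; i < n_pages; i++) {
		/* each page must start exactly where the previous one ends */
		if (page_iova[i] != page_iova[i - 1] + page_sz)
			return false;
	}
	return true;
}
```

An area that fits in a single page is trivially contiguous, which is why only
allocations spanning several pages need to request contiguity explicitly.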
Why multiple memseg lists instead of one? Since memseg is a single page now,
the list of memsegs will get quite big, and we need to locate pages somehow
when we allocate and free them. We could of course just walk the list and
allocate one contiguous chunk of VA space for memsegs, but this
implementation uses separate lists instead in order to speed up many
operations with memseg lists.
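One way to picture why separate lists help, assuming each list covers one
preallocated VA range of same-sized pages: locating the page for an address
becomes a range check per list plus one division, instead of a walk over every
page. The structure and function names below are hypothetical and are not the
patchset's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memseg list; NOT the DPDK structure. */
struct toy_memseg_list {
	uintptr_t base_va;  /* start of the VA range reserved for this list */
	size_t page_sz;     /* size of every page in this list */
	size_t n_pages;     /* number of page slots in this list */
};

/*
 * Find which list and which page slot an address belongs to.
 * Returns 0 and fills list_idx/page_idx on success, -1 if the address
 * is not covered by any list.
 */
static int
toy_virt2page(const struct toy_memseg_list *lists, size_t n_lists,
		uintptr_t addr, size_t *list_idx, size_t *page_idx)
{
	size_t i;

	assert(lists != NULL && list_idx != NULL && page_idx != NULL);

	for (i = 0; i < n_lists; i++) {
		const struct toy_memseg_list *l = &lists[i];
		uintptr_t end = l->base_va + l->page_sz * l->n_pages;

		if (addr < l->base_va || addr >= end)
			continue;
		*list_idx = i;
		/* pages in a list are equally sized, so the slot is a division */
		*page_idx = (addr - l->base_va) / l->page_sz;
		return 0;
	}
	return -1;
}
```

The cost of a lookup in this model grows with the number of lists, not with
the number of pages, which matters once each memseg is a single page.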
For v3, the following limitations are present:
- VFIO support is only smoke-tested (but is expected to work), and VFIO
support with secondary processes is not tested; work is ongoing to validate
VFIO for all use cases
- FSLMC bus VFIO code is not yet integrated; work is in progress
For testing, it is recommended to use the GitHub repository [8], as it will
have all of the dependencies already integrated.
v3:
- Lots of compile fixes
- Fixes for multiprocess synchronization
- Introduced support for sPAPR IOMMU, courtesy of Gowrishankar @ IBM
- Fixes for mempool size calculation
- Added convenience memseg walk() APIs
- Added alloc validation callback
v2:
- fixed deadlock at init
- reverted rte_panic changes at init, this is now handled inside IPC
[1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
[2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
[3] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
[4] http://dpdk.org/dev/patchwork/bundle/aburakov/IOVA_mode_fixes/
[5] http://dpdk.org/dev/patchwork/patch/34002/
[6] http://dpdk.org/dev/patchwork/patch/33853/
[7] http://dpdk.org/dev/patchwork/patch/24484/
[8] https://github.com/anatolyburakov/dpdk
Anatoly Burakov (68):
eal: move get_virtual_area out of linuxapp eal_memory.c
eal: move all locking to heap
eal: make malloc heap a doubly-linked list
eal: add function to dump malloc heap contents
test: add command to dump malloc heap contents
eal: make malloc_elem_join_adjacent_free public
eal: make malloc free list remove public
eal: make malloc free return resulting malloc element
eal: replace panics with error messages in malloc
eal: add backend support for contiguous allocation
eal: enable reserving physically contiguous memzones
ethdev: use contiguous allocation for DMA memory
crypto/qat: use contiguous allocation for DMA memory
net/avf: use contiguous allocation for DMA memory
net/bnx2x: use contiguous allocation for DMA memory
net/cxgbe: use contiguous allocation for DMA memory
net/ena: use contiguous allocation for DMA memory
net/enic: use contiguous allocation for DMA memory
net/i40e: use contiguous allocation for DMA memory
net/qede: use contiguous allocation for DMA memory
net/virtio: use contiguous allocation for DMA memory
net/vmxnet3: use contiguous allocation for DMA memory
net/bnxt: use contiguous allocation for DMA memory
mempool: add support for the new allocation methods
eal: add function to walk all memsegs
bus/fslmc: use memseg walk instead of iteration
bus/pci: use memseg walk instead of iteration
net/mlx5: use memseg walk instead of iteration
eal: use memseg walk instead of iteration
mempool: use memseg walk instead of iteration
test: use memseg walk instead of iteration
vfio/type1: use memseg walk instead of iteration
vfio/spapr: use memseg walk instead of iteration
eal: add contig walk function
virtio: use memseg contig walk instead of iteration
eal: add iova2virt function
bus/dpaa: use iova2virt instead of memseg iteration
bus/fslmc: use iova2virt instead of memseg iteration
crypto/dpaa_sec: use iova2virt instead of memseg iteration
eal: add virt2memseg function
bus/fslmc: use virt2memseg instead of iteration
net/mlx4: use virt2memseg instead of iteration
net/mlx5: use virt2memseg instead of iteration
crypto/dpaa_sec: use virt2memseg instead of iteration
eal: use memzone walk instead of iteration
vfio: allow to map other memory regions
eal: add "legacy memory" option
eal: add rte_fbarray
eal: replace memseg with memseg lists
eal: replace memzone array with fbarray
eal: add support for mapping hugepages at runtime
eal: add support for unmapping pages at runtime
eal: add "single file segments" command-line option
eal: add API to check if memory is contiguous
eal: prepare memseg lists for multiprocess sync
eal: read hugepage counts from node-specific sysfs path
eal: make use of memory hotplug for init
eal: share hugepage info primary and secondary
eal: add secondary process init with memory hotplug
eal: enable memory hotplug support in rte_malloc
eal: add support for multiprocess memory hotplug
eal: add support for callbacks on memory hotplug
eal: enable callbacks on malloc/free and mp sync
vfio: enable support for mem event callbacks
eal: enable non-legacy memory mode
eal: add memory validator callback
eal: enable validation before new page allocation
eal: prevent preallocated pages from being freed
config/common_base | 15 +-
config/defconfig_i686-native-linuxapp-gcc | 3 +
config/defconfig_i686-native-linuxapp-icc | 3 +
config/defconfig_x86_x32-native-linuxapp-gcc | 3 +
config/rte_config.h | 7 +-
doc/guides/rel_notes/deprecation.rst | 9 -
drivers/bus/dpaa/rte_dpaa_bus.h | 12 +-
drivers/bus/fslmc/fslmc_vfio.c | 80 +-
drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 27 +-
drivers/bus/pci/Makefile | 3 +
drivers/bus/pci/linux/pci.c | 28 +-
drivers/bus/pci/meson.build | 3 +
drivers/crypto/dpaa_sec/dpaa_sec.c | 30 +-
drivers/crypto/qat/Makefile | 3 +
drivers/crypto/qat/meson.build | 3 +
drivers/crypto/qat/qat_qp.c | 23 +-
drivers/event/dpaa/Makefile | 3 +
drivers/event/dpaa2/Makefile | 3 +
drivers/mempool/dpaa/Makefile | 3 +
drivers/mempool/dpaa2/Makefile | 3 +
drivers/net/avf/Makefile | 3 +
drivers/net/avf/avf_ethdev.c | 2 +-
drivers/net/bnx2x/Makefile | 3 +
drivers/net/bnx2x/bnx2x.c | 2 +-
drivers/net/bnx2x/bnx2x_rxtx.c | 3 +-
drivers/net/bnxt/Makefile | 3 +
drivers/net/bnxt/bnxt_ethdev.c | 6 +-
drivers/net/bnxt/bnxt_ring.c | 3 +-
drivers/net/bnxt/bnxt_vnic.c | 2 +-
drivers/net/cxgbe/Makefile | 3 +
drivers/net/cxgbe/sge.c | 3 +-
drivers/net/dpaa/Makefile | 3 +
drivers/net/dpaa2/Makefile | 3 +
drivers/net/dpaa2/meson.build | 3 +
drivers/net/ena/Makefile | 3 +
drivers/net/ena/base/ena_plat_dpdk.h | 7 +-
drivers/net/ena/ena_ethdev.c | 10 +-
drivers/net/enic/Makefile | 3 +
drivers/net/enic/enic_main.c | 4 +-
drivers/net/i40e/Makefile | 3 +
drivers/net/i40e/i40e_ethdev.c | 2 +-
drivers/net/i40e/i40e_rxtx.c | 2 +-
drivers/net/i40e/meson.build | 3 +
drivers/net/mlx4/mlx4_mr.c | 17 +-
drivers/net/mlx5/Makefile | 3 +
drivers/net/mlx5/mlx5.c | 25 +-
drivers/net/mlx5/mlx5_mr.c | 18 +-
drivers/net/octeontx/Makefile | 3 +
drivers/net/qede/Makefile | 3 +
drivers/net/qede/base/bcm_osal.c | 5 +-
drivers/net/virtio/virtio_ethdev.c | 8 +-
drivers/net/virtio/virtio_user/vhost_kernel.c | 83 +-
drivers/net/vmxnet3/Makefile | 3 +
drivers/net/vmxnet3/vmxnet3_ethdev.c | 7 +-
lib/librte_eal/bsdapp/eal/Makefile | 4 +
lib/librte_eal/bsdapp/eal/eal.c | 83 +-
lib/librte_eal/bsdapp/eal/eal_hugepage_info.c | 65 +-
lib/librte_eal/bsdapp/eal/eal_memalloc.c | 48 +
lib/librte_eal/bsdapp/eal/eal_memory.c | 222 +++-
lib/librte_eal/bsdapp/eal/meson.build | 1 +
lib/librte_eal/common/Makefile | 2 +-
lib/librte_eal/common/eal_common_fbarray.c | 859 ++++++++++++++++
lib/librte_eal/common/eal_common_memalloc.c | 359 +++++++
lib/librte_eal/common/eal_common_memory.c | 804 ++++++++++++++-
lib/librte_eal/common/eal_common_memzone.c | 274 +++--
lib/librte_eal/common/eal_common_options.c | 13 +-
lib/librte_eal/common/eal_filesystem.h | 30 +
lib/librte_eal/common/eal_hugepages.h | 11 +-
lib/librte_eal/common/eal_internal_cfg.h | 12 +-
lib/librte_eal/common/eal_memalloc.h | 80 ++
lib/librte_eal/common/eal_options.h | 4 +
lib/librte_eal/common/eal_private.h | 33 +
lib/librte_eal/common/include/rte_eal_memconfig.h | 28 +-
lib/librte_eal/common/include/rte_fbarray.h | 353 +++++++
lib/librte_eal/common/include/rte_malloc.h | 10 +
lib/librte_eal/common/include/rte_malloc_heap.h | 6 +
lib/librte_eal/common/include/rte_memory.h | 232 ++++-
lib/librte_eal/common/include/rte_memzone.h | 159 ++-
lib/librte_eal/common/include/rte_vfio.h | 39 +
lib/librte_eal/common/malloc_elem.c | 433 ++++++--
lib/librte_eal/common/malloc_elem.h | 43 +-
lib/librte_eal/common/malloc_heap.c | 704 ++++++++++++-
lib/librte_eal/common/malloc_heap.h | 15 +-
lib/librte_eal/common/malloc_mp.c | 744 ++++++++++++++
lib/librte_eal/common/malloc_mp.h | 86 ++
lib/librte_eal/common/meson.build | 4 +
lib/librte_eal/common/rte_malloc.c | 85 +-
lib/librte_eal/linuxapp/eal/Makefile | 5 +
lib/librte_eal/linuxapp/eal/eal.c | 62 +-
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 218 +++-
lib/librte_eal/linuxapp/eal/eal_memalloc.c | 1124 +++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_memory.c | 1119 ++++++++++++--------
lib/librte_eal/linuxapp/eal/eal_vfio.c | 491 +++++++--
lib/librte_eal/linuxapp/eal/eal_vfio.h | 12 +
lib/librte_eal/linuxapp/eal/meson.build | 1 +
lib/librte_eal/rte_eal_version.map | 33 +-
lib/librte_ether/rte_ethdev.c | 3 +-
lib/librte_mempool/Makefile | 3 +
lib/librte_mempool/meson.build | 3 +
lib/librte_mempool/rte_mempool.c | 138 ++-
test/test/commands.c | 3 +
test/test/test_malloc.c | 30 +-
test/test/test_memory.c | 27 +-
test/test/test_memzone.c | 62 +-
104 files changed, 8434 insertions(+), 1263 deletions(-)
create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
create mode 100644 lib/librte_eal/common/eal_common_memalloc.c
create mode 100644 lib/librte_eal/common/eal_memalloc.h
create mode 100644 lib/librte_eal/common/include/rte_fbarray.h
create mode 100644 lib/librte_eal/common/malloc_mp.c
create mode 100644 lib/librte_eal/common/malloc_mp.h
create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c
--
2.7.4