[PATCH v2 3/4] doc: give specific instructions for running as non-root

Bruce Richardson bruce.richardson at intel.com
Fri Jun 17 18:38:19 CEST 2022


On Fri, Jun 17, 2022 at 02:25:07PM +0300, Dmitry Kozlyuk wrote:
> The guide to run DPDK applications as non-root in Linux
> did not provide specific instructions to configure the required access
> and did not explain why each bit is needed.
> The latter is important because running as non-root
> is one of the ways to tighten security and grant minimal permissions.
> 
> Cc: stable at dpdk.org
> 
> Signed-off-by: Dmitry Kozlyuk <dkozlyuk at nvidia.com>

Thanks for this, some good changes here. Comments inline below.

/Bruce

> ---
>  doc/guides/linux_gsg/enable_func.rst          | 67 +++++++++++++++++--
>  .../prog_guide/env_abstraction_layer.rst      |  2 +
>  2 files changed, 63 insertions(+), 6 deletions(-)
> 
> diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst
> index 1df3ab0255..2f908e8b70 100644
> --- a/doc/guides/linux_gsg/enable_func.rst
> +++ b/doc/guides/linux_gsg/enable_func.rst
> @@ -13,13 +13,58 @@ Enabling Additional Functionality
>  Running DPDK Applications Without Root Privileges
>  -------------------------------------------------
>  
> -In order to run DPDK as non-root, the following Linux filesystem objects'
> -permissions should be adjusted to ensure that the Linux account being used to
> -run the DPDK application has access to them:
> +The following sections describe generic requirements and configuration
> +for running DPDK applications as non-root.
> +There may be additional requirements documented for some drivers.
>  
> -*   All directories which serve as hugepage mount points, for example, ``/dev/hugepages``
> +Hugepages
> +~~~~~~~~~
>  
> -*   If the HPET is to be used,  ``/dev/hpet``
> +Hugepages must be reserved as root before runing the application as non-root,
> +for example::
> +
> +  sudo dpdk-hugepages.py --reserve 1G
> +
> +If multi-process is not required, running with ``--in-memory``
> +bypasses the need to access hugepage mount point and files within it.
> +Otherwise, hugepage directory must be made accessible
> +for writing to the unprivileged user.
> +A good way for managing multiple applications using hugepages
> +is to mount the filesystem with group permissions
> +and add a supplementary group to each application or container.
> +
> +One option is to use the script provided by this project::
> +
> +  export HUGEDIR=$HOME/huge-1G
> +  mkdir -p $HUGEDIR
> +  sudo dpdk-hugepages.py --mount --directory $HUGEDIR --owner `id -u`:`id -g`
> +
> +In production environment, the OS can manage mount points
> +(`systemd example <https://github.com/systemd/systemd/blob/main/units/dev-hugepages.mount>`_).
> +
> +The ``hugetlb`` filesystem has additional options to guarantee or limit
> +the amount of memory that is possible to allocate using the mount point.
> +Refer to the `documentation <https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt>`_.
> +
> +If the driver requires using physical addresses (PA),
> +the executable file must be granted additional capabilities:
> +
> +* ``SYS_ADMIN`` to read ``/proc/self/pagemaps``
> +* ``IPC_LOCK`` to lock hugepages in memory

Are either of these necessary if using vfio-pci and VA mode? I have seen it
previously reported that IPC_LOCK is necessary for IOMMU memory mapping for
DMA - at least for docker containers - so I'd like it confirmed that we
don't need them in the in-memory case running on the host. If I get the
chance I'll try double-checking by testing myself.

> +
> +.. code-block:: console
> +
> +   setcap cap_ipc_lock,cap_sys_admin+ep <executable>
> +
> +If physical addresses are not accessible,
> +the following message will appear during EAL initialization::
> +
> +  EAL: rte_mem_virt2phy(): cannot open /proc/self/pagemap: Permission denied
> +
> +It is harmless in case PA are not needed.
> +

While this is probably worth having in the doc, I think we should really
include a note here about using vfio-pci rather than uio and therefore not
needing physical addresses.

> +Resource Limits
> +~~~~~~~~~~~~~~~
>  
>  When running as non-root user, there may be some additional resource limits
>  that are imposed by the system. Specifically, the following resource limits may
> @@ -34,7 +79,15 @@ need to be adjusted in order to ensure normal DPDK operation:
>  The above limits can usually be adjusted by editing
>  ``/etc/security/limits.conf`` file, and rebooting.
>  
> -Additionally, depending on which kernel driver is in use, the relevant
> +See `Hugepage Mapping <hugepage_mapping>`_
> +secton to learn how these limits affect EAL.

Typo: s/secton/section/

> +
> +Device Control
> +~~~~~~~~~~~~~~
> +
> +If the HPET is to be used, ``/dev/hpet`` permissions must be adjusted.
> +

Given that HPET has been off by default for years, I think we can probably
remove this line. Anyone still using it likely already knows this.

> +Depending on which kernel driver is in use, the relevant
>  resources also should be accessible by the user running the DPDK application.
>  
>  For ``vfio-pci`` kernel driver, the following Linux file system objects'
> @@ -64,6 +117,8 @@ system objects' permissions should be adjusted:
>         /sys/class/uio/uio0/device/config
>         /sys/class/uio/uio0/device/resource*
>  

I think our minimum supported kernel version is now >4.0 so I believe this
uio section should be removed as it's only applicable for earlier kernel
versions.

> +For ``virtio`` PMD in legacy mode, ``SYS_RAWIO`` capability is required
> +for ``iopl()`` call to enable access to PCI IO ports.
>  

How "legacy" is legacy-mode? Is it still likely in widespread use that we
need this?

>  Power Management and Power Saving Functionality
>  -----------------------------------------------
> diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
> index 5f0748fba1..70fa099d30 100644
> --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> @@ -228,6 +228,8 @@ Normally, these options do not need to be changed.
>      can later be mapped into that preallocated VA space (if dynamic memory mode
>      is enabled), and can optionally be mapped into it at startup.
>  
> +.. _hugepage_mapping:
> +
>  Hugepage Mapping
>  ^^^^^^^^^^^^^^^^
>  
> -- 
> 2.25.1
> 


More information about the stable mailing list