[dpdk-dev] [PATCH] net/mlx5: remmap UAR address for multiple process

Xueming(Steven) Li xuemingl at mellanox.com
Tue Jan 23 10:50:42 CET 2018


Hi Nelio,

> -----Original Message-----
> From: Nélio Laranjeiro [mailto:nelio.laranjeiro at 6wind.com]
> Sent: Monday, January 22, 2018 10:53 PM
> To: Xueming(Steven) Li <xuemingl at mellanox.com>
> Cc: Shahaf Shuler <shahafs at mellanox.com>; dev at dpdk.org
> Subject: Re: [PATCH] net/mlx5: remmap UAR address for multiple process
> 
> Hi Xueming,
> 
> On Fri, Jan 19, 2018 at 11:08:54PM +0800, Xueming Li wrote:
> > UAR(doorbell) is hw resources that have to be same address between
> > primary and secondary process, failed to mmap UAR will make TX packets
> > invisible to HW.
> > Today, UAR address returned from verbs api is mixed in heap and loaded
> > library address space, prone to be occupied in secondary process.
> > This patch reserves a dedicate UAR address space, both primary and
> > secondary process re-mmap UAR pages into this space.
> > Below is a brief picture of dpdk app address space allocation:
> > 	Before			This patch
> > 	------			----------
> > 	[stack]			[stack]
> > 	[.so, uar, heap]	[.so, heap]
> > 	[(empty)]		[(empty)]
> > 	[hugepage]		[hugepage]
> > 	[? others]		[? others]
> > 	[(empty)]		[(empty)]
> > 				[uar]
> > 				[(empty)]
> > To minimize conflicts, UAR address space comes after hugepage space
> > with an offset to skip potential usage from other drivers.
> 
> Seems it is not the case when the memory is contiguous, according to what
> I see in my testpmd /proc/<pid>/maps:
> 
>  PMD: mlx5.c:523: mlx5_uar_init_primary(): Reserved UAR address space:
> 0x0x7f4da5800000
> 
> And the first huge page is at address 0x7f4fa5800000, new UAR space is
> before and not after.
> 
> With this patch I still have the situation described as "before".
> 

Your observation is correct: the system allocates addresses from high to
low, like a stack. The UAR address range is 0x7f4da5800000 - 0x7f4ea5800000
(4GB in size); with another 4GB offset, the hugepage range starts at
0x7f4fa5800000.
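
To make the arithmetic above easy to check, here is a minimal sketch. It
assumes a 4GB UAR size and a 4GB offset, which is what the addresses above
imply; the real values are MLX5_UAR_SIZE and MLX5_UAR_OFFSET in mlx5_defs.h.
The macro and helper names below are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Assumed values, inferred from the address ranges quoted in this mail;
 * the real definitions are MLX5_UAR_SIZE / MLX5_UAR_OFFSET. */
#define UAR_SIZE_ASSUMED   (1ULL << 32)	/* 4GB reserved for UAR pages */
#define UAR_OFFSET_ASSUMED (1ULL << 32)	/* 4GB gap below the hugepages */

/* Hypothetical helper: start of the UAR reservation, given the lowest
 * hugepage address. Returns 0 if the subtraction would underflow. */
static uint64_t
uar_start(uint64_t hugepage_low)
{
	uint64_t gap = UAR_OFFSET_ASSUMED + UAR_SIZE_ASSUMED;

	if (hugepage_low < gap)
		return 0;
	return hugepage_low - gap;
}

/* Hypothetical helper: end (exclusive) of the UAR reservation. */
static uint64_t
uar_end(uint64_t hugepage_low)
{
	uint64_t start = uar_start(hugepage_low);

	return start ? start + UAR_SIZE_ASSUMED : 0;
}
```

With hugepage_low = 0x7f4fa5800000 this gives the 0x7f4da5800000 -
0x7f4ea5800000 range seen in your log.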

> > Once UAR space reserved successfully, UAR pages are re-mmapped into
> > new area to keep UAR address aligned between primary and secondary
> > process.
> >
> > Signed-off-by: Xueming Li <xuemingl at mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5.c         | 107 ++++++++++++++++++++++++++++++++++++++++
> >  drivers/net/mlx5/mlx5.h         |   1 +
> >  drivers/net/mlx5/mlx5_defs.h    |  10 ++++
> >  drivers/net/mlx5/mlx5_rxtx.h    |   3 +-
> >  drivers/net/mlx5/mlx5_trigger.c |   7 ++-
> >  drivers/net/mlx5/mlx5_txq.c     |  51 +++++++++++++------
> >  6 files changed, 163 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> > index fc2d59fee..1539ef608 100644
> > --- a/drivers/net/mlx5/mlx5.c
> > +++ b/drivers/net/mlx5/mlx5.c
> > @@ -39,6 +39,7 @@
> >  #include <stdlib.h>
> >  #include <errno.h>
> >  #include <net/if.h>
> > +#include <sys/mman.h>
> >
> >  /* Verbs header. */
> >  /* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
> > @@ -56,6 +57,7 @@
> >  #include <rte_pci.h>
> >  #include <rte_bus_pci.h>
> >  #include <rte_common.h>
> > +#include <rte_eal_memconfig.h>
> >  #include <rte_kvargs.h>
> >
> >  #include "mlx5.h"
> > @@ -466,6 +468,101 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
> >
> >  static struct rte_pci_driver mlx5_driver;
> >
> > +/*
> > + * Reserved UAR address space for TXQ UAR(hw doorbell) mapping, process
> > + * local resource used by both primary and secondary to avoid duplicate
> > + * reservation.
> > + * The space has to be available on both primary and secondary process,
> > + * TXQ UAR maps to this area using fixed mmap w/o double check.
> > + */
> > +static void *uar_base;
> > +
> > +/**
> > + * Reserve UAR address space for primary process
> > + *
> > + * @param[in] priv
> > + *   Pointer to private structure.
> > + *
> > + * @return
> > + *   0 on success, negative errno value on failure.
> > + */
> > +static int
> > +mlx5_uar_init_primary(struct priv *priv)
> > +{
> > +	void *addr = (void *)0;
> > +	int i;
> > +	const struct rte_mem_config *mcfg;
> > +
> > +	if (uar_base) { /* UAR address space mapped */
> > +		priv->uar_base = uar_base;
> > +		return 0;
> > +	}
> > +	/* find out lower bound of hugepage segments */
> > +	mcfg = rte_eal_get_configuration()->mem_config;
> > +	for (i = 0; i < RTE_MAX_MEMSEG && mcfg->memseg[i].addr; i++) {
> > +		if (addr)
> > +			addr = RTE_MIN(addr, mcfg->memseg[i].addr);
> > +		else
> > +			addr = mcfg->memseg[i].addr;
> 
> This if/else is useless as addr is already initialised with the smallest
> possible value.

That is how my original code looked :-) and addr always came out as zero.
addr here is the lower bound of the hugepage segments, so we don't want it
to stay at zero.
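
A small standalone sketch of the difference, outside of DPDK. The segment
array is a stand-in for mcfg->memseg[i].addr and both function names are
hypothetical; only the loop logic mirrors the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Without the guard: starting from 0, the minimum never moves off 0. */
static uint64_t
lowest_naive(const uint64_t *seg, size_t n)
{
	uint64_t addr = 0;
	size_t i;

	for (i = 0; i < n && seg[i]; i++)
		addr = addr < seg[i] ? addr : seg[i];
	return addr;
}

/* With the guard, as in the patch: 0 means "not set yet", so the first
 * segment seeds addr and later segments can only lower it. */
static uint64_t
lowest_guarded(const uint64_t *seg, size_t n)
{
	uint64_t addr = 0;
	size_t i;

	for (i = 0; i < n && seg[i]; i++) {
		if (addr)
			addr = addr < seg[i] ? addr : seg[i];
		else
			addr = seg[i];
	}
	return addr;
}
```

An alternative would be to initialise addr to the highest possible address
and drop the branch, but then the "no segment found" case needs its own
check.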

> 
> > +	}
> > +	/* offset down UAR area */
> > +	addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
> 
> Seems the error is here, the loops get the address of the memseg with the
> smallest address and then it subtract the UAR size, addr cannot be after
> the huge pages unless if this subtraction overflows.

Thanks. By "after" I meant the address allocation order: the UAR block sits
below "hugepage" in the overall picture.

> 
> > +	/* anonymous mmap, no real memory consumption */
> > +	addr = mmap(addr, MLX5_UAR_SIZE,
> > +		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > +	if (addr == MAP_FAILED) {
> > +		ERROR("Failed to reserve UAR address space, please adjust "
> > +		      "MLX5_UAR_SIZE or try --base-virtaddr");
> 
> How does a user knows the UAR memory space the NIC needs to adjust the
> MLX5_UAR_SIZE?
> 
> > +		return -ENOMEM;
> > +	}
> > +	/* Accept either same addr or a new addr returned from mmap if target
> > +	 * range occupied.
> > +	 */
> > +	INFO("Reserved UAR address space: 0x%p", addr);
> 
> The '%p' already prefix the address with the 0x.
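
For reference, the output of "%p" is implementation-defined in ISO C, but
glibc prints non-NULL pointers in hex with a leading 0x, which is why the
log above shows "0x0x". A quick check, assuming glibc (the helper name is
hypothetical):

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if "%p" on this libc already emits a "0x" prefix for the
 * given (non-NULL) pointer, 0 otherwise. */
static int
percent_p_has_0x_prefix(void *p)
{
	char buf[32];

	snprintf(buf, sizeof(buf), "%p", p);
	return strncmp(buf, "0x", 2) == 0;
}
```

So the format string in the INFO() call only needs "%p".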
> 
> > +	priv->uar_base = addr; /* for primary and secondary UAR re-mmap */
> > +	uar_base = addr; /* process local, don't reserve again */
> > +	return 0;
> > +}
> > +
> <snip/>
> 
> Regards,
> 
> --
> Nélio Laranjeiro
> 6WIND

