[dpdk-stable] [dpdk-dev] [PATCH] mempool: fix mempool obj alignment for non x86
Honnappa Nagarahalli
Honnappa.Nagarahalli at arm.com
Fri Dec 20 22:07:03 CET 2019
<snip>
> > > From: Jerin Jacob <jerinj at marvell.com>
> > >
> > > The existing optimize_object_size() function addresses the memory
> > > object alignment constraint on x86 for better performance.
> > >
> > > Different (micro)architectures may have different memory alignment
> > > constraints for better performance, and these are not the same as what
> > > the existing optimize_object_size() function implements. Some use an
> > > XOR (kind of CRC) scheme to enable DRAM channel distribution based on
> > > the address, and some may have a different formula.
> > If I understand correctly, address interleaving is a characteristic of the
> > memory controller and not the CPU.
> > For example, different SoCs using the same Arm architecture might have
> > different memory controllers. So, the solution should not be architecture
> > specific, but SoC specific.
>
> Yes. See below.
>
> > > -static unsigned optimize_object_size(unsigned obj_size)
> > > +static unsigned
> > > +arch_mem_object_align(unsigned obj_size)
> > > {
> > > unsigned nrank, nchan;
> > > unsigned new_obj_size;
> > > @@ -99,6 +101,13 @@ static unsigned optimize_object_size(unsigned
> > > obj_size)
> > > new_obj_size++;
> > > return new_obj_size * RTE_MEMPOOL_ALIGN; }
> > > +#else
> > This applies to all Arm (PPC as well) SoCs, which might have different
> > schemes depending on the memory controller. IMO, this should not be
> > architecture specific.
>
> I agree in principle.
> I will summarize the
> https://www.mail-archive.com/dev@dpdk.org/msg149157.html feedback:
>
> 1) For the x86 arch, it is architecture-specific.
> 2) For the Power PC arch, it is architecture-specific.
> 3) For the ARM case, it will be memory controller specific.
> 4) For the ARM case, the memory controller is not using the existing
> x86 arch formula.
> 5) If it is memory/arch-specific, can userspace code find the optimal
> alignment? In the case of octeontx2/arm64, the memory controller does an XOR
> on the physical address (PA), over which userspace code doesn't have much
> control.
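To make point (5) concrete, here is a minimal sketch of an XOR-folding channel hash. It is purely hypothetical: the function name, the 64-byte line offset and the way the bits are grouped are assumptions for illustration only, not the octeontx2 formula or any real memory controller's scheme.

```c
#include <stdint.h>

/*
 * Hypothetical XOR-folding channel hash (illustration only).
 * Drops an assumed 64-byte cache line offset, then XORs successive
 * groups of chan_bits physical address bits into a channel index.
 */
static unsigned
sketch_xor_channel(uint64_t pa, unsigned chan_bits)
{
	uint64_t v = pa >> 6; /* assumed 64-byte line granularity */
	unsigned mask;
	unsigned chan = 0;

	/* Guard: a zero or oversized shift would loop forever or be undefined */
	if (chan_bits == 0 || chan_bits >= 32)
		return 0;

	mask = (1u << chan_bits) - 1;
	while (v != 0) {
		chan ^= (unsigned)(v & mask);
		v >>= chan_bits;
	}
	return chan;
}
```

With two channel bits, PA 0x40 and 0x80 land on channels 1 and 2, while 0x140 folds back to channel 0. The channel depends on upper physical address bits as well, which userspace working with virtual addresses does not see directly.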
>
> This patch addresses the known cases of (1), (2), (4) and (5). (2) can be
> added to this framework when POWER9 folks want it.
>
> We can extend this patch to address (3) if there is a case. There is no actual
> requirement today; if someone can share an alignment formula that is
> memory controller specific and does not come under (4), then we can
> create an extra layer for the memory controller and an abstraction to probe it.
> Again, there is no standard way of probing the memory controller from
> userspace, so we would need a platform #define, which won't work for
> distribution builds.
> So the solution needs to be arch-specific first and then fine-tuned to the
> memory controller if possible.
>
> I can work on creating an extra layer of code if someone can provide the
> details of the memory controller and probing mechanism, or this patch can be
> extended to support such a case if it arises in future.
Inputs for BlueField, DPAAx, ThunderX2 would be helpful.
>
> Thoughts?
How much memory will this save for your platform? Is it affecting performance?
>
> >
> > > +static unsigned
> > > > +arch_mem_object_align(unsigned obj_size)
> > > > +{
> > > + return obj_size;
> > > +}
> > > +#endif
> > >
> > > struct pagesz_walk_arg {
> > > int socket_id;
> > > @@ -234,8 +243,8 @@ rte_mempool_calc_obj_size(uint32_t elt_size,
> > > uint32_t flags,
> > > */
> > > if ((flags & MEMPOOL_F_NO_SPREAD) == 0) {
> > > unsigned new_size;
> > > > - new_size = optimize_object_size(sz->header_size + sz->elt_size +
> > > > - sz->trailer_size);
> > > + new_size = arch_mem_object_align
> > > + (sz->header_size + sz->elt_size +
> > > + sz->trailer_size);
> > > sz->trailer_size = new_size - sz->header_size - sz->elt_size;
> > > }
> > >
> > > --
> > > 2.24.1
> >
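To help estimate the padding in question, below is a rough standalone sketch of the spreading logic as I read the x86 path: round the object up to alignment units, then grow it until the unit count is coprime with channels times ranks. The sketch_ names, the 64-byte alignment and the fallback of 4 channels and 1 rank are assumptions for illustration, not the DPDK API.

```c
/* Assumed alignment unit in bytes (stand-in for RTE_MEMPOOL_ALIGN) */
#define SKETCH_ALIGN 64u

/* Greatest common divisor, Euclid's algorithm */
static unsigned
sketch_gcd(unsigned a, unsigned b)
{
	while (b != 0) {
		unsigned t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/*
 * Return the padded object size: the smallest multiple of SKETCH_ALIGN
 * that is >= obj_size and whose unit count is coprime with nchan * nrank.
 * Zero nchan/nrank fall back to assumed defaults of 4 and 1.
 */
static unsigned
sketch_spread_object_size(unsigned obj_size, unsigned nchan, unsigned nrank)
{
	unsigned units;

	if (nchan == 0)
		nchan = 4;
	if (nrank == 0)
		nrank = 1;

	units = (obj_size + SKETCH_ALIGN - 1) / SKETCH_ALIGN;
	while (sketch_gcd(units, nchan * nrank) != 1)
		units++;
	return units * SKETCH_ALIGN;
}

/* Bytes of padding added per object by the spreading step */
static unsigned
sketch_spread_padding(unsigned obj_size, unsigned nchan, unsigned nrank)
{
	return sketch_spread_object_size(obj_size, nchan, nrank) - obj_size;
}
```

For a 2176-byte object with 4 channels and 1 rank, this yields 2240 bytes, i.e. 64 bytes of padding per object. That is the amount a platform would save if the spreading step were skipped for it.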