[PATCH v2 00/71] replace use of fixed size rte_memcpy

Morten Brørup mb at smartsharesystems.com
Sat Mar 2 18:32:25 CET 2024


> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, 2 March 2024 17.38
> 
> On Sat, 2 Mar 2024 14:05:45 +0100
> Morten Brørup <mb at smartsharesystems.com> wrote:
> 
> > >
> > > > My experience with replacing rte_memcpy() with memcpy() (or vice
> > > > versa) is mixed.
> > > >
> > > > I've also tried just dropping the DPDK-custom memcpy() implementation
> > > > altogether, and that caused a performance drop (in a particular app,
> > > > on a particular compiler and CPU).
> >
> > I guess the compilers are just not where we want them to be yet.
> >
> > I don't mind generally replacing rte_memcpy() with memcpy() in the
> > control plane.
> > But we should use whatever is more efficient in the data plane.
> >
> > We must also keep in mind that DPDK supports old distros with old
> > compilers. We should not remove a superfluous hand crafted optimization
> > if a supported old compiler hasn't caught up with it yet, i.e. if it
> > isn't superfluous on some of the old compilers supported by DPDK.
> 
> When I scanned the result:
>         1. Most copies were small (like Ether address or IPv6 address),
>            and compiler inlining should beat a function call every time.

Please note that rte_memcpy() is inline, so no function call is involved.

>         2. Larger structure copies were in control path.

Yep, I saw the same two things when scanning v1 of the series before acking it.
If we didn't overlook any fast-path copies, this series is a good clean-up.

I must admit that I am assuming any compiler's built-in memcpy() can efficiently copy small structures of build-time constant size.
Assumptions are the mother of all FUs, but being wrong on this would be a very big surprise to me.
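
To make that assumption concrete, here is a minimal sketch of the kind of copy in question. The type and function names are hypothetical and only mimic the shape of an Ethernet or IPv6 address; they are not DPDK definitions. Because the size passed to memcpy() is a build-time constant, I would expect a reasonably modern compiler to expand the call into a few plain loads and stores at -O2, but that is an expectation to check in the generated code for each supported compiler, not a guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 6-byte MAC address type (illustration only, not a DPDK type). */
struct ether_addr_example {
	uint8_t addr_bytes[6];
};

/* Hypothetical 16-byte IPv6 address type (illustration only). */
struct ipv6_addr_example {
	uint8_t a[16];
};

/*
 * Copy with a size known at build time: sizeof(*dst) is a constant 6,
 * so the compiler is free to replace the memcpy() call with inline
 * loads and stores instead of calling the library routine.
 */
static inline void
ether_addr_copy_example(struct ether_addr_example *dst,
			const struct ether_addr_example *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/*
 * Plain structure assignment is an alternative for fixed-size types;
 * it copies the same 16 bytes and leaves the strategy to the compiler.
 */
static inline void
ipv6_addr_copy_example(struct ipv6_addr_example *dst,
		       const struct ipv6_addr_example *src)
{
	*dst = *src;
}
```

Comparing the output of something like "gcc -O2 -S" across the old and new compilers DPDK supports would show whether the call is really expanded inline on each of them.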


