[dpdk-dev] rte_mbuf.next in 2nd cacheline

Bruce Richardson bruce.richardson at intel.com
Mon Jun 15 16:12:58 CEST 2015


On Mon, Jun 15, 2015 at 04:05:05PM +0200, Olivier MATZ wrote:
> Hi,
> 
> On 06/15/2015 03:54 PM, Ananyev, Konstantin wrote:
> > 
> > 
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> >> Sent: Monday, June 15, 2015 2:44 PM
> >> To: Olivier MATZ
> >> Cc: dev at dpdk.org; Damjan Marion (damarion)
> >> Subject: Re: [dpdk-dev] rte_mbuf.next in 2nd cacheline
> >>
> >> On Mon, Jun 15, 2015 at 03:20:22PM +0200, Olivier MATZ wrote:
> >>> Hi Damjan,
> >>>
> >>> On 06/10/2015 11:47 PM, Damjan Marion (damarion) wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> We noticed a 7% performance improvement simply by moving the rte_mbuf.next field to the 1st cache line.
> >>>>
> >>>> Currently, it falls under /* second cache line - fields only used in slow path or on TX */,
> >>>> but it is actually used in several places in the RX fast path (e.g. i40e_rx_alloc_bufs() sets that field to NULL).
> >>>>
> >>>> Is there anything we can do here (stop using next field, or move it to 1st cache line)?
> >>>
> >>> Agree, this is also something I noticed, see:
> >>> http://dpdk.org/ml/archives/dev/2015-February/014400.html
> >>>
> >>> I did not have the time to do performance testing, but it's something
> >>> I'd like to do as soon as I can. I don't see any obvious reason not to
> >>> do it.
> >>>
> >>> It seems we currently have just enough room to do it (8 bytes are
> >>> left in the first cache line when compiled for 64-bit).
> >>
> >> This, to me, is the obvious reason not to do it! It prevents us from taking
> >> any other RX fast-path offload fields into the mbuf.
> >>
> >> That being said, I can see why we might want to look to move it - but from the
> >> work done in the ixgbe driver, I'd be hopeful we can get as much performance with
> >> it on the second cache line for most cases, through judicious use of prefetching,
> >> or otherwise.
> >>
> >> It took a lot of work and investigation to get free space in the mbuf - especially
> >> in cache line 0, and I'd like to avoid just filling the cache line up again as
> >> long as we possibly can!
> > 
> > Yep, agree with Bruce here.
> > Plus, with packet_type growing to 4 bytes and the addition of vlan_tci_outer,
> > we just don't have 8 free bytes in the first cache line any more.
> 
> I don't understand why m->next would not be a better candidate than
> the RX offload fields for the first cache line. For instance, m->next
> is mandatory and must be initialized when allocating an mbuf (compare
> with m->seqn, for instance, which is also in the first cache
> line). So if we want to make some room in the first cache line, I
> think we can.
> 
> To me, the only reason for not doing it now is that we don't
> have a full performance test report (several use cases, drivers, ...)
> showing that it's better.
>
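As a side note, the layout point in the quoted discussion is easy to check
against the headers. A quick, throwaway check program (not part of DPDK, just
for illustration) prints where "next" lands; on a 64-bit build the offset falls
on the second 64-byte cache line:

#include <stddef.h>
#include <stdio.h>

#include <rte_mbuf.h>

int main(void)
{
        /* "next" currently sits past the first 64-byte cache line. */
        printf("offsetof(struct rte_mbuf, next) = %zu\n",
               offsetof(struct rte_mbuf, next));
        printf("sizeof(struct rte_mbuf)         = %zu\n",
               sizeof(struct rte_mbuf));
        return 0;
}
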
To answer the question above: the "next" field does not actually have to be set at
initialization time. It can instead be set only when needed, and cleared again on
free if it was used.

The next pointer always starts out as NULL when the mbuf pool is created. The
only time it is set to non-NULL is when we have chained mbufs. If we never have
any chained mbufs, we never need to touch the next field, or even read it - since
we have the number-of-segments count (nb_segs) in the first cache line. If we do
have a multi-segment mbuf, it's likely to be a big packet, so we have more
processing time available and can afford to take the hit of setting the next
pointer. Whenever we go to free the mbufs for that packet, the freeing code
obviously needs to read the next pointer in order to free all the buffers in the
chain, so it can also reset the next pointer to NULL while doing so.
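
To make that concrete, here is a minimal sketch of the free side of that scheme
(the helper name is made up for illustration - it's not existing API). The
invariant is that every mbuf sitting in the pool has next == NULL, and the
chain-walking free code is what restores it:

#include <rte_mbuf.h>

/* Free a whole packet chain while restoring the "next == NULL in the pool"
 * invariant, so that the allocation/RX path never has to write the field. */
static inline void
pktmbuf_free_chain_sketch(struct rte_mbuf *m)
{
        while (m != NULL) {
                struct rte_mbuf *m_next = m->next;

                /* Only chained mbufs ever have a non-NULL next, so this
                 * write only happens on the (rarer) multi-segment path. */
                if (m_next != NULL)
                        m->next = NULL;

                rte_pktmbuf_free_seg(m);
                m = m_next;
        }
}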

In this way, we can ensure that the next pointer on cache line 1 is not a problem
in our fast path.
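
For comparison with the i40e_rx_alloc_bufs() behaviour mentioned at the top of
the thread, the RX rearm path under this scheme would look roughly like the
sketch below (simplified and hypothetical, not actual driver code): it relies
on the pool invariant instead of writing next, so only the first cache line of
each mbuf gets touched:

#include <stdint.h>

#include <rte_mbuf.h>

/* Rearm freshly allocated mbufs for RX without touching "next": we rely on it
 * already being NULL for every mbuf coming out of the pool. */
static inline void
rx_rearm_bufs_sketch(struct rte_mbuf **mbufs, unsigned int n, uint8_t port)
{
        unsigned int i;

        for (i = 0; i < n; i++) {
                struct rte_mbuf *mb = mbufs[i];

                rte_mbuf_refcnt_set(mb, 1);
                mb->data_off = RTE_PKTMBUF_HEADROOM;
                mb->nb_segs = 1;        /* segment count stays in cache line 0 */
                mb->port = port;
                /* no "mb->next = NULL" here - cache line 1 is left alone */
        }
}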

/Bruce

