[dpdk-dev] mbuff rearm_data aligmenet issue on non x86

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu May 12 15:14:34 CEST 2016


> 
> On Thu, May 12, 2016 at 10:07:09AM +0000, Ananyev, Konstantin wrote:
> > Hi Jerrin,
> >
> > >
> > > Hi All,
> > >
> > > I would like align mbuff rearm_data field to 8 byte aligned so that
> > > write to mbuf->rearm_data with uint64_t* will be naturally aligned.
> > > I am not sure about IA but some other architecture/implementation has overhead
> > > in non-naturally aligned stores.
> > >
> > > Proposed patch is something like this below, But open for any change to
> > > make fit for all other architectures/platform.
> > >
> > > Any thoughts ?
> > >
> > > ➜ [master] [dpdk-master] $ git diff
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 529debb..5a917d0 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -733,10 +733,8 @@ struct rte_mbuf {
> > >         void *buf_addr;           /**< Virtual address of segment
> > > buffer. */
> > >         phys_addr_t buf_physaddr; /**< Physical address of segment
> > > buffer. */
> > >
> > > -       uint16_t buf_len;         /**< Length of segment buffer. */
> > > -
> >
> >
> > There is no need to move buf_len itself, I think.
> > Just move rearm_data marker prior to buf_len is enough.
> > Though how do you suggest to deal with the fact, that right now we blindly
> > update the whole 64bits pointed by rearm_data:
> >
> > drivers/net/ixgbe/ixgbe_rxtx_vec.c:
> > 	/*
> >                  * Flush mbuf with pkt template.
> >                  * Data to be rearmed is 6 bytes long.
> >                  * Though, RX will overwrite ol_flags that are coming next
> >                  * anyway. So overwrite whole 8 bytes with one load:
> >                  * 6 bytes of rearm_data plus first 2 bytes of ol_flags.
> >                  */
> >                 p0 = (uintptr_t)&mb0->rearm_data;
> >                 *(uint64_t *)p0 = rxq->mbuf_initializer;
> >
> > ?
> >
> > If buf_len will be inside these 64bits, we can't do it anymore.
> >
> > Are you suggesting something like:
> >
> > uint64_t *p0, v0;
> >
> > p0 = &mb0->rearm_data;
> > v0 = *p0 & REARM_MASK;
> > *p0 = v0 | rxq->mbuf_initializer;
> > ?
> 
> Due to unaligned rearm_data issue, In ThunderX platform, we need to write
> multiple half word of aligned stores(so masking was better us).

Ok, so what would be the gain on ARM if you'll make that change?
Again, what would be the drop (if any) on IA?

> But I think, if we can put 16bit hole between port and ol_flags then
> we may not need the masking stuff in ixgbe. Right?

You mean move buf_len somewhere else (end of cacheline0) and 
introduce a 2B hole between port and ol_flags, right?
Yep, that probably wouldn't have any performance impact.

> 
> OR
> 
> Even better, if we can fill in a uint16_t variable which will replaced
> later in the flow like "data_len"?

data_len is grouped  with rx_descriptor_fields1 on purpose -
so we can update packet_type,pkt_len, data_len,vlan_tci,rss with one 16B write.

Konstantin

> and move buf_len at end the first  cache line? 
>or any other thoughts to fix unaligned rearm_data issue?
> 
> Jerin
> 
> 
> 
> >
> > If so I wonder what would be the performance impact of that change.
> > Konstantin
> >
> >
> > >         /* next 6 bytes are initialised on RX descriptor rearm */
> > > -       MARKER8 rearm_data;
> > > +       MARKER64 rearm_data;
> > >         uint16_t data_off;
> > >
> > >         /**
> > > @@ -754,6 +752,7 @@ struct rte_mbuf {
> > >         uint8_t nb_segs;          /**< Number of segments. */
> > >         uint8_t port;             /**< Input port. */
> > >
> > > +       uint16_t buf_len;         /**< Length of segment buffer. */
> > >         uint64_t ol_flags;        /**< Offload features. */
> > >
> > >         /* remaining bytes are set on RX when pulling packet from
> > >  * descriptor
> > >
> > > /Jerin


More information about the dev mailing list