[dpdk-dev] [PATCH 0/9] mbuf: structure reorganization

Ananyev, Konstantin konstantin.ananyev at intel.com
Fri Mar 31 11:58:08 CEST 2017


Hi Olivier,

> 
> On Thu, 30 Mar 2017 18:06:35 +0000, "Ananyev, Konstantin" <konstantin.ananyev at intel.com> wrote:
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Thursday, March 30, 2017 5:48 PM
> > > To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Richardson, Bruce <bruce.richardson at intel.com>; Olivier Matz
> > > <olivier.matz at 6wind.com>
> > > Cc: dev at dpdk.org; mb at smartsharesystems.com; Chilikin, Andrey <andrey.chilikin at intel.com>; jblunck at infradead.org;
> > > nelio.laranjeiro at 6wind.com; arybchenko at solarflare.com
> > > Subject: RE: [dpdk-dev] [PATCH 0/9] mbuf: structure reorganization
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > Sent: Thursday, March 30, 2017 5:45 PM
> > > > To: Richardson, Bruce <bruce.richardson at intel.com>; Olivier Matz <olivier.matz at 6wind.com>
> > > > Cc: dev at dpdk.org; mb at smartsharesystems.com; Chilikin, Andrey <andrey.chilikin at intel.com>; jblunck at infradead.org;
> > > > nelio.laranjeiro at 6wind.com; arybchenko at solarflare.com
> > > > Subject: Re: [dpdk-dev] [PATCH 0/9] mbuf: structure reorganization
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Richardson, Bruce
> > > > > Sent: Thursday, March 30, 2017 1:23 PM
> > > > > To: Olivier Matz <olivier.matz at 6wind.com>
> > > > > Cc: dev at dpdk.org; Ananyev, Konstantin <konstantin.ananyev at intel.com>; mb at smartsharesystems.com; Chilikin, Andrey
> > > > > <andrey.chilikin at intel.com>; jblunck at infradead.org; nelio.laranjeiro at 6wind.com; arybchenko at solarflare.com
> > > > > Subject: Re: [dpdk-dev] [PATCH 0/9] mbuf: structure reorganization
> > > > >
> > > > > On Thu, Mar 30, 2017 at 02:02:36PM +0200, Olivier Matz wrote:
> > > > > > On Thu, 30 Mar 2017 10:31:08 +0100, Bruce Richardson <bruce.richardson at intel.com> wrote:
> > > > > > > On Wed, Mar 29, 2017 at 09:09:23PM +0100, Bruce Richardson wrote:
> > > > > > > > On Wed, Mar 29, 2017 at 05:56:29PM +0200, Olivier Matz wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > Does anyone have any other comments on this series?
> > > > > > > > > Can it be applied?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Olivier
> > > > > > > > >
> > > > > > > >
> > > > > > > > I assume all driver maintainers have done performance analysis to check
> > > > > > > > for regressions. Perhaps they can confirm this is the case.
> > > > > > > >
> > > > > > > > 	/Bruce
> > > > > > > > >
> > > > > > > In the absence of anyone else reporting performance numbers with this
> > > > > > > patchset, I ran a single-thread testpmd test using 2 x 40G ports with the
> > > > > > > i40e driver. With RX & TX descriptor ring sizes of 512 or above, I'm
> > > > > > > seeing a fairly noticeable performance drop. I still need to dig in more,
> > > > > > > e.g. do an RFC2544 zero-loss test, and also bisect the patchset to see
> > > > > > > what parts may be causing the problem.
> > > > > > >
> > > > > > > Has anyone else tried any other drivers or systems to see what the perf
> > > > > > > impact of this set may be?
> > > > > >
> > > > > > I did, of course. I didn't see any noticeable performance drop on
> > > > > > ixgbe (4 NICs, one port per NIC, 1 core). I can rerun the test with
> > > > > > the current version.
> > > > > >
> > > > > I had no doubt you did some perf testing! :-)
> > > > >
> > > > > Perhaps the regression I see is limited to the i40e driver. I've confirmed I
> > > > > still see it with that driver in zero-loss tests, so the next step is to try
> > > > > to localise which change in the patchset is causing it.
> > > > >
> > > > > Ideally, though, I think we should see acks or other comments from
> > > > > driver maintainers at least confirming that they have tested. You cannot
> > > > > be held responsible for testing every DPDK driver before you submit work
> > > > > like this.
> > > > >
> > > >
> > > > Unfortunately I also see a regression.
> > > > I did a quick flood test on a 2.8 GHz IVB with 4x10Gb.
> > >
> > > Sorry, forgot to mention - it is on ixgbe.
> > > So it doesn't look i40e-specific.
> > >
> > > > Observed a drop even with default testpmd RXD/TXD numbers (128/512):
> > > > from 50.8 Mpps down to 47.8 Mpps.
> > > > From what I am seeing, the particular patch causing it is:
> > > > [dpdk-dev,3/9] mbuf: set mbuf fields while in pool
> > > >
> > > > cc version 5.3.1 20160406 (Red Hat 5.3.1-6) (GCC)
> > > > cmdline:
> > > > ./dpdk.org-1705-mbuf1/x86_64-native-linuxapp-gcc/app/testpmd --lcores='7,8' -n 4 \
> > > >   --socket-mem='1024,0' -w 04:00.1 -w 07:00.1 -w 0b:00.1 -w 0e:00.1 -- -i
> > > >
> >
> > Actually one more question regarding:
> > [dpdk-dev,9/9] mbuf: reorder VLAN tci and buffer len fields
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index fd97bd3..ada98d5 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -449,8 +449,7 @@ struct rte_mbuf {
> >
> >  	uint32_t pkt_len;         /**< Total pkt len: sum of all segments. */
> >  	uint16_t data_len;        /**< Amount of data in segment buffer. */
> > -	/** VLAN TCI (CPU order), valid if PKT_RX_VLAN_STRIPPED is set. */
> > -	uint16_t vlan_tci;
> > +	uint16_t buf_len;         /**< Size of segment buffer. */
> >
> >  	union {
> >  		uint32_t rss;     /**< RSS hash result if RSS enabled */
> > @@ -475,11 +474,11 @@ struct rte_mbuf {
> >  		uint32_t usr;	  /**< User defined tags. See rte_distributor_process() */
> >  	} hash;                   /**< hash information */
> >
> > +	/** VLAN TCI (CPU order), valid if PKT_RX_VLAN_STRIPPED is set. */
> > +	uint16_t vlan_tci;
> >  	/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ_STRIPPED is set. */
> >  	uint16_t vlan_tci_outer;
> >
> > -	uint16_t buf_len;         /**< Length of segment buffer. */
> > -
> >  	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
> >  	 * are not normalized but are always the same for a given port.
> >  	 */
> >
> > How are the ixgbe and i40e SSE versions supposed to work correctly after that change?
> > As I remember, both of them set vlan_tci as part of a 16B shuffle operation.
> > Something like that:
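> > /* shuf_msk lines descriptor bytes up with the 16B block of mbuf fields
> >  * that follows rx_descriptor_fields1 (pkt_len, data_len, vlan_tci,
> >  * hash.rss), so a single 16B store can set them all in one go: */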
> > pkt_mb4 = _mm_shuffle_epi8(descs[3], shuf_msk);
> > ...
> > _mm_storeu_si128((void *)&rx_pkts[pos + 3]->rx_descriptor_fields1,
> >                 pkt_mb4);
> >
> > But now vlan_tci is swapped with buf_len.
> > Which means two things to me:
> > vlan_tci is now more than 16B away from rx_descriptor_fields1 and can't
> > be updated in one go anymore, and the store now clobbers buf_len with
> > the value meant for vlan_tci.
> 
> 
> Sorry, I missed it. But this shows something problematic: changing the
> order of fields in the structure silently breaks code, with no warning
> at build time. I think that drivers expecting a field at a specific
> position should have some BUG_ON() checks to verify that the condition
> still holds. We can't expect anyone to know the constraints of all the
> vector PMDs in DPDK.
> 
> The original idea of this patch was to group vlan_tci and vlan_tci_outer,
> which looked like a good idea at first glance. If it requires changing
> all the vector code, let's drop it.
> 
> Just for the exercise, let's imagine we need that patch. What would be
> the procedure to get it integrated? How would we detect that there is an
> issue? Who would be in charge of modifying all the vector code in the PMDs?
> 

Indeed, right now there is no way to know what a PMD's requirements on the
mbuf layout are. Adding BUG_ON() checks into each particular RX/TX
implementation that has such constraints seems like a very good idea to me.
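Something along these lines in the vector RX code, for example (just a
sketch; the offsets assume the current layout, where vlan_tci directly
follows data_len inside the 16B block after the marker):

/* The SSE RX path fills rx_descriptor_fields1 with a single 16B store,
 * so the fields it covers must keep their relative offsets. */
RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) !=
	offsetof(struct rte_mbuf, rx_descriptor_fields1) + 8);
RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, vlan_tci) !=
	offsetof(struct rte_mbuf, rx_descriptor_fields1) + 10);
RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, hash) !=
	offsetof(struct rte_mbuf, rx_descriptor_fields1) + 12);

With that in place, a reordering like patch 9/9 would fail the build instead
of silently corrupting mbufs at run time.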
Apart from that, I don't know off-hand how we can make restructuring the
mbuf less painful.
Konstantin