[dpdk-stable] [dpdk-dev] [PATCH v4] net/bonding: per-slave intermediate rx ring

Matan Azrad matan at mellanox.com
Wed Aug 22 09:09:11 CEST 2018


Hi Chas

From: Chas Williams
>On Tue, Aug 21, 2018 at 11:43 AM Matan Azrad <mailto:matan at mellanox.com> wrote:
>Hi Chas
>
>From: Chas Williams
>> On Tue, Aug 21, 2018 at 6:56 AM Matan Azrad <mailto:matan at mellanox.com>
>> wrote:
>> Hi
>> 
>> From: Chas Williams
>> > This will need to be implemented for some of the other RX burst
>> > methods at some point for other modes to see this performance
>> > improvement (with the exception of active-backup).
>> 
>> Yes, I think it should be done at least to
>> bond_ethdev_rx_burst_8023ad_fast_queue (should be easy) for now.
>> 
>> There is some duplicated code between the various RX paths.
>> I would like to eliminate that as much as possible, so I was going to give that
>> some thought first.
>
>There is no reason to leave this function as it is while its twin is changed.
>
>Unfortunately, this is all the patch I have at this time.
> 
>
>> 
>> 
>> > On Thu, Aug 16, 2018 at 9:32 AM Luca Boccassi <mailto:bluca at debian.org> wrote:
>> >
>> > > During bond 802.3ad receive, a burst of packets is fetched from each
>> > > slave into a local array and appended to per-slave ring buffer.
>> > > Packets are taken from the head of the ring buffer and returned to
>> > > the caller.  The number of mbufs provided to each slave is
>> > > sufficient to meet the requirements of the ixgbe vector receive.
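For context, here is a rough sketch of the mechanism the commit message describes. It is simplified, and the names (bond_rx_with_ring, slave_ring, VECTOR_BURST) are illustrative, not taken from the actual patch:

/*
 * Simplified sketch of the per-slave intermediate ring described above.
 * Error and ring-overflow handling is omitted for brevity.
 */
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_ring.h>

#define VECTOR_BURST 32	/* enough to satisfy vector RX paths such as ixgbe */

static uint16_t
bond_rx_with_ring(uint16_t slave_port, uint16_t queue_id,
		  struct rte_ring *slave_ring,	/* per-slave ring buffer */
		  struct rte_mbuf **bufs, uint16_t nb_pkts)
{
	struct rte_mbuf *local[VECTOR_BURST];
	uint16_t nb_rx;

	/* Always poll the slave for a full vector-sized burst... */
	nb_rx = rte_eth_rx_burst(slave_port, queue_id, local, VECTOR_BURST);

	/* ...and append whatever arrived to the per-slave ring.  A full
	 * ring would drop mbufs here; the real patch has to handle that. */
	if (nb_rx > 0)
		rte_ring_enqueue_burst(slave_ring, (void **)local, nb_rx, NULL);

	/* The caller is served from the head of the ring, so it may receive
	 * packets that were fetched on an earlier call. */
	return rte_ring_dequeue_burst(slave_ring, (void **)bufs, nb_pkts, NULL);
}

The idea is that each slave is always polled with a burst large enough for its vector path, while the caller is served from the head of the ring.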
>> 
>> Luca,
>> 
>> Can you explain these requirements of ixgbe?
>> 
>> The ixgbe (and some other Intel PMDs) have vectorized RX routines that are
>> more efficient (if not faster) taking advantage of some advanced CPU
>> instructions.  I think you need to be receiving at least 32 packets or more.
>
>So why do it in bonding, which is a generic driver for all vendors' PMDs?
>If it is better for ixgbe and other Intel NICs, you can force those PMDs to always receive 32 packets
>and to manage a ring by themselves.
>
>The drawback of the ring is some additional latency on the receive path.
>In testing, the additional latency hasn't been an issue for bonding.

When bonding processes packets more slowly, it may become a bottleneck in the packet processing of some applications.

> The bonding PMD has a fair bit of overhead associated with the RX and TX path
>calculations.  Most applications can just arrange to call the RX path
>with a sufficiently large receive.  Bonding can't do this.
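To restate the quoted point with a sketch (illustrative names and structure, not the actual bonding code): the array space left for each subsequent slave shrinks as earlier slaves fill it, so the per-slave request can drop below what a vector RX path wants.

#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Illustrative only, not the actual bonding code. */
static uint16_t
bond_rx_no_ring(const uint16_t *slaves, uint16_t n_slaves, uint16_t queue_id,
		struct rte_mbuf **bufs, uint16_t nb_pkts)
{
	uint16_t i, num_rx_total = 0;

	for (i = 0; i < n_slaves && num_rx_total < nb_pkts; i++)
		/* E.g. the app asks for 64: slave 0 is offered 64, but if it
		 * returns 48, slave 1 is only offered 16, below the ~32
		 * packets a vector RX path wants. */
		num_rx_total += rte_eth_rx_burst(slaves[i], queue_id,
						 bufs + num_rx_total,
						 nb_pkts - num_rx_total);

	return num_rx_total;
}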

I wasn't talking about the application, I was talking about the slave PMDs.
A slave PMD can manage a ring by itself if that helps its own performance.
The bonding PMD should not be tailored to specific PMDs.
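To sketch what I mean (everything here, pmd_rx_queue, hw_recv_burst, HW_BURST, is illustrative and not taken from any existing driver): a slave PMD could keep a small stash of its own and always poll the hardware with a full burst, however few packets the caller asks for.

#include <rte_mbuf.h>

#define HW_BURST 32	/* full burst for the driver's vector path */

struct pmd_rx_queue {
	struct rte_mbuf *stash[HW_BURST];	/* received but not yet delivered */
	uint16_t stash_head;
	uint16_t stash_cnt;
	/* ...the real queue state would live here... */
};

/* Placeholder for the driver's real descriptor-ring receive routine. */
static uint16_t
hw_recv_burst(struct pmd_rx_queue *q, struct rte_mbuf **pkts, uint16_t n)
{
	(void)q; (void)pkts; (void)n;
	return 0;
}

static uint16_t
pmd_rx_burst(struct pmd_rx_queue *q, struct rte_mbuf **pkts, uint16_t nb_pkts)
{
	struct rte_mbuf *local[HW_BURST];
	uint16_t nb_rx = 0, got, i;

	/* Serve the caller from the stash first (in receive order). */
	while (q->stash_cnt > 0 && nb_rx < nb_pkts) {
		pkts[nb_rx++] = q->stash[q->stash_head++];
		q->stash_cnt--;
	}
	if (nb_rx == nb_pkts)
		return nb_rx;

	/* The stash is empty here, so poll the hardware with a full burst
	 * even if the caller asked for less... */
	got = hw_recv_burst(q, local, HW_BURST);

	/* ...hand back what fits and stash the rest for the next call. */
	q->stash_head = 0;
	q->stash_cnt = 0;
	for (i = 0; i < got; i++) {
		if (nb_rx < nb_pkts)
			pkts[nb_rx++] = local[i];
		else
			q->stash[q->stash_cnt++] = local[i];
	}

	return nb_rx;
}

That way only the PMDs that actually benefit from large hardware bursts pay the buffering cost, and the bonding PMD stays generic.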


>> Did you check with other vendors' PMDs? It may hurt performance there.
>> 
>> I don't know, but I suspect probably not.  For the most part you are typically
>> reading almost up to the vector requirement.  But if one slave has just a
>> single packet, then you can't vectorize on the next slave.
>> 
>
>I don't think the added ring overhead is a win for PMDs which are not using the vectorized instructions.
>
>The non-vectorized PMDs are usually quite slow.  The additional
>overhead doesn't make a difference in their performance.

We should not make things worse than they are.


 

