[dpdk-dev] [PATCH 1/2] net/bonding: support Tx prepare for bonding

Ananyev, Konstantin konstantin.ananyev at intel.com
Wed Jun 9 12:25:22 CEST 2021


> > On 4/23/21 12:46 PM, Chengchang Tang wrote:
> >> To use the HW offloads capability (e.g. checksum and TSO) in the Tx
> >> direction, the upper-layer users need to call rte_eth_dev_prepare to do
> >> some adjustment to the packets before sending them (e.g. processing
> >> pseudo headers when Tx checksum offoad enabled). But, the tx_prepare
> >> callback of the bond driver is not implemented. Therefore, related
> >> offloads can not be used unless the upper layer users process the packet
> >> properly in their own application. But it is bad for the
> >> transplantability.
> >>
> >> However, it is difficult to design the tx_prepare callback for bonding
> >> driver. Because when a bonded device sends packets, the bonded device
> >> allocates the packets to different slave devices based on the real-time
> >> link status and bonding mode. That is, it is very difficult for the
> >> bonding device to determine which slave device's prepare function should
> >> be invoked. In addition, if the link status changes after the packets are
> >> prepared, the packets may fail to be sent because packets allocation may
> >> change.
> >>
> >> So, in this patch, the tx_prepare callback of bonding driver is not
> >> implemented. Instead, the rte_eth_dev_tx_prepare() will be called for
> >> all the fast path packet in mode 0, 1, 2, 4, 5, 6. In this way, all
> >> tx_offloads can be processed correctly for all NIC devices in these modes.
> >> If tx_prepare is not required in some cases, then slave PMDs tx_prepare
> >> pointer should be NULL and rte_eth_tx_prepare() will be just a NOOP.
> >> In these cases, the impact on performance will be very limited. It is
> >> the responsibility of the slave PMDs to decide when the real tx_prepare
> >> needs to be used. The information from dev_config/queue_setup is
> >> sufficient for them to make these decisions.
> >>
> >> Note:
> >> The rte_eth_tx_prepare is not added to bond mode 3(Broadcast). This is
> >> because in broadcast mode, a packet needs to be sent by all slave ports.
> >> Different PMDs process the packets differently in tx_prepare. As a result,
> >> the sent packet may be incorrect.
> >>
> >> Signed-off-by: Chengchang Tang <tangchengchang at huawei.com>
> >> ---
> >>  drivers/net/bonding/rte_eth_bond.h     |  1 -
> >>  drivers/net/bonding/rte_eth_bond_pmd.c | 28 ++++++++++++++++++++++++----
> >>  2 files changed, 24 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/net/bonding/rte_eth_bond.h b/drivers/net/bonding/rte_eth_bond.h
> >> index 874aa91..1e6cc6d 100644
> >> --- a/drivers/net/bonding/rte_eth_bond.h
> >> +++ b/drivers/net/bonding/rte_eth_bond.h
> >> @@ -343,7 +343,6 @@ rte_eth_bond_link_up_prop_delay_set(uint16_t bonded_port_id,
> >>  int
> >>  rte_eth_bond_link_up_prop_delay_get(uint16_t bonded_port_id);
> >>
> >> -
> >>  #ifdef __cplusplus
> >>  }
> >>  #endif
> >> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
> >> index 2e9cea5..84af348 100644
> >> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> >> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> >> @@ -606,8 +606,14 @@ bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs,
> >>  	/* Send packet burst on each slave device */
> >>  	for (i = 0; i < num_of_slaves; i++) {
> >>  		if (slave_nb_pkts[i] > 0) {
> >> +			int nb_prep_pkts;
> >> +
> >> +			nb_prep_pkts = rte_eth_tx_prepare(slaves[i],
> >> +					bd_tx_q->queue_id, slave_bufs[i],
> >> +					slave_nb_pkts[i]);
> >> +
> >
> > Shouldn't it be called iff queue Tx offloads are not zero?
> > It will allow to decrease performance degradation if no
> > Tx offloads are enabled. Same in all cases below.
> 
> Regarding this point, it has been discussed in the previous RFC:
> https://inbox.dpdk.org/dev/47f907cf-3933-1de9-9c45-6734b912eccd@huawei.com/
> 
> According to the TX_OFFLOAD status of the current device, PMDs can determine
> whether tx_prepare is currently needed. If it is not needed, set pkt_tx_prepare
> to NULL, so that the actual tx_prepare processing will be skipped directly in
> rte_eth_tx_prepare().
> 
> >
> >>  			num_tx_slave = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
> >> -					slave_bufs[i], slave_nb_pkts[i]);
> >> +					slave_bufs[i], nb_prep_pkts);
> >
> > In fact it is a problem here and really big problems.
> > Tx prepare may fail and return less packets. Tx prepare
> > of some packet may always fail. If application tries to
> > send packets in a loop until success, it will be a
> > forever loop here. Since application calls Tx burst,
> > it is 100% legal behaviour of the function to return 0
> > if Tx ring is full. It is not an error indication.
> > However, in the case of Tx prepare it is an error
> > indication.

Yes, that sounds like a problem and existing apps might be affected.

> >
> > Should we change Tx burst description and enforce callers
> > to check for rte_errno? It sounds like a major change...
> >

Agree, rte_errno for tx_burst() is probably a simplest and sanest way,
but yes, it is a change in behaviour and apps will need to be updated.  
Another option for bond PMD - just silently free mbufs for which prepare()
fails (and probably update some stats counter).
Again it is a change in behaviour, but now just for one PMD, with tx offloads enabled.
Also as, I can see some tx_burst() function for that PMD already free packets silently:
bond_ethdev_tx_burst_alb(), bond_ethdev_tx_burst_broadcast().

Actually another question - why the patch adds tx_prepare() only to some
TX modes but not all?
Is that itended? 

> 
> I agree that if the failure is caused by Tx ring full, it is a legal behaviour.
> But what about the failure caused by other reasons? At present, it is possible
> for some PMDs to fail during tx_burst due to other reasons. In this case,
> repeated tries to send will also fail.
> 
> I'm not sure if all PMDs need to support the behavior of sending packets in a
> loop until it succeeds. If not, I think the current problem can be reminded to
> the user by adding a description to the bonding. If it is necessary, I think the
> description of tx_burst should also add related instructions, so that the developers
> of PMDs can better understand how tx_burst should be designed, such as putting all
> hardware-related constraint checks into tx_prepare. And another prerequisite for
> the above behavior is that the packets must be prepared (i.e. checked by
> rte_eth_tx_prepare()). Otherwise, it may also fail to send. This means that we have
> to use rte_eth_tx_prepare() in more scenarios.
> 
> What's Ferruh's opinion on this?
> 
> > [snip]
> >
> > .
> >



More information about the dev mailing list