Re: [PATCH v3 2/3] net/i40e: enable direct rearm with separate API

Feifei Wang Feifei.Wang2 at arm.com
Tue Mar 14 07:07:09 CET 2023



> -----Original Message-----
> From: Konstantin Ananyev <konstantin.ananyev at huawei.com>
> Sent: Tuesday, March 7, 2023 7:01 PM
> To: Feifei Wang <Feifei.Wang2 at arm.com>; Konstantin Ananyev
> <konstantin.v.ananyev at yandex.ru>; Yuying Zhang
> <Yuying.Zhang at intel.com>; Beilei Xing <beilei.xing at intel.com>; Ruifeng
> Wang <Ruifeng.Wang at arm.com>
> Cc: dev at dpdk.org; nd <nd at arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; nd <nd at arm.com>; nd <nd at arm.com>
> Subject: RE: [PATCH v3 2/3] net/i40e: enable direct rearm with separate API
> 
> 
> 
> > > -----Original Message-----
> > > From: Konstantin Ananyev <konstantin.ananyev at huawei.com>
> > > Sent: Tuesday, February 28, 2023 3:36 AM
> > > To: Feifei Wang <Feifei.Wang2 at arm.com>; Konstantin Ananyev
> > > <konstantin.v.ananyev at yandex.ru>; Yuying Zhang
> > > <Yuying.Zhang at intel.com>; Beilei Xing <beilei.xing at intel.com>;
> > > Ruifeng Wang <Ruifeng.Wang at arm.com>
> > > Cc: dev at dpdk.org; nd <nd at arm.com>; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli at arm.com>; nd <nd at arm.com>
> > > Subject: RE: [PATCH v3 2/3] net/i40e: enable direct rearm with separate
> > > API
> > >
> > >
> > >
> > > > > > +int
> > > > > > +i40e_tx_fill_sw_ring(void *tx_queue,
> > > > > > +		struct rte_eth_rxq_rearm_data *rxq_rearm_data)
> > > > > > +{
> > > > > > +	struct i40e_tx_queue *txq = tx_queue;
> > > > > > +	struct i40e_tx_entry *txep;
> > > > > > +	void **rxep;
> > > > > > +	struct rte_mbuf *m;
> > > > > > +	int i, n;
> > > > > > +	int nb_rearm = 0;
> > > > > > +
> > > > > > +	if (*rxq_rearm_data->rearm_nb < txq->tx_rs_thresh ||
> > > > > > +			txq->nb_tx_free > txq->tx_free_thresh)
> > > > > > +		return 0;
> > > > > > +
> > > > > > +	/* check DD bits on threshold descriptor */
> > > > > > +	if ((txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz &
> > > > > > +			rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) !=
> > > > > > +			rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
> > > > > > +		return 0;
> > > > > > +
> > > > > > +	n = txq->tx_rs_thresh;
> > > > > > +
> > > > > > +	/* first buffer to free from S/W ring is at index
> > > > > > +	 * tx_next_dd - (tx_rs_thresh-1)
> > > > > > +	 */
> > > > > > +	txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
> > > > > > +	rxep = rxq_rearm_data->rx_sw_ring;
> > > > > > +	rxep += *rxq_rearm_data->rearm_start;
> > > > > > +
> > > > > > +	if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
> > > > > > +		/* directly put mbufs from Tx to Rx */
> > > > > > +		for (i = 0; i < n; i++, rxep++, txep++)
> > > > > > +			*rxep = txep[0].mbuf;
> > > > > > +	} else {
> > > > > > +		for (i = 0; i < n; i++, rxep++) {
> > > > > > +			m = rte_pktmbuf_prefree_seg(txep[i].mbuf);
> > >
> > > One thing I forgot to ask:
> > > What would happen if this mbuf belongs to different mempool (not one
> > > that we specify at rx_queue_setup())?
> > > Do we need to check it here?
> > > Or would it be upper layer constraint?
> > > Or...?
> > >
> >
> > First, 'different mempool' only matters for the non-FAST_FREE path in
> > tx_free_buffers.
> >
> > If buffers belong to different mempools, consider this example:
> >
> > Buffer 1 from mempool 1, its recycle path is:
> > -----------------------------------------------------------------------------------------
> > 1. queue_setup: rearm from mempool 1 into Rx sw-ring
> > 2. rte_eth_rx_burst: used by user app (Rx)
> > 3. rte_eth_tx_burst: mount on Tx sw-ring
> > 4. rte_eth_direct_rearm: free into Rx sw-ring
> >            or
> >    tx_free_buffers: free into mempool 1 (non-FAST_FREE path)
> > -----------------------------------------------------------------------------------------
> >
> > Buffer 2 from mempool 2, its recycle path is:
> > -----------------------------------------------------------------------------------------
> > 1. queue_setup: rearm from mempool 2 into Rx sw-ring
> > 2. rte_eth_rx_burst: used by user app (Rx)
> > 3. rte_eth_tx_burst: mount on Tx sw-ring
> > 4. rte_eth_direct_rearm: free into Rx sw-ring
> >            or
> >    tx_free_buffers: free into mempool 2 (non-FAST_FREE path)
> > -----------------------------------------------------------------------------------------
> >
> > Thus, from the Rx side, buffers from different Tx mempools look the same.
> > The only difference is that they are freed back into different mempools
> > when the thread uses the generic free path.
> > I think this cannot affect direct-rearm mode, so we do not need to check this.
> 
> I understand that it should work even with multiple mempools.
> What I am trying to say is that the user may not want to use mbufs from a
> particular mempool for RX (while it is still ok to use it for TX).
> Say the user has a separate mempool with small data buffers (smaller than
> normal MTU) to send some 'special' packets, or even uses this mempool with
> small buffers for zero-copy updating of packet L2/L3 headers, etc.
> Or it could be some 'special' user-provided mempool.
> That's why I wonder whether we should allow only mbufs from the mempool
> that is assigned to that RX queue.

Sorry for the confusion. If I understand correctly this time, you mean a special
mempool, e.g. one whose buffer size is very small and whose Tx buffers are
generated from the control plane.

In that case, recycling such a Tx buffer into the Rx buffer ring could indeed
cause errors, because its data buffer is too small.

Thus we can only allow general buffers that are valid for the Rx buffer ring.
Furthermore, it should be the user's responsibility to ensure that the recycled
Tx buffers are valid; checking this in the data plane would cost a lot of CPU
cycles. So what we can do is add a constraint in the notes to remind users.
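As a side note, a documentation constraint could be complemented by a debug-only
check that is compiled out of release builds, so the fast path stays untouched.
The sketch below is a hypothetical illustration with simplified stand-in structs:
`mempool`, `mbuf`, and `rearm_buf_is_valid` are not the real DPDK types or names
(the real ones live in rte_mbuf.h / rte_mempool.h); it only shows the idea of
comparing a recycled mbuf's pool against the Rx queue's configured pool.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for rte_mempool / rte_mbuf, for illustration only. */
struct mempool { unsigned int elt_size; };
struct mbuf { struct mempool *pool; };

/*
 * Debug-only validation of a Tx buffer before recycling it into the Rx
 * sw-ring: accept the mbuf only if it comes from the mempool the Rx queue
 * was configured with.  In a real driver this would sit behind a debug
 * macro (e.g. an RTE_LIBRTE_ETHDEV_DEBUG-style guard) so release builds
 * pay no extra load/compare per packet.
 */
static int
rearm_buf_is_valid(const struct mbuf *m, const struct mempool *rxq_pool)
{
	return m->pool == rxq_pool;
}
```

A buffer from the 'special' small-buffer mempool would fail this check, while
buffers originally rearmed from the Rx queue's own mempool would pass.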
> 
> >
> > > > > > +			if (m != NULL) {
> > > > > > +				*rxep = m;
> > > > > > +				nb_rearm++;
> > > > > > +			}
> > > > > > +		}
> > > > > > +		n = nb_rearm;
> > > > > > +	}
> > > > > > +
> > > > > > +	/* update counters for Tx */
> > > > > > +	txq->nb_tx_free = (uint16_t)(txq->nb_tx_free + txq->tx_rs_thresh);
> > > > > > +	txq->tx_next_dd = (uint16_t)(txq->tx_next_dd + txq->tx_rs_thresh);
> > > > > > +	if (txq->tx_next_dd >= txq->nb_tx_desc)
> > > > > > +		txq->tx_next_dd = (uint16_t)(txq->tx_rs_thresh - 1);
> > > > > > +
> > > > > > +	return n;
> > > > > > +}
> > > > > > +

