[dpdk-dev,v2,1/5] net/fm10k: remove limit of fm10k_xmit_pkts_vec burst size

Message ID: 1488539851-71009-2-git-send-email-zhiyong.yang@intel.com (mailing list archive)
State: Superseded, archived
Delegated to: Ferruh Yigit

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Yang, Zhiyong March 3, 2017, 11:17 a.m. UTC
Add a wrapper function to remove the limit on the TX burst size.
With this patch the fm10k vector TX function makes a best effort to
transmit all requested packets, consistent with the behavior of
fm10k_xmit_pkts.

Cc: Jing Chen <jing.d.chen@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
---
 drivers/net/fm10k/fm10k.h          |  4 ++--
 drivers/net/fm10k/fm10k_ethdev.c   | 28 +++++++++++++++++++++++++---
 drivers/net/fm10k/fm10k_rxtx_vec.c |  4 ++--
 3 files changed, 29 insertions(+), 7 deletions(-)
  

Comments

Yang, Zhiyong March 29, 2017, 7:16 a.m. UTC | #1
DPDK applications call the rte_eth_tx_burst() function, declared in
rte_ethdev.h, to transmit packets on an output queue:

static inline uint16_t
rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
                 struct rte_mbuf **tx_pkts, uint16_t nb_pkts);

Note: the fourth parameter, nb_pkts, is the number of packets to transmit.

The rte_eth_tx_burst() function returns the number of packets it actually
sent. Most PMDs follow the policy "send as many packets as possible" at
the PMD level, but a few PMDs impose an artificial limit on the number of
packets sent per call. For example, the VHOST TX burst size is limited to
32 packets. Some rx_burst functions have a similar problem. Consistent
batching behavior lets users simplify their logic and avoid misuse at the
application level. Since the rte_eth_tx/rx_burst interface is already
unified, there is no reason for inconsistent behaviors.
This patchset fixes the problem by adding wrapper functions at the PMD level.
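
The wrapper approach can be illustrated with a small standalone sketch.
It is not the driver code: every sketch_* name below is hypothetical, and
the fixed-burst routine is a mock that accepts at most rs_thresh packets
per call and stops when a simulated ring runs out of free slots. The
sketch also assumes rs_thresh is nonzero.

```c
#include <assert.h>
#include <stdint.h>

/* All names here are illustrative stand-ins, not DPDK definitions. */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

struct sketch_pkt {
	int id;
};

struct sketch_txq {
	uint16_t rs_thresh;  /* largest burst one fixed-burst call accepts */
	uint16_t free_slots; /* simulated free descriptors in the ring */
};

/* Mock fixed-burst routine: sends at most rs_thresh packets per call,
 * and never more than the ring has room for. */
static uint16_t
sketch_xmit_fixed_burst(void *tx_queue, struct sketch_pkt **tx_pkts,
			uint16_t nb_pkts)
{
	struct sketch_txq *txq = tx_queue;
	uint16_t n = (uint16_t)SKETCH_MIN(nb_pkts, txq->rs_thresh);

	(void)tx_pkts; /* a real driver would write descriptors here */
	n = (uint16_t)SKETCH_MIN(n, txq->free_slots);
	txq->free_slots = (uint16_t)(txq->free_slots - n);
	return n;
}

/* Wrapper: splits an arbitrary burst into chunks of at most rs_thresh
 * and stops as soon as one chunk is only partially sent. */
static uint16_t
sketch_xmit_pkts(void *tx_queue, struct sketch_pkt **tx_pkts,
		 uint16_t nb_pkts)
{
	struct sketch_txq *txq = tx_queue;
	uint16_t nb_tx = 0;

	assert(txq->rs_thresh > 0); /* a zero threshold would never progress */

	while (nb_pkts) {
		uint16_t num = (uint16_t)SKETCH_MIN(nb_pkts, txq->rs_thresh);
		uint16_t ret = sketch_xmit_fixed_burst(tx_queue,
						       &tx_pkts[nb_tx], num);

		nb_tx = (uint16_t)(nb_tx + ret);
		nb_pkts = (uint16_t)(nb_pkts - ret);
		if (ret < num)
			break; /* ring is full, report what was sent */
	}

	return nb_tx;
}
```

With rs_thresh set to 32 and enough free slots, a request for 100 packets
is sent as chunks of 32, 32, 32 and 4, and the caller sees a single return
value of 100. If the ring fills up part way through, the wrapper returns
the partial count, so the caller's existing handling of short returns
keeps working.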

Changes in V3:

1. Updated release_17_05 in patch 5/5.
2. Rebased on top of the next-net tree; i40e_rxtx_vec_altivec.c is updated
in patch 2/5.
3. Fixed one checkpatch issue in 2/5.

Changes in V2:
1. Renamed the ixgbe, i40e and fm10k vector functions XXX_xmit_pkts_vec to
XXX_xmit_fixed_burst_vec; the new wrapper functions take the original name
XXX_xmit_pkts_vec, as Bruce suggested.
2. Simplified the code to avoid the if or if/else.

Zhiyong Yang (5):
  net/fm10k: remove limit of fm10k_xmit_pkts_vec burst size
  net/i40e: remove limit of i40e_xmit_pkts_vec burst size
  net/ixgbe: remove limit of ixgbe_xmit_pkts_vec burst size
  net/vhost: remove limit of vhost TX burst size
  net/vhost: remove limit of vhost RX burst size

 doc/guides/rel_notes/release_17_05.rst   |  4 ++++
 drivers/net/fm10k/fm10k.h                |  4 ++--
 drivers/net/fm10k/fm10k_ethdev.c         | 28 +++++++++++++++++++++++---
 drivers/net/fm10k/fm10k_rxtx_vec.c       |  4 ++--
 drivers/net/i40e/i40e_rxtx.c             | 28 +++++++++++++++++++++++---
 drivers/net/i40e/i40e_rxtx.h             |  4 ++--
 drivers/net/i40e/i40e_rxtx_vec_altivec.c |  4 ++--
 drivers/net/i40e/i40e_rxtx_vec_neon.c    |  4 ++--
 drivers/net/i40e/i40e_rxtx_vec_sse.c     |  4 ++--
 drivers/net/ixgbe/ixgbe_rxtx.c           | 29 +++++++++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_rxtx.h           |  4 ++--
 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c  |  4 ++--
 drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c   |  4 ++--
 drivers/net/vhost/rte_eth_vhost.c        | 34 ++++++++++++++++++++++++++++----
 14 files changed, 131 insertions(+), 28 deletions(-)
  
Ferruh Yigit March 30, 2017, 12:54 p.m. UTC | #2
On 3/29/2017 8:16 AM, Zhiyong Yang wrote:
> [...]

Series applied to dpdk-next-net/master, thanks.

(doc patch exported into separate patch)

This series updates the fast path of the affected PMDs. Can you please
confirm the performance after testing?
  
Yao, Lei A March 31, 2017, 7 a.m. UTC | #3
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
> Sent: Thursday, March 30, 2017 8:55 PM
> To: Yang, Zhiyong <zhiyong.yang@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 0/5] consistent PMD batching behaviour
> 
> On 3/29/2017 8:16 AM, Zhiyong Yang wrote:
> > [...]
> 
> Series applied to dpdk-next-net/master, thanks.
> 
> (doc patch exported into separate patch)
> 
> This series updates the fast path of the affected PMDs. Can you please
> confirm the performance after testing?
Hi,

I have compared the vhost PVP performance with and without Zhiyong's
patch. There is almost no performance drop:
Mergeable path: -0.2%
Normal path: -0.73%
Vector path: -0.55%

Test bench:
Ubuntu 16.04
Kernel: 4.4.0
gcc: 5.4.0

BRs
Lei
  

Patch

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index c6fed21..8e1a950 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -368,8 +368,8 @@  void fm10k_rx_queue_release_mbufs_vec(struct fm10k_rx_queue *rxq);
 uint16_t fm10k_recv_pkts_vec(void *, struct rte_mbuf **, uint16_t);
 uint16_t fm10k_recv_scattered_pkts_vec(void *, struct rte_mbuf **,
 					uint16_t);
-uint16_t fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
-		uint16_t nb_pkts);
+uint16_t fm10k_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
+				    uint16_t nb_pkts);
 void fm10k_txq_vec_setup(struct fm10k_tx_queue *txq);
 int fm10k_tx_vec_condition_check(struct fm10k_tx_queue *txq);
 
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index c4fe746..dd4ea80 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -197,9 +197,9 @@  fm10k_tx_vec_condition_check(__rte_unused struct fm10k_tx_queue *txq)
 }
 
 uint16_t __attribute__((weak))
-fm10k_xmit_pkts_vec(__rte_unused void *tx_queue,
-		__rte_unused struct rte_mbuf **tx_pkts,
-		__rte_unused uint16_t nb_pkts)
+fm10k_xmit_fixed_burst_vec(__rte_unused void *tx_queue,
+			   __rte_unused struct rte_mbuf **tx_pkts,
+			   __rte_unused uint16_t nb_pkts)
 {
 	return 0;
 }
@@ -2741,6 +2741,28 @@  fm10k_check_ftag(struct rte_devargs *devargs)
 	return 1;
 }
 
+static uint16_t
+fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
+		    uint16_t nb_pkts)
+{
+	uint16_t nb_tx = 0;
+	struct fm10k_tx_queue *txq = (struct fm10k_tx_queue *)tx_queue;
+
+	while (nb_pkts) {
+		uint16_t ret, num;
+
+		num = (uint16_t)RTE_MIN(nb_pkts, txq->rs_thresh);
+		ret = fm10k_xmit_fixed_burst_vec(tx_queue, &tx_pkts[nb_tx],
+						 num);
+		nb_tx += ret;
+		nb_pkts -= ret;
+		if (ret < num)
+			break;
+	}
+
+	return nb_tx;
+}
+
 static void __attribute__((cold))
 fm10k_set_tx_function(struct rte_eth_dev *dev)
 {
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 27f3e43..ab87206 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -800,8 +800,8 @@  tx_backlog_entry(struct rte_mbuf **txep,
 }
 
 uint16_t
-fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
-			uint16_t nb_pkts)
+fm10k_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
+			   uint16_t nb_pkts)
 {
 	struct fm10k_tx_queue *txq = (struct fm10k_tx_queue *)tx_queue;
 	volatile struct fm10k_tx_desc *txdp;