[dpdk-dev,v2,1/5] net/fm10k: remove limit of fm10k_xmit_pkts_vec burst size
Checks
Commit Message
Add a wrapper function to remove the limit on the TX burst size.
The patch makes the fm10k vector function a best-effort transmit,
with behavior consistent with fm10k_xmit_pkts.
Cc: Jing Chen <jing.d.chen@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
---
drivers/net/fm10k/fm10k.h | 4 ++--
drivers/net/fm10k/fm10k_ethdev.c | 28 +++++++++++++++++++++++++---
drivers/net/fm10k/fm10k_rxtx_vec.c | 4 ++--
3 files changed, 29 insertions(+), 7 deletions(-)
Comments
The rte_eth_tx_burst() function in the file rte_ethdev.h is invoked to
transmit output packets on the output queue of DPDK applications as
follows.
static inline uint16_t
rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
Note: the fourth parameter, nb_pkts, is the number of packets to transmit.
The rte_eth_tx_burst() function returns the number of packets it actually
sent. Most PMDs implement the policy "send as many of the requested
packets as possible" at the PMD level, but a few PMDs impose some sort
of artificial limit on the number of packets sent per call. For example, the
vhost TX burst size is limited to 32 packets, and some rx_burst functions
have a similar problem. The main benefit of fixing this is consistent
batching behavior, which lets users simplify their logic and avoid misuse
at the application level: since there is already a unified
rte_eth_tx/rx_burst interface, there is no reason for inconsistent
behavior across PMDs.
This patchset fixes the issue by adding a wrapper function at the PMD level.
Changes in V3:
1. Updated release_17_05 in patch 5/5.
2. Rebased on top of the next-net tree; i40e_rxtx_vec_altivec.c is updated in
patch 2/5.
3. Fixed one checkpatch issue in patch 2/5.
Changes in V2:
1. Renamed the ixgbe, i40e and fm10k vector functions XXX_xmit_pkts_vec to
XXX_xmit_fixed_burst_vec; the new wrapper functions take the original name
XXX_xmit_pkts_vec, according to Bruce's suggestion.
2. Simplified the code to avoid the if or if/else.
Zhiyong Yang (5):
net/fm10k: remove limit of fm10k_xmit_pkts_vec burst size
net/i40e: remove limit of i40e_xmit_pkts_vec burst size
net/ixgbe: remove limit of ixgbe_xmit_pkts_vec burst size
net/vhost: remove limit of vhost TX burst size
net/vhost: remove limit of vhost RX burst size
doc/guides/rel_notes/release_17_05.rst | 4 ++++
drivers/net/fm10k/fm10k.h | 4 ++--
drivers/net/fm10k/fm10k_ethdev.c | 28 +++++++++++++++++++++++---
drivers/net/fm10k/fm10k_rxtx_vec.c | 4 ++--
drivers/net/i40e/i40e_rxtx.c | 28 +++++++++++++++++++++++---
drivers/net/i40e/i40e_rxtx.h | 4 ++--
drivers/net/i40e/i40e_rxtx_vec_altivec.c | 4 ++--
drivers/net/i40e/i40e_rxtx_vec_neon.c | 4 ++--
drivers/net/i40e/i40e_rxtx_vec_sse.c | 4 ++--
drivers/net/ixgbe/ixgbe_rxtx.c | 29 +++++++++++++++++++++++++++
drivers/net/ixgbe/ixgbe_rxtx.h | 4 ++--
drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 4 ++--
drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c | 4 ++--
drivers/net/vhost/rte_eth_vhost.c | 34 ++++++++++++++++++++++++++++----
14 files changed, 131 insertions(+), 28 deletions(-)
On 3/29/2017 8:16 AM, Zhiyong Yang wrote:
> [snip -- cover letter quoted in full above]
Series applied to dpdk-next-net/master, thanks.
(doc patch exported into separate patch)
This is a PMD update on the fast path of the affected PMDs; can you please
confirm the performance after testing?
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
> Sent: Thursday, March 30, 2017 8:55 PM
> To: Yang, Zhiyong <zhiyong.yang@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 0/5] consistent PMD batching behaviour
>
> On 3/29/2017 8:16 AM, Zhiyong Yang wrote:
> > [snip -- cover letter quoted in full above]
>
> Series applied to dpdk-next-net/master, thanks.
>
> (doc patch exported into separate patch)
>
> This is a PMD update on the fast path of the affected PMDs; can you please
> confirm the performance after testing?
Hi,
I have compared the vhost PVP performance with and without Zhiyong's
patch. There is almost no performance drop:
Mergeable path: -0.2%
Normal path: -0.73%
Vector path: -0.55%
Test bench:
Ubuntu 16.04
Kernel: 4.4.0
gcc: 5.4.0
BRs
Lei
@@ -368,8 +368,8 @@ void fm10k_rx_queue_release_mbufs_vec(struct fm10k_rx_queue *rxq);
uint16_t fm10k_recv_pkts_vec(void *, struct rte_mbuf **, uint16_t);
uint16_t fm10k_recv_scattered_pkts_vec(void *, struct rte_mbuf **,
uint16_t);
-uint16_t fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
- uint16_t nb_pkts);
+uint16_t fm10k_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts);
void fm10k_txq_vec_setup(struct fm10k_tx_queue *txq);
int fm10k_tx_vec_condition_check(struct fm10k_tx_queue *txq);
@@ -197,9 +197,9 @@ fm10k_tx_vec_condition_check(__rte_unused struct fm10k_tx_queue *txq)
}
uint16_t __attribute__((weak))
-fm10k_xmit_pkts_vec(__rte_unused void *tx_queue,
- __rte_unused struct rte_mbuf **tx_pkts,
- __rte_unused uint16_t nb_pkts)
+fm10k_xmit_fixed_burst_vec(__rte_unused void *tx_queue,
+ __rte_unused struct rte_mbuf **tx_pkts,
+ __rte_unused uint16_t nb_pkts)
{
return 0;
}
@@ -2741,6 +2741,28 @@ fm10k_check_ftag(struct rte_devargs *devargs)
return 1;
}
+static uint16_t
+fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ uint16_t nb_tx = 0;
+ struct fm10k_tx_queue *txq = (struct fm10k_tx_queue *)tx_queue;
+
+ while (nb_pkts) {
+ uint16_t ret, num;
+
+ num = (uint16_t)RTE_MIN(nb_pkts, txq->rs_thresh);
+ ret = fm10k_xmit_fixed_burst_vec(tx_queue, &tx_pkts[nb_tx],
+ num);
+ nb_tx += ret;
+ nb_pkts -= ret;
+ if (ret < num)
+ break;
+ }
+
+ return nb_tx;
+}
+
static void __attribute__((cold))
fm10k_set_tx_function(struct rte_eth_dev *dev)
{
@@ -800,8 +800,8 @@ tx_backlog_entry(struct rte_mbuf **txep,
}
uint16_t
-fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
- uint16_t nb_pkts)
+fm10k_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
{
struct fm10k_tx_queue *txq = (struct fm10k_tx_queue *)tx_queue;
volatile struct fm10k_tx_desc *txdp;