[dpdk-dev,v2] net/i40e: fix Tx fn selection when using new ethdev offloads
Commit Message
The Tx function selection code in the driver only used the older txq
flags values to check whether the scalar or vector functions should be
used. This caused performance regressions with testpmd io-fwd as the
scalar path rather than the vector one was being used in the default
case. Fix this by changing the code to take account of new offloads and
deleting the defines used for the old ones.
Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
drivers/net/i40e/i40e_rxtx.c | 39 ++++++++++++++++++---------------------
1 file changed, 18 insertions(+), 21 deletions(-)
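The selection rule described in the commit message can be sketched in isolation. This is a minimal standalone illustration, not the driver code: the struct and function names and the two threshold values are assumptions made for the example, standing in for the driver's txq, adapter, RTE_PMD_I40E_TX_MAX_BURST and RTE_I40E_TX_MAX_FREE_BUF_SZ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder thresholds, assumed for illustration only. */
#define SKETCH_TX_MAX_BURST       32
#define SKETCH_TX_MAX_FREE_BUF_SZ 64

/* Hypothetical, trimmed-down stand-ins for the driver's queue and adapter. */
struct sketch_txq {
	uint64_t offloads;     /* requested Tx offload bits, 0 means none */
	uint16_t tx_rs_thresh; /* RS threshold configured for the queue */
};

struct sketch_adapter {
	bool tx_simple_allowed;
	bool tx_vec_allowed;
};

/*
 * Derive both flags from the queue configuration:
 * - simple Tx needs no offloads and an RS threshold of at least one burst;
 * - vector Tx additionally needs the RS threshold to fit the free buffer size.
 */
static void
sketch_set_tx_function_flag(struct sketch_adapter *ad,
		const struct sketch_txq *txq)
{
	ad->tx_simple_allowed = (txq->offloads == 0 &&
			txq->tx_rs_thresh >= SKETCH_TX_MAX_BURST);
	ad->tx_vec_allowed = (ad->tx_simple_allowed &&
			txq->tx_rs_thresh <= SKETCH_TX_MAX_FREE_BUF_SZ);
}
```

With these placeholder values, a queue with offloads == 0 and tx_rs_thresh == 32 qualifies for the vector path, a threshold of 128 leaves only the simple path, and any non-zero offloads value rules out both.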
Comments
On Tue, May 01, 2018 at 03:13:54PM +0100, Bruce Richardson wrote:
> The Tx function selection code in the driver only used the older txq
> flags values to check whether the scalar or vector functions should be
> used. This caused performance regressions with testpmd io-fwd as the
> scalar path rather than the vector one was being used in the default
> case. Fix this by changing the code to take account of new offloads and
> deleting the defines used for the old ones.
>
> Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
Apologies: forgot to add:
v2: eliminate mask for offload flags, and use vector path only if
offloads == 0
On 5/1/2018 3:16 PM, Bruce Richardson wrote:
> On Tue, May 01, 2018 at 03:13:54PM +0100, Bruce Richardson wrote:
>> The Tx function selection code in the driver only used the older txq
>> flags values to check whether the scalar or vector functions should be
>> used. This caused performance regressions with testpmd io-fwd as the
>> scalar path rather than the vector one was being used in the default
>> case. Fix this by changing the code to take account of new offloads and
>> deleting the defines used for the old ones.
>>
>> Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")
>>
>> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>> ---
> Apologies: forgot to add:
>
> v2: eliminate mask for offload flags, and use vector path only if
> offloads == 0
>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
On 5/1/2018 3:37 PM, Ferruh Yigit wrote:
> On 5/1/2018 3:16 PM, Bruce Richardson wrote:
>> On Tue, May 01, 2018 at 03:13:54PM +0100, Bruce Richardson wrote:
>>> The Tx function selection code in the driver only used the older txq
>>> flags values to check whether the scalar or vector functions should be
>>> used. This caused performance regressions with testpmd io-fwd as the
>>> scalar path rather than the vector one was being used in the default
>>> case. Fix this by changing the code to take account of new offloads and
>>> deleting the defines used for the old ones.
>>>
>>> Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")
>>>
>>> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>>> ---
>> Apologies: forgot to add:
>>
>> v2: eliminate mask for offload flags, and use vector path only if
>> offloads == 0
>>
>
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Applied to dpdk-next-net/master, thanks.
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Tuesday, May 1, 2018 3:14 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>
> Subject: [PATCH v2] net/i40e: fix Tx fn selection when using new ethdev offloads
>
> The Tx function selection code in the driver only used the older txq
> flags values to check whether the scalar or vector functions should be
> used. This caused performance regressions with testpmd io-fwd as the
> scalar path rather than the vector one was being used in the default
> case. Fix this by changing the code to take account of new offloads and
> deleting the defines used for the old ones.
>
> Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
> drivers/net/i40e/i40e_rxtx.c | 39 ++++++++++++++++++---------------------
> 1 file changed, 18 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index ec1ce54ca..006f5b846 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -40,9 +40,6 @@
> /* Base address of the HW descriptor ring should be 128B aligned. */
> #define I40E_RING_BASE_ALIGN 128
>
> -#define I40E_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
> - ETH_TXQ_FLAGS_NOOFFLOADS)
> -
> #define I40E_TXD_CMD (I40E_TX_DESC_CMD_EOP | I40E_TX_DESC_CMD_RS)
>
> #ifdef RTE_LIBRTE_IEEE1588
> @@ -2108,11 +2105,9 @@ i40e_dev_tx_queue_setup_runtime(struct rte_eth_dev *dev,
> dev->data->nb_tx_queues)) {
> /**
> * If it is the first queue to setup,
> - * set all flags to default and call
> + * set all flags and call
> * i40e_set_tx_function.
> */
> - ad->tx_simple_allowed = true;
> - ad->tx_vec_allowed = true;
> i40e_set_tx_function_flag(dev, txq);
> i40e_set_tx_function(dev);
> return 0;
> @@ -2128,9 +2123,8 @@ i40e_dev_tx_queue_setup_runtime(struct rte_eth_dev *dev,
> }
> /* check simple tx conflict */
> if (ad->tx_simple_allowed) {
> - if (((txq->txq_flags & I40E_SIMPLE_FLAGS) !=
> - I40E_SIMPLE_FLAGS) ||
> - txq->tx_rs_thresh < RTE_PMD_I40E_TX_MAX_BURST) {
> + if (txq->offloads != 0 ||
> + txq->tx_rs_thresh < RTE_PMD_I40E_TX_MAX_BURST) {
> PMD_DRV_LOG(ERR, "No-simple tx is required.");
> return -EINVAL;
> }
> @@ -3080,18 +3074,21 @@ i40e_set_tx_function_flag(struct rte_eth_dev *dev, struct i40e_tx_queue *txq)
> I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
>
> /* Use a simple Tx queue (no offloads, no multi segs) if possible */
> - if (((txq->txq_flags & I40E_SIMPLE_FLAGS) == I40E_SIMPLE_FLAGS)
> - && (txq->tx_rs_thresh >= RTE_PMD_I40E_TX_MAX_BURST)) {
> - if (txq->tx_rs_thresh <= RTE_I40E_TX_MAX_FREE_BUF_SZ) {
> - PMD_INIT_LOG(DEBUG, "Vector tx"
> - " can be enabled on this txq.");
> -
> - } else {
> - ad->tx_vec_allowed = false;
> - }
> - } else {
> - ad->tx_simple_allowed = false;
> - }
> + ad->tx_simple_allowed = (txq->offloads == 0 &&
> + txq->tx_rs_thresh >= RTE_PMD_I40E_TX_MAX_BURST);
Actually, on second thought: who sets up txq->offloads?
I did a quick scan through the i40e code and it seems no one does.
So it now seems impossible to enable Tx offloads at all.
Konstantin
BTW, it seems rxq->offloads is not properly initialised either.
> + ad->tx_vec_allowed = (ad->tx_simple_allowed &&
> + txq->tx_rs_thresh <= RTE_I40E_TX_MAX_FREE_BUF_SZ);
> +
> + if (ad->tx_vec_allowed)
> + PMD_INIT_LOG(DEBUG, "Vector Tx can be enabled on Tx queue %u.",
> + txq->queue_id);
> + else if (ad->tx_simple_allowed)
> + PMD_INIT_LOG(DEBUG, "Simple Tx can be enabled on Tx queue %u.",
> + txq->queue_id);
> + else
> + PMD_INIT_LOG(DEBUG,
> + "Neither simple nor vector Tx enabled on Tx queue %u\n",
> + txq->queue_id);
> }
>
> void __attribute__((cold))
> --
> 2.14.3
On Tue, May 01, 2018 at 06:52:18PM +0100, Ananyev, Konstantin wrote:
>
>
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Tuesday, May 1, 2018 3:14 PM
> > To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
> > Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>
> > Subject: [PATCH v2] net/i40e: fix Tx fn selection when using new ethdev offloads
> >
> > The Tx function selection code in the driver only used the older txq
> > flags values to check whether the scalar or vector functions should be
> > used. This caused performance regressions with testpmd io-fwd as the
> > scalar path rather than the vector one was being used in the default
> > case. Fix this by changing the code to take account of new offloads and
> > deleting the defines used for the old ones.
> >
> > Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")
> >
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > ---
> > drivers/net/i40e/i40e_rxtx.c | 39 ++++++++++++++++++---------------------
> > 1 file changed, 18 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> > index ec1ce54ca..006f5b846 100644
> > --- a/drivers/net/i40e/i40e_rxtx.c
> > +++ b/drivers/net/i40e/i40e_rxtx.c
> > @@ -40,9 +40,6 @@
> > /* Base address of the HW descriptor ring should be 128B aligned. */
> > #define I40E_RING_BASE_ALIGN 128
> >
> > -#define I40E_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
> > - ETH_TXQ_FLAGS_NOOFFLOADS)
> > -
> > #define I40E_TXD_CMD (I40E_TX_DESC_CMD_EOP | I40E_TX_DESC_CMD_RS)
> >
> > #ifdef RTE_LIBRTE_IEEE1588
> > @@ -2108,11 +2105,9 @@ i40e_dev_tx_queue_setup_runtime(struct rte_eth_dev *dev,
> > dev->data->nb_tx_queues)) {
> > /**
> > * If it is the first queue to setup,
> > - * set all flags to default and call
> > + * set all flags and call
> > * i40e_set_tx_function.
> > */
> > - ad->tx_simple_allowed = true;
> > - ad->tx_vec_allowed = true;
> > i40e_set_tx_function_flag(dev, txq);
> > i40e_set_tx_function(dev);
> > return 0;
> > @@ -2128,9 +2123,8 @@ i40e_dev_tx_queue_setup_runtime(struct rte_eth_dev *dev,
> > }
> > /* check simple tx conflict */
> > if (ad->tx_simple_allowed) {
> > - if (((txq->txq_flags & I40E_SIMPLE_FLAGS) !=
> > - I40E_SIMPLE_FLAGS) ||
> > - txq->tx_rs_thresh < RTE_PMD_I40E_TX_MAX_BURST) {
> > + if (txq->offloads != 0 ||
> > + txq->tx_rs_thresh < RTE_PMD_I40E_TX_MAX_BURST) {
> > PMD_DRV_LOG(ERR, "No-simple tx is required.");
> > return -EINVAL;
> > }
> > @@ -3080,18 +3074,21 @@ i40e_set_tx_function_flag(struct rte_eth_dev *dev, struct i40e_tx_queue *txq)
> > I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
> >
> > /* Use a simple Tx queue (no offloads, no multi segs) if possible */
> > - if (((txq->txq_flags & I40E_SIMPLE_FLAGS) == I40E_SIMPLE_FLAGS)
> > - && (txq->tx_rs_thresh >= RTE_PMD_I40E_TX_MAX_BURST)) {
> > - if (txq->tx_rs_thresh <= RTE_I40E_TX_MAX_FREE_BUF_SZ) {
> > - PMD_INIT_LOG(DEBUG, "Vector tx"
> > - " can be enabled on this txq.");
> > -
> > - } else {
> > - ad->tx_vec_allowed = false;
> > - }
> > - } else {
> > - ad->tx_simple_allowed = false;
> > - }
> > + ad->tx_simple_allowed = (txq->offloads == 0 &&
> > + txq->tx_rs_thresh >= RTE_PMD_I40E_TX_MAX_BURST);
>
> Actually, on second thought: who sets up txq->offloads?
> I did a quick scan through the i40e code and it seems no one does.
> So it now seems impossible to enable Tx offloads at all.
> Konstantin
>
> BTW, it seems rxq->offloads is not properly initialised either.
>
The offloads value should come from the app, no?
> -----Original Message-----
> From: Richardson, Bruce
> Sent: Wednesday, May 2, 2018 4:25 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>;
> dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>
> Subject: Re: [PATCH v2] net/i40e: fix Tx fn selection when using new ethdev
> offloads
>
> On Tue, May 01, 2018 at 06:52:18PM +0100, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Richardson, Bruce
> > > Sent: Tuesday, May 1, 2018 3:14 PM
> > > To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z
> > > <qi.z.zhang@intel.com>
> > > Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>; Ananyev,
> > > Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce
> > > <bruce.richardson@intel.com>
> > > Subject: [PATCH v2] net/i40e: fix Tx fn selection when using new
> > > ethdev offloads
> > >
> > > The Tx function selection code in the driver only used the older txq
> > > flags values to check whether the scalar or vector functions should
> > > be used. This caused performance regressions with testpmd io-fwd as
> > > the scalar path rather than the vector one was being used in the
> > > default case. Fix this by changing the code to take account of new
> > > offloads and deleting the defines used for the old ones.
> > >
> > > Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")
> > >
> > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> > > ---
> > > > drivers/net/i40e/i40e_rxtx.c | 39 ++++++++++++++++++---------------------
> > > 1 file changed, 18 insertions(+), 21 deletions(-)
> > >
> > > > diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> > > > index ec1ce54ca..006f5b846 100644
> > > --- a/drivers/net/i40e/i40e_rxtx.c
> > > +++ b/drivers/net/i40e/i40e_rxtx.c
> > > @@ -40,9 +40,6 @@
> > > /* Base address of the HW descriptor ring should be 128B aligned. */
> > > #define I40E_RING_BASE_ALIGN 128
> > >
> > > > -#define I40E_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
> > > - ETH_TXQ_FLAGS_NOOFFLOADS)
> > > -
> > > > #define I40E_TXD_CMD (I40E_TX_DESC_CMD_EOP | I40E_TX_DESC_CMD_RS)
> > >
> > > #ifdef RTE_LIBRTE_IEEE1588
> > > @@ -2108,11 +2105,9 @@ i40e_dev_tx_queue_setup_runtime(struct
> rte_eth_dev *dev,
> > > dev->data->nb_tx_queues)) {
> > > /**
> > > * If it is the first queue to setup,
> > > - * set all flags to default and call
> > > + * set all flags and call
> > > * i40e_set_tx_function.
> > > */
> > > - ad->tx_simple_allowed = true;
> > > - ad->tx_vec_allowed = true;
> > > i40e_set_tx_function_flag(dev, txq);
> > > i40e_set_tx_function(dev);
> > > return 0;
> > > @@ -2128,9 +2123,8 @@ i40e_dev_tx_queue_setup_runtime(struct
> rte_eth_dev *dev,
> > > }
> > > /* check simple tx conflict */
> > > if (ad->tx_simple_allowed) {
> > > - if (((txq->txq_flags & I40E_SIMPLE_FLAGS) !=
> > > - I40E_SIMPLE_FLAGS) ||
> > > - txq->tx_rs_thresh < RTE_PMD_I40E_TX_MAX_BURST) {
> > > + if (txq->offloads != 0 ||
> > > > + txq->tx_rs_thresh < RTE_PMD_I40E_TX_MAX_BURST) {
> > > PMD_DRV_LOG(ERR, "No-simple tx is required.");
> > > return -EINVAL;
> > > }
> > > @@ -3080,18 +3074,21 @@ i40e_set_tx_function_flag(struct
> rte_eth_dev *dev, struct i40e_tx_queue *txq)
> > > I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
> > >
> > > /* Use a simple Tx queue (no offloads, no multi segs) if possible */
> > > - if (((txq->txq_flags & I40E_SIMPLE_FLAGS) == I40E_SIMPLE_FLAGS)
> > > - && (txq->tx_rs_thresh >= RTE_PMD_I40E_TX_MAX_BURST)) {
> > > - if (txq->tx_rs_thresh <= RTE_I40E_TX_MAX_FREE_BUF_SZ) {
> > > - PMD_INIT_LOG(DEBUG, "Vector tx"
> > > - " can be enabled on this txq.");
> > > -
> > > - } else {
> > > - ad->tx_vec_allowed = false;
> > > - }
> > > - } else {
> > > - ad->tx_simple_allowed = false;
> > > - }
> > > + ad->tx_simple_allowed = (txq->offloads == 0 &&
> > > + txq->tx_rs_thresh >= RTE_PMD_I40E_TX_MAX_BURST);
> >
> > Actually, on second thought: who sets up txq->offloads?
> > I did a quick scan through the i40e code and it seems no one does.
> > So it now seems impossible to enable Tx offloads at all.
> > Konstantin
> >
> > BTW, it seems rxq->offloads is not properly initialised either.
> >
> The offloads value should come from the app, no?
This is a separate issue; I have submitted a fix:
http://dpdk.org/dev/patchwork/patch/39229/
Regards
Qi
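The concern raised above, that a check on txq->offloads only helps if queue setup actually stores the application's value, can be pictured with a small standalone sketch. Nothing here is taken from the driver or from the linked fix: the demo_ types, both setup functions, the offload bit and the burst threshold are hypothetical and exist only to show the effect of a missing copy.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_TX_MAX_BURST  32        /* placeholder threshold (assumption) */
#define DEMO_OFFLOAD_BIT   (1ULL << 0) /* hypothetical offload flag */

/* Hypothetical per-queue configuration supplied by the application. */
struct demo_txconf {
	uint64_t offloads;
};

/* Hypothetical driver-side queue state. */
struct demo_txq {
	uint64_t offloads;
	uint16_t tx_rs_thresh;
};

/* Setup that forgets to record the requested offloads. */
static void
demo_tx_queue_setup_missing_copy(struct demo_txq *txq,
		const struct demo_txconf *conf, uint16_t tx_rs_thresh)
{
	(void)conf;            /* requested offloads are silently dropped */
	txq->offloads = 0;
	txq->tx_rs_thresh = tx_rs_thresh;
}

/* Setup that copies the application's request into the queue. */
static void
demo_tx_queue_setup(struct demo_txq *txq,
		const struct demo_txconf *conf, uint16_t tx_rs_thresh)
{
	txq->offloads = conf->offloads;
	txq->tx_rs_thresh = tx_rs_thresh;
}

/* Same shape as the check in the patch: simple Tx only without offloads. */
static bool
demo_simple_allowed(const struct demo_txq *txq)
{
	return txq->offloads == 0 &&
		txq->tx_rs_thresh >= DEMO_TX_MAX_BURST;
}
```

In this sketch, an application that requests DEMO_OFFLOAD_BIT still gets the simple path after demo_tx_queue_setup_missing_copy(), because the check never sees the request; after demo_tx_queue_setup() the same request correctly disables it.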
@@ -40,9 +40,6 @@
/* Base address of the HW descriptor ring should be 128B aligned. */
#define I40E_RING_BASE_ALIGN 128
-#define I40E_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
- ETH_TXQ_FLAGS_NOOFFLOADS)
-
#define I40E_TXD_CMD (I40E_TX_DESC_CMD_EOP | I40E_TX_DESC_CMD_RS)
#ifdef RTE_LIBRTE_IEEE1588
@@ -2108,11 +2105,9 @@ i40e_dev_tx_queue_setup_runtime(struct rte_eth_dev *dev,
dev->data->nb_tx_queues)) {
/**
* If it is the first queue to setup,
- * set all flags to default and call
+ * set all flags and call
* i40e_set_tx_function.
*/
- ad->tx_simple_allowed = true;
- ad->tx_vec_allowed = true;
i40e_set_tx_function_flag(dev, txq);
i40e_set_tx_function(dev);
return 0;
@@ -2128,9 +2123,8 @@ i40e_dev_tx_queue_setup_runtime(struct rte_eth_dev *dev,
}
/* check simple tx conflict */
if (ad->tx_simple_allowed) {
- if (((txq->txq_flags & I40E_SIMPLE_FLAGS) !=
- I40E_SIMPLE_FLAGS) ||
- txq->tx_rs_thresh < RTE_PMD_I40E_TX_MAX_BURST) {
+ if (txq->offloads != 0 ||
+ txq->tx_rs_thresh < RTE_PMD_I40E_TX_MAX_BURST) {
PMD_DRV_LOG(ERR, "No-simple tx is required.");
return -EINVAL;
}
@@ -3080,18 +3074,21 @@ i40e_set_tx_function_flag(struct rte_eth_dev *dev, struct i40e_tx_queue *txq)
I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
/* Use a simple Tx queue (no offloads, no multi segs) if possible */
- if (((txq->txq_flags & I40E_SIMPLE_FLAGS) == I40E_SIMPLE_FLAGS)
- && (txq->tx_rs_thresh >= RTE_PMD_I40E_TX_MAX_BURST)) {
- if (txq->tx_rs_thresh <= RTE_I40E_TX_MAX_FREE_BUF_SZ) {
- PMD_INIT_LOG(DEBUG, "Vector tx"
- " can be enabled on this txq.");
-
- } else {
- ad->tx_vec_allowed = false;
- }
- } else {
- ad->tx_simple_allowed = false;
- }
+ ad->tx_simple_allowed = (txq->offloads == 0 &&
+ txq->tx_rs_thresh >= RTE_PMD_I40E_TX_MAX_BURST);
+ ad->tx_vec_allowed = (ad->tx_simple_allowed &&
+ txq->tx_rs_thresh <= RTE_I40E_TX_MAX_FREE_BUF_SZ);
+
+ if (ad->tx_vec_allowed)
+ PMD_INIT_LOG(DEBUG, "Vector Tx can be enabled on Tx queue %u.",
+ txq->queue_id);
+ else if (ad->tx_simple_allowed)
+ PMD_INIT_LOG(DEBUG, "Simple Tx can be enabled on Tx queue %u.",
+ txq->queue_id);
+ else
+ PMD_INIT_LOG(DEBUG,
+ "Neither simple nor vector Tx enabled on Tx queue %u\n",
+ txq->queue_id);
}
void __attribute__((cold))