patch 'net/mlx5: fix hairpin queue unbind' has been queued to stable release 20.11.10

luca.boccassi at gmail.com luca.boccassi at gmail.com
Wed Nov 15 12:45:11 CET 2023


Hi,

FYI, your patch has been queued to stable release 20.11.10

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/17/23. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(ie: not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://github.com/bluca/dpdk-stable

This queued commit can be viewed at:
https://github.com/bluca/dpdk-stable/commit/7243435622b41dd4a9fe1d0e59f62903886353ca

Thanks.

Luca Boccassi

---
>From 7243435622b41dd4a9fe1d0e59f62903886353ca Mon Sep 17 00:00:00 2001
From: Dariusz Sosnowski <dsosnowski at nvidia.com>
Date: Thu, 9 Nov 2023 20:01:09 +0200
Subject: [PATCH] net/mlx5: fix hairpin queue unbind

[ upstream commit ab2439f80bdf94e2382efe941cf827da6710b5d7 ]

Let's take an application with the following configuration:

- It uses 2 ports.
- Each port has 3 Rx queues and 3 Tx queues.
- On each port, Rx queues have a following purposes:
  - Rx queue 0 - SW queue,
  - Rx queue 1 - hairpin queue, bound to Tx queue on the same port,
  - Rx queue 2 - hairpin queue, bound to Tx queue on another port.
- On each port, Tx queues have a following purposes:
  - Tx queue 0 - SW queue,
  - Tx queue 1 - hairpin queue, bound to Rx queue on the same port,
  - Tx queue 2 - hairpin queue, bound to Rx queue on another port.
- Application configured all of the hairpin queues for manual binding.

After ports are configured and queues are set up,
if the application does the following API call sequence:

1. rte_eth_dev_start(port_id=0)
2. rte_eth_hairpin_bind(tx_port=0, rx_port=0)
3. rte_eth_hairpin_bind(tx_port=0, rx_port=1)

mlx5 PMD fails to modify SQ and logs this error:

  mlx5_common: mlx5_devx_cmds.c:2079: mlx5_devx_cmd_modify_sq():
    Failed to modify SQ using DevX

This error was caused by an incorrect unbind operation taken during
error handling inside call (3).

(3) fails, because port 1 (Rx side of the hairpin) was not started.
As a result of this failure, PMD goes into error handling, where all
previously bound hairpin queues are unbound.
This is incorrect, since this error handling procedure
in rte_eth_hairpin_bind() implementation assumes that
all hairpin queues are bound to the same rx_port, which is not the case.
The following sequence of function calls appears:

- rte_eth_hairpin_queue_peer_unbind(rx_port=**1**, rx_queue=1, 0),
- mlx5_hairpin_queue_peer_unbind(dev=**port 0**, tx_queue=1, 1).

Which violates the hairpin queue destroy flow, by unbinding Tx queue 1
on port 0, before unbinding Rx queue 1 on port 1.

This patch fixes that behavior, by filtering Tx queues on which error
handling is done to only affect:

- hairpin queues (it also reduces unnecessary debug log messages),
- hairpin queues connected to the rx_port which is currently processed.

Fixes: 37cd4501e873 ("net/mlx5: support two ports hairpin mode")

Signed-off-by: Dariusz Sosnowski <dsosnowski at nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
---
 drivers/net/mlx5/mlx5_trigger.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 93181d6c93..fb61094e45 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -820,6 +820,11 @@ error:
 		txq_ctrl = mlx5_txq_get(dev, i);
 		if (txq_ctrl == NULL)
 			continue;
+		if (txq_ctrl->type != MLX5_TXQ_TYPE_HAIRPIN ||
+		    txq_ctrl->hairpin_conf.peers[0].port != rx_port) {
+			mlx5_txq_release(dev, i);
+			continue;
+		}
 		rx_queue = txq_ctrl->hairpin_conf.peers[0].queue;
 		rte_eth_hairpin_queue_peer_unbind(rx_port, rx_queue, 0);
 		mlx5_hairpin_queue_peer_unbind(dev, i, 1);
-- 
2.39.2

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty:
---
--- -	2023-11-15 11:44:14.122525380 +0000
+++ 0012-net-mlx5-fix-hairpin-queue-unbind.patch	2023-11-15 11:44:13.602388184 +0000
@@ -1 +1 @@
-From ab2439f80bdf94e2382efe941cf827da6710b5d7 Mon Sep 17 00:00:00 2001
+From 7243435622b41dd4a9fe1d0e59f62903886353ca Mon Sep 17 00:00:00 2001
@@ -5,0 +6,2 @@
+[ upstream commit ab2439f80bdf94e2382efe941cf827da6710b5d7 ]
+
@@ -56 +57,0 @@
-Cc: stable at dpdk.org
@@ -65 +66 @@
-index 7694140537..4b5becc10c 100644
+index 93181d6c93..fb61094e45 100644
@@ -68 +69 @@
-@@ -845,6 +845,11 @@ error:
+@@ -820,6 +820,11 @@ error:
@@ -72 +73 @@
-+		if (!txq_ctrl->is_hairpin ||
++		if (txq_ctrl->type != MLX5_TXQ_TYPE_HAIRPIN ||


More information about the stable mailing list