[PATCH 21.11 2/2] net/mlx5: fix port event cleaning order

Michael Baum michaelba at nvidia.com
Thu Nov 24 08:53:20 CET 2022


[ upstream commit 13c5c093905c09bb6207ee1c6a4f05d39f8badcd ]

The shared IB device (sh) has per port data with filed for interrupt
handler port_id. It used by shared interrupt handler to find the
corresponding rte_eth device by IB port index.
If value is equal or greater RTE_MAX_ETHPORTS it means there is no
subhandler installed for specified IB port index.

When a few ports are created under same sh, the sh is created with the
first port and the interrupt handler port_id is initialized to
RTE_MAX_ETHPORTS for each port.
In port creation, the interrupt handler port_id is updated with the
correct value. Since this updating, the mlx5_dev_interrupt_nl_cb
function uses this port and its priv structure.
However, when the ports are closed, this filed isn't updated and the
interrupt handler continue working until it is uninstalled in SH
destruction.
If mlx5_dev_interrupt_nl_cb is called between port closing and SH
destruction, it uses invalid port causing a crash.

This patch adds interrupt handler port_id updating to the close function
and add memory barrier to make sure it is done before priv reset.

Fixes: 3a5e1aaee4cd ("net/mlx5: fix initial link status detection")

Signed-off-by: Michael Baum <michaelba at nvidia.com>
Acked-by: Matan Azrad <matan at nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 3 +++
 drivers/net/mlx5/mlx5.c          | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 4197ec424e..2fee13d50f 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1802,6 +1802,9 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	return eth_dev;
 error:
 	if (priv) {
+		priv->sh->port[priv->dev_port - 1].nl_ih_port_id =
+							       RTE_MAX_ETHPORTS;
+		rte_io_wmb();
 		if (priv->mreg_cp_tbl)
 			mlx5_hlist_destroy(priv->mreg_cp_tbl);
 		if (priv->sh)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index f79e71fe1a..c5f2db1e4a 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1656,6 +1656,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 			dev->data->port_id);
 	if (priv->hrxqs)
 		mlx5_list_destroy(priv->hrxqs);
+	priv->sh->port[priv->dev_port - 1].nl_ih_port_id = RTE_MAX_ETHPORTS;
+	/*
+	 * The interrupt handler port id must be reset before priv is reset
+	 * since 'mlx5_dev_interrupt_nl_cb' uses priv.
+	 */
+	rte_io_wmb();
 	/*
 	 * Free the shared context in last turn, because the cleanup
 	 * routines above may use some shared fields, like
-- 
2.25.1



More information about the stable mailing list