patch 'net/mlx5: fix port event cleaning order' has been queued to stable release 20.11.7
Michael Baum
michaelba at nvidia.com
Fri Nov 18 13:53:54 CET 2022
Hi Luca,
This patch introduces another issue, so I have sent a follow-up patch to be squashed into it.
The title of the follow-up patch is: "[PATCH 20.11] net/mlx5: fix invalid memory access in port closing"
Thanks,
Michael Baum
> -----Original Message-----
> From: luca.boccassi at gmail.com <luca.boccassi at gmail.com>
> Sent: Friday, 18 November 2022 1:09
> To: Michael Baum <michaelba at nvidia.com>
> Cc: Matan Azrad <matan at nvidia.com>; dpdk stable <stable at dpdk.org>
> Subject: patch 'net/mlx5: fix port event cleaning order' has been queued to
> stable release 20.11.7
>
> Hi,
>
> FYI, your patch has been queued to stable release 20.11.7
>
> Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
> It will be pushed if I get no objections before 11/19/22. So please shout if
> anyone has objections.
>
> Also note that after the patch there's a diff of the upstream commit vs the patch
> applied to the branch. This will indicate if there was any rebasing needed to
> apply to the stable branch. If there were code changes for rebasing
> (ie: not only metadata diffs), please double check that the rebase was correctly
> done.
>
> Queued patches are on a temporary branch at:
> https://github.com/kevintraynor/dpdk-stable
>
> This queued commit can be viewed at:
> https://github.com/kevintraynor/dpdk-stable/commit/79c37d65d2ff68ccd8dd2ad99340f54c80232918
>
> Thanks.
>
> Luca Boccassi
>
> ---
> From 79c37d65d2ff68ccd8dd2ad99340f54c80232918 Mon Sep 17 00:00:00 2001
> From: Michael Baum <michaelba at nvidia.com>
> Date: Thu, 10 Nov 2022 00:29:38 +0200
> Subject: [PATCH] net/mlx5: fix port event cleaning order
>
> [ upstream commit 13c5c093905c09bb6207ee1c6a4f05d39f8badcd ]
>
> The shared IB device (sh) has per-port data with a field for the interrupt
> handler port_id. It is used by the shared interrupt handler to find the
> corresponding rte_eth device by IB port index.
> If the value is equal to or greater than RTE_MAX_ETHPORTS, it means there is no
> subhandler installed for the specified IB port index.
>
> When several ports are created under the same sh, the sh is created with the
> first port and the interrupt handler port_id is initialized to
> RTE_MAX_ETHPORTS for each port.
> In port creation, the interrupt handler port_id is updated with the correct
> value. After this update, the mlx5_dev_interrupt_nl_cb function uses this port
> and its priv structure.
> However, when the ports are closed, this field isn't updated and the interrupt
> handler continues working until it is uninstalled in SH destruction.
> If mlx5_dev_interrupt_nl_cb is called between port closing and SH destruction,
> it uses an invalid port, causing a crash.
>
> This patch adds interrupt handler port_id updating to the close function and
> adds a memory barrier to make sure it is done before priv reset.
>
> Fixes: 655c3c26c11e ("net/mlx5: fix initial link status detection")
>
> Signed-off-by: Michael Baum <michaelba at nvidia.com>
> Acked-by: Matan Azrad <matan at nvidia.com>
> ---
> drivers/net/mlx5/linux/mlx5_os.c | 3 +++
> drivers/net/mlx5/mlx5.c | 6 ++++++
> 2 files changed, 9 insertions(+)
>
> diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
> index af19b54b7e..e79b1a275c 100644
> --- a/drivers/net/mlx5/linux/mlx5_os.c
> +++ b/drivers/net/mlx5/linux/mlx5_os.c
> @@ -1640,6 +1640,9 @@ err_secondary:
>  	return eth_dev;
>  error:
>  	if (priv) {
> +		priv->sh->port[priv->dev_port - 1].nl_ih_port_id =
> +							RTE_MAX_ETHPORTS;
> +		rte_io_wmb();
>  		if (priv->mreg_cp_tbl)
>  			mlx5_hlist_destroy(priv->mreg_cp_tbl);
>  		if (priv->sh)
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 90985479de..22d3ecace2 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -1453,6 +1453,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
>  		if (!c)
>  			claim_zero(rte_eth_switch_domain_free(priv->domain_id));
>  	}
> +	priv->sh->port[priv->dev_port - 1].nl_ih_port_id = RTE_MAX_ETHPORTS;
> +	/*
> +	 * The interrupt handler port id must be reset before priv is reset
> +	 * since 'mlx5_dev_interrupt_nl_cb' uses priv.
> +	 */
> +	rte_io_wmb();
>  	memset(priv, 0, sizeof(*priv));
>  	priv->domain_id = RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID;
>  	/*
> --
> 2.34.1
>
> ---
> Diff of the applied patch vs upstream commit (please double-check if non-empty):
> ---
> --- - 2022-11-17 23:07:56.193400526 +0000
> +++ 0016-net-mlx5-fix-port-event-cleaning-order.patch 2022-11-17 23:07:55.492330367 +0000
> @@ -1 +1 @@
> -From 13c5c093905c09bb6207ee1c6a4f05d39f8badcd Mon Sep 17 00:00:00 2001
> +From 79c37d65d2ff68ccd8dd2ad99340f54c80232918 Mon Sep 17 00:00:00 2001
> @@ -5,0 +6,2 @@
> +[ upstream commit 13c5c093905c09bb6207ee1c6a4f05d39f8badcd ]
> +
> @@ -28 +29,0 @@
> -Cc: stable at dpdk.org
> @@ -38 +39 @@
> -index 2b6741396d..a71474c90a 100644
> +index af19b54b7e..e79b1a275c 100644
> @@ -41 +42 @@
> -@@ -1676,6 +1676,9 @@ err_secondary:
> +@@ -1640,6 +1640,9 @@ err_secondary:
> @@ -48,3 +49,3 @@
> - #ifdef HAVE_MLX5_HWS_SUPPORT
> - if (eth_dev &&
> - priv->sh &&
> + if (priv->mreg_cp_tbl)
> + mlx5_hlist_destroy(priv->mreg_cp_tbl);
> + if (priv->sh)
> @@ -52 +53 @@
> -index 1cf6df6049..95b0151fbc 100644
> +index 90985479de..22d3ecace2 100644
> @@ -55 +56 @@
> -@@ -2137,6 +2137,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
> +@@ -1453,6 +1453,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
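
For readers following the thread, the ordering described in the quoted commit
message (reset the interrupt handler port_id to the "no subhandler" value,
issue a write barrier, and only then clear priv) can be sketched in plain C11
as below. This is an illustrative sketch, not the mlx5 driver code: the types
shared_ctx and port_priv, the helper functions, and the constants MAX_ETHPORTS
and IB_PORTS are hypothetical stand-ins, and a C11 release fence is assumed as
a rough analogue of rte_io_wmb(). It only demonstrates the order of operations
in a single thread; it does not model a handler that is already running while
the port is being closed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define MAX_ETHPORTS 32u /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define IB_PORTS 2u      /* hypothetical number of IB ports under one sh */

/* Hypothetical stand-in for the per-port priv structure. */
struct port_priv {
	int link_up;
};

/* Hypothetical stand-in for the shared IB device context (sh). */
struct shared_ctx {
	_Atomic uint32_t nl_ih_port_id[IB_PORTS];
};

static struct port_priv privs[MAX_ETHPORTS];

/* sh creation: every IB port starts with "no subhandler installed". */
static void
shared_init(struct shared_ctx *sh)
{
	for (unsigned int i = 0; i < IB_PORTS; i++)
		atomic_init(&sh->nl_ih_port_id[i], MAX_ETHPORTS);
}

/* Port creation: set up priv first, then publish the port id. */
static void
port_create(struct shared_ctx *sh, unsigned int ib_idx, uint32_t port_id)
{
	assert(ib_idx < IB_PORTS && port_id < MAX_ETHPORTS);
	privs[port_id].link_up = 1;
	atomic_store_explicit(&sh->nl_ih_port_id[ib_idx], port_id,
			      memory_order_release);
}

/* Port close: unpublish the port id, fence, and only then clear priv. */
static void
port_close(struct shared_ctx *sh, unsigned int ib_idx, uint32_t port_id)
{
	assert(ib_idx < IB_PORTS && port_id < MAX_ETHPORTS);
	atomic_store_explicit(&sh->nl_ih_port_id[ib_idx], MAX_ETHPORTS,
			      memory_order_relaxed);
	/* Assumed analogue of rte_io_wmb(): order the store above before the memset. */
	atomic_thread_fence(memory_order_release);
	memset(&privs[port_id], 0, sizeof(privs[port_id]));
}

/* Shared handler: returns 1 if the event reached a live port, 0 if skipped. */
static int
handle_event(struct shared_ctx *sh, unsigned int ib_idx)
{
	uint32_t id = atomic_load_explicit(&sh->nl_ih_port_id[ib_idx],
					   memory_order_acquire);

	if (id >= MAX_ETHPORTS)
		return 0; /* no subhandler installed for this IB port index */
	return privs[id].link_up;
}
```

In this sketch, handle_event() skips the event both before port_create() and
after port_close(), which is the behaviour the commit message asks for between
port closing and SH destruction.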