patch 'net/mlx5: fix port event cleaning order' has been queued to stable release 20.11.7

Michael Baum michaelba at nvidia.com
Fri Nov 18 13:53:54 CET 2022


Hi Luca,

This patch causes another issue, so I have sent another patch to be squashed into it.

The title of that patch is: "[PATCH 20.11] net/mlx5: fix invalid memory access in port closing"

Thanks,
Michael Baum

> -----Original Message-----
> From: luca.boccassi at gmail.com <luca.boccassi at gmail.com>
> Sent: Friday, 18 November 2022 1:09
> To: Michael Baum <michaelba at nvidia.com>
> Cc: Matan Azrad <matan at nvidia.com>; dpdk stable <stable at dpdk.org>
> Subject: patch 'net/mlx5: fix port event cleaning order' has been queued to
> stable release 20.11.7
> 
> Hi,
> 
> FYI, your patch has been queued to stable release 20.11.7
> 
> Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
> It will be pushed if I get no objections before 11/19/22. So please shout if
> anyone has objections.
> 
> Also note that after the patch there's a diff of the upstream commit vs the patch
> applied to the branch. This will indicate if there was any rebasing needed to
> apply to the stable branch. If there were code changes for rebasing
> (ie: not only metadata diffs), please double check that the rebase was correctly
> done.
> 
> Queued patches are on a temporary branch at:
> https://github.com/kevintraynor/dpdk-stable
> 
> This queued commit can be viewed at:
> https://github.com/kevintraynor/dpdk-stable/commit/79c37d65d2ff68ccd8dd2ad99340f54c80232918
> 
> Thanks.
> 
> Luca Boccassi
> 
> ---
> From 79c37d65d2ff68ccd8dd2ad99340f54c80232918 Mon Sep 17 00:00:00 2001
> From: Michael Baum <michaelba at nvidia.com>
> Date: Thu, 10 Nov 2022 00:29:38 +0200
> Subject: [PATCH] net/mlx5: fix port event cleaning order
> 
> [ upstream commit 13c5c093905c09bb6207ee1c6a4f05d39f8badcd ]
> 
> The shared IB device (sh) has per-port data with a field for the interrupt
> handler port_id. It is used by the shared interrupt handler to find the
> corresponding rte_eth device by IB port index.
> If the value is equal to or greater than RTE_MAX_ETHPORTS, it means there is
> no subhandler installed for the specified IB port index.
> 
> When several ports are created under the same sh, the sh is created with the
> first port and the interrupt handler port_id is initialized to
> RTE_MAX_ETHPORTS for each port.
> In port creation, the interrupt handler port_id is updated with the correct
> value. After this update, the mlx5_dev_interrupt_nl_cb function uses this
> port and its priv structure.
> However, when the ports are closed, this field isn't updated and the
> interrupt handler continues working until it is uninstalled in SH destruction.
> If mlx5_dev_interrupt_nl_cb is called between port closing and SH
> destruction, it uses an invalid port, causing a crash.
> 
> This patch adds an interrupt handler port_id update to the close function and
> adds a memory barrier to make sure it is done before the priv reset.
> 
> Fixes: 655c3c26c11e ("net/mlx5: fix initial link status detection")
> 
> Signed-off-by: Michael Baum <michaelba at nvidia.com>
> Acked-by: Matan Azrad <matan at nvidia.com>
> ---
>  drivers/net/mlx5/linux/mlx5_os.c | 3 +++
>  drivers/net/mlx5/mlx5.c          | 6 ++++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
> index af19b54b7e..e79b1a275c 100644
> --- a/drivers/net/mlx5/linux/mlx5_os.c
> +++ b/drivers/net/mlx5/linux/mlx5_os.c
> @@ -1640,6 +1640,9 @@ err_secondary:
>         return eth_dev;
>  error:
>         if (priv) {
> +               priv->sh->port[priv->dev_port - 1].nl_ih_port_id =
> +                                                              RTE_MAX_ETHPORTS;
> +               rte_io_wmb();
>                 if (priv->mreg_cp_tbl)
>                         mlx5_hlist_destroy(priv->mreg_cp_tbl);
>                 if (priv->sh)
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 90985479de..22d3ecace2 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -1453,6 +1453,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
>                 if (!c)
>                         claim_zero(rte_eth_switch_domain_free(priv->domain_id));
>         }
> +       priv->sh->port[priv->dev_port - 1].nl_ih_port_id = RTE_MAX_ETHPORTS;
> +       /*
> +        * The interrupt handler port id must be reset before priv is reset
> +        * since 'mlx5_dev_interrupt_nl_cb' uses priv.
> +        */
> +       rte_io_wmb();
>         memset(priv, 0, sizeof(*priv));
>         priv->domain_id = RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID;
>         /*
> --
> 2.34.1
> 
> ---
>   Diff of the applied patch vs upstream commit (please double-check if non-empty):
> ---
> --- -   2022-11-17 23:07:56.193400526 +0000
> +++ 0016-net-mlx5-fix-port-event-cleaning-order.patch   2022-11-17 23:07:55.492330367 +0000
> @@ -1 +1 @@
> -From 13c5c093905c09bb6207ee1c6a4f05d39f8badcd Mon Sep 17 00:00:00 2001
> +From 79c37d65d2ff68ccd8dd2ad99340f54c80232918 Mon Sep 17 00:00:00 2001
> @@ -5,0 +6,2 @@
> +[ upstream commit 13c5c093905c09bb6207ee1c6a4f05d39f8badcd ]
> +
> @@ -28 +29,0 @@
> -Cc: stable at dpdk.org
> @@ -38 +39 @@
> -index 2b6741396d..a71474c90a 100644
> +index af19b54b7e..e79b1a275c 100644
> @@ -41 +42 @@
> -@@ -1676,6 +1676,9 @@ err_secondary:
> +@@ -1640,6 +1640,9 @@ err_secondary:
> @@ -48,3 +49,3 @@
> - #ifdef HAVE_MLX5_HWS_SUPPORT
> -               if (eth_dev &&
> -                   priv->sh &&
> +               if (priv->mreg_cp_tbl)
> +                       mlx5_hlist_destroy(priv->mreg_cp_tbl);
> +               if (priv->sh)
> @@ -52 +53 @@
> -index 1cf6df6049..95b0151fbc 100644
> +index 90985479de..22d3ecace2 100644
> @@ -55 +56 @@
> -@@ -2137,6 +2137,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
> +@@ -1453,6 +1453,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
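
For readers following the commit message quoted above, the ordering it describes can be sketched in plain C. This is only a rough model under stated assumptions, not the mlx5 driver code: every name here (shared_ctx, port_priv, port_open, port_close, handle_event, MAX_ETHPORTS) is hypothetical, MAX_ETHPORTS stands in for RTE_MAX_ETHPORTS, and a C11 release fence stands in for rte_io_wmb(). It shows the sentinel check in the handler and the "reset the slot first, then wipe priv" order on close.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define MAX_ETHPORTS 32u /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define IB_PORTS 2u      /* hypothetical number of IB ports under one sh */

/* Hypothetical per-port private data. */
struct port_priv {
	uint32_t dev_port; /* 1-based IB port index */
	int link_up;
};

/* Hypothetical shared context: one handler port id slot per IB port. */
struct shared_ctx {
	_Atomic uint32_t ih_port_id[IB_PORTS];
};

/* Hypothetical ethdev table indexed by ethdev port id. */
static struct port_priv *eth_devs[MAX_ETHPORTS];

/* Every slot starts at the sentinel: no subhandler installed. */
static void shared_init(struct shared_ctx *sh)
{
	for (uint32_t i = 0; i < IB_PORTS; i++)
		atomic_init(&sh->ih_port_id[i], MAX_ETHPORTS);
}

/* Port creation publishes the ethdev port id only after priv is set up. */
static void port_open(struct shared_ctx *sh, struct port_priv *priv,
		      uint32_t eth_port_id, uint32_t dev_port)
{
	priv->dev_port = dev_port;
	priv->link_up = 1;
	eth_devs[eth_port_id] = priv;
	atomic_store_explicit(&sh->ih_port_id[dev_port - 1], eth_port_id,
			      memory_order_release);
}

/* Port close: reset the slot to the sentinel BEFORE priv is wiped. */
static void port_close(struct shared_ctx *sh, struct port_priv *priv,
		       uint32_t eth_port_id)
{
	atomic_store_explicit(&sh->ih_port_id[priv->dev_port - 1],
			      MAX_ETHPORTS, memory_order_relaxed);
	/* Stand-in for the write barrier placed before the priv reset. */
	atomic_thread_fence(memory_order_release);
	memset(priv, 0, sizeof(*priv));
	eth_devs[eth_port_id] = NULL;
}

/* Handler side: returns 1 if the event was delivered, 0 if it was skipped. */
static int handle_event(struct shared_ctx *sh, uint32_t ib_index)
{
	uint32_t id = atomic_load_explicit(&sh->ih_port_id[ib_index],
					   memory_order_acquire);

	if (id >= MAX_ETHPORTS)
		return 0; /* sentinel: port never opened or already closed */
	return eth_devs[id]->link_up;
}
```

Calling handle_event() before port_open() or after port_close() takes the sentinel branch and never touches the wiped priv. The sketch models the ordering only; it does not cover a handler that read the slot just before the close started, which a real driver would need to handle separately.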

