patch 'common/mlx5: fix shared mempool subscription' has been queued to stable release 21.11.3

Kevin Traynor ktraynor at redhat.com
Wed Nov 23 19:03:26 CET 2022


Hi,

FYI, your patch has been queued to stable release 21.11.3

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/28/22. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e. not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://github.com/kevintraynor/dpdk-stable

This queued commit can be viewed at:
https://github.com/kevintraynor/dpdk-stable/commit/bbddde24b26ec976c7249f5c7dc81eb4fd567116

Thanks.

Kevin

---
From bbddde24b26ec976c7249f5c7dc81eb4fd567116 Mon Sep 17 00:00:00 2001
From: Gregory Etelson <getelson at nvidia.com>
Date: Thu, 3 Nov 2022 12:44:27 +0200
Subject: [PATCH] common/mlx5: fix shared mempool subscription

[ upstream commit aeca11f82a2f13c19e320dd337f4aa9627545c99 ]

The MLX5 PMD counted each mempool subscribe invocation and expected
the mempool subscription to be deleted once that counter dropped
to 0. However, the current PMD design unsubscribes mempool callbacks
only once.
As a result, the PMD destroyed mlx5_common_device but kept the
shared RX subscription callback. EAL tried to activate that callback
and crashed.

The patch removes the mempool subscription counter.
The PMD registers the mempool subscription only once. An attempt
to register an existing subscription returns EEXIST.
Also, the PMD expects the subscription to be removed when mempool
unsubscribe is activated.

Fixes: 8ad97e4b3215 ("common/mlx5: fix multi-process mempool registration")

Signed-off-by: Gregory Etelson <getelson at nvidia.com>
Acked-by: Matan Azrad <matan at nvidia.com>
---
 drivers/common/mlx5/mlx5_common.c    | 22 +++++++++++-----------
 drivers/common/mlx5/mlx5_common_mr.c |  1 -
 drivers/common/mlx5/mlx5_common_mr.h |  1 -
 3 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common.c b/drivers/common/mlx5/mlx5_common.c
index 2634661fd3..f355b3d741 100644
--- a/drivers/common/mlx5/mlx5_common.c
+++ b/drivers/common/mlx5/mlx5_common.c
@@ -410,4 +410,9 @@ mlx5_dev_mempool_event_cb(enum rte_mempool_event event, struct rte_mempool *mp,
 }
 
+/**
+ * Primary and secondary processes share the `cdev` pointer.
+ * Callbacks addresses are local in each process.
+ * Therefore, each process can register private callbacks.
+ */
 int
 mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
@@ -421,12 +426,11 @@ mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
 	ret = rte_mempool_event_callback_register(mlx5_dev_mempool_event_cb,
 						  cdev);
-	if (ret != 0 && rte_errno != EEXIST)
-		goto exit;
-	__atomic_add_fetch(&cdev->mr_scache.mempool_cb_reg_n, 1,
-			   __ATOMIC_ACQUIRE);
 	/* Register mempools only once for this device. */
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+	if (ret == 0 && rte_eal_process_type() == RTE_PROC_PRIMARY) {
 		rte_mempool_walk(mlx5_dev_mempool_register_cb, cdev);
-	ret = 0;
+		goto exit;
+	}
+	if (ret != 0 && rte_errno == EEXIST)
+		ret = 0;
 exit:
 	rte_rwlock_write_unlock(&cdev->mr_scache.mprwlock);
@@ -437,13 +441,9 @@ static void
 mlx5_dev_mempool_unsubscribe(struct mlx5_common_device *cdev)
 {
-	uint32_t mempool_cb_reg_n;
 	int ret;
 
+	MLX5_ASSERT(cdev->dev != NULL);
 	if (!cdev->config.mr_mempool_reg_en)
 		return;
-	mempool_cb_reg_n = __atomic_sub_fetch(&cdev->mr_scache.mempool_cb_reg_n,
-					      1, __ATOMIC_RELEASE);
-	if (mempool_cb_reg_n > 0)
-		return;
 	/* Stop watching for mempool events and unregister all mempools. */
 	ret = rte_mempool_event_callback_unregister(mlx5_dev_mempool_event_cb,
diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 6899ba8e1a..7f56e1f973 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -1140,5 +1140,4 @@ mlx5_mr_create_cache(struct mlx5_mr_share_cache *share_cache, int socket)
 	rte_rwlock_init(&share_cache->rwlock);
 	rte_rwlock_init(&share_cache->mprwlock);
-	share_cache->mempool_cb_reg_n = 0;
 	/* Initialize B-tree and allocate memory for global MR cache table. */
 	return mlx5_mr_btree_init(&share_cache->cache,
diff --git a/drivers/common/mlx5/mlx5_common_mr.h b/drivers/common/mlx5/mlx5_common_mr.h
index f774ccbf33..13eb350980 100644
--- a/drivers/common/mlx5/mlx5_common_mr.h
+++ b/drivers/common/mlx5/mlx5_common_mr.h
@@ -82,5 +82,4 @@ struct mlx5_mr_share_cache {
 	rte_rwlock_t rwlock; /* MR cache Lock. */
 	rte_rwlock_t mprwlock; /* Mempool Registration Lock. */
-	uint32_t mempool_cb_reg_n; /* Mempool event callback registrants. */
 	struct mlx5_mr_btree cache; /* Global MR cache table. */
 	struct mlx5_mr_list mr_list; /* Registered MR list. */
-- 
2.38.1
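
The register-once behaviour described in the commit message can be
illustrated with a minimal sketch. It assumes a simplified callback list
in place of the EAL one: every name below (cb_register, cb_unregister,
dev_subscribe, dev_unsubscribe, cb_list) is a hypothetical stand-in, not
DPDK API, and plain errno stands in for rte_errno. It is not the PMD code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the EAL mempool event callback list. */
#define MAX_CB 8

typedef void (*event_cb_t)(void *arg);

struct cb_entry {
	event_cb_t cb;
	void *arg;
};

static struct cb_entry cb_list[MAX_CB];
static int cb_count;

/*
 * Add (cb, arg) to the list.
 * Returns 0 on success, or -1 with errno set to EEXIST when the same
 * pair is already registered, or to ENOMEM when the list is full.
 */
static int
cb_register(event_cb_t cb, void *arg)
{
	int i;

	assert(cb != NULL);
	for (i = 0; i < cb_count; i++) {
		if (cb_list[i].cb == cb && cb_list[i].arg == arg) {
			errno = EEXIST;
			return -1;
		}
	}
	if (cb_count == MAX_CB) {
		errno = ENOMEM;
		return -1;
	}
	cb_list[cb_count].cb = cb;
	cb_list[cb_count].arg = arg;
	cb_count++;
	return 0;
}

/* Remove (cb, arg); returns -1 with errno = ENOENT when it is absent. */
static int
cb_unregister(event_cb_t cb, void *arg)
{
	int i;

	for (i = 0; i < cb_count; i++) {
		if (cb_list[i].cb == cb && cb_list[i].arg == arg) {
			/* Order is irrelevant here: move the last entry down. */
			cb_list[i] = cb_list[--cb_count];
			return 0;
		}
	}
	errno = ENOENT;
	return -1;
}

/* Placeholder event handler; a real one would react to the event. */
static void
dev_event_cb(void *arg)
{
	(void)arg;
}

/* Subscribe without a counter: a duplicate registration is a success. */
static int
dev_subscribe(void *dev)
{
	int ret = cb_register(dev_event_cb, dev);

	if (ret != 0 && errno == EEXIST)
		ret = 0;
	return ret;
}

/* A single unsubscribe removes the one and only subscription. */
static int
dev_unsubscribe(void *dev)
{
	return cb_unregister(dev_event_cb, dev);
}
```

In this model, calling dev_subscribe() twice for the same device leaves
one entry in the list, so a single dev_unsubscribe() removes it and no
callback is left pointing at a destroyed device.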

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2022-11-23 09:55:57.413452366 +0000
+++ 0013-common-mlx5-fix-shared-mempool-subscription.patch	2022-11-23 09:55:57.020149183 +0000
@@ -1 +1 @@
-From aeca11f82a2f13c19e320dd337f4aa9627545c99 Mon Sep 17 00:00:00 2001
+From bbddde24b26ec976c7249f5c7dc81eb4fd567116 Mon Sep 17 00:00:00 2001
@@ -5,0 +6,2 @@
+[ upstream commit aeca11f82a2f13c19e320dd337f4aa9627545c99 ]
+
@@ -21 +22,0 @@
-Cc: stable at dpdk.org
@@ -32 +33 @@
-index bf22c0694d..0ad14a48c7 100644
+index 2634661fd3..f355b3d741 100644
@@ -35 +36 @@
-@@ -578,4 +578,9 @@ mlx5_dev_mempool_event_cb(enum rte_mempool_event event, struct rte_mempool *mp,
+@@ -410,4 +410,9 @@ mlx5_dev_mempool_event_cb(enum rte_mempool_event event, struct rte_mempool *mp,
@@ -45 +46 @@
-@@ -589,12 +594,11 @@ mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
+@@ -421,12 +426,11 @@ mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
@@ -63 +64 @@
-@@ -605,13 +609,9 @@ static void
+@@ -437,13 +441,9 @@ static void
@@ -79 +80 @@
-index 1d54102b54..0e1d2434ab 100644
+index 6899ba8e1a..7f56e1f973 100644
@@ -82 +83 @@
-@@ -1139,5 +1139,4 @@ mlx5_mr_create_cache(struct mlx5_mr_share_cache *share_cache, int socket)
+@@ -1140,5 +1140,4 @@ mlx5_mr_create_cache(struct mlx5_mr_share_cache *share_cache, int socket)


