[PATCH] common/mlx5: fix shared mempool subscription

Gregory Etelson getelson at nvidia.com
Thu Nov 3 11:44:27 CET 2022


The MLX5 PMD counted each mempool subscribe invocation and expected the
mempool subscription to be deleted once that counter dropped to 0.
However, the current PMD design unsubscribes the mempool callbacks only
once. As a result, the PMD destroyed the mlx5_common_device but kept the
shared RX subscription callback. When EAL later tried to invoke that
callback, it crashed.

This patch removes the mempool subscription counter.
The PMD now registers the mempool subscription only once; an attempt to
register an existing subscription returns EEXIST.
The PMD also removes the subscription as soon as the mempool unsubscribe
routine is invoked.

Fixes: 8ad97e4b3215 ("common/mlx5: fix multi-process mempool registration")

Cc: stable at dpdk.org

Signed-off-by: Gregory Etelson <getelson at nvidia.com>
Acked-by: Matan Azrad <matan at nvidia.com>
---
 drivers/common/mlx5/mlx5_common.c    | 22 +++++++++++-----------
 drivers/common/mlx5/mlx5_common_mr.c |  1 -
 drivers/common/mlx5/mlx5_common_mr.h |  1 -
 3 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common.c b/drivers/common/mlx5/mlx5_common.c
index bf22c0694d..0ad14a48c7 100644
--- a/drivers/common/mlx5/mlx5_common.c
+++ b/drivers/common/mlx5/mlx5_common.c
@@ -577,6 +577,11 @@ mlx5_dev_mempool_event_cb(enum rte_mempool_event event, struct rte_mempool *mp,
 	}
 }
 
+/**
+ * Primary and secondary processes share the `cdev` pointer.
+ * Callback addresses are local in each process.
+ * Therefore, each process can register private callbacks.
+ */
 int
 mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
 {
@@ -588,14 +593,13 @@ mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
 	/* Callback for this device may be already registered. */
 	ret = rte_mempool_event_callback_register(mlx5_dev_mempool_event_cb,
 						  cdev);
-	if (ret != 0 && rte_errno != EEXIST)
-		goto exit;
-	__atomic_add_fetch(&cdev->mr_scache.mempool_cb_reg_n, 1,
-			   __ATOMIC_ACQUIRE);
 	/* Register mempools only once for this device. */
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+	if (ret == 0 && rte_eal_process_type() == RTE_PROC_PRIMARY) {
 		rte_mempool_walk(mlx5_dev_mempool_register_cb, cdev);
-	ret = 0;
+		goto exit;
+	}
+	if (ret != 0 && rte_errno == EEXIST)
+		ret = 0;
 exit:
 	rte_rwlock_write_unlock(&cdev->mr_scache.mprwlock);
 	return ret;
@@ -604,15 +608,11 @@ mlx5_dev_mempool_subscribe(struct mlx5_common_device *cdev)
 static void
 mlx5_dev_mempool_unsubscribe(struct mlx5_common_device *cdev)
 {
-	uint32_t mempool_cb_reg_n;
 	int ret;
 
+	MLX5_ASSERT(cdev->dev != NULL);
 	if (!cdev->config.mr_mempool_reg_en)
 		return;
-	mempool_cb_reg_n = __atomic_sub_fetch(&cdev->mr_scache.mempool_cb_reg_n,
-					      1, __ATOMIC_RELEASE);
-	if (mempool_cb_reg_n > 0)
-		return;
 	/* Stop watching for mempool events and unregister all mempools. */
 	ret = rte_mempool_event_callback_unregister(mlx5_dev_mempool_event_cb,
 						    cdev);
diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 1d54102b54..0e1d2434ab 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -1138,7 +1138,6 @@ mlx5_mr_create_cache(struct mlx5_mr_share_cache *share_cache, int socket)
 			      &share_cache->dereg_mr_cb);
 	rte_rwlock_init(&share_cache->rwlock);
 	rte_rwlock_init(&share_cache->mprwlock);
-	share_cache->mempool_cb_reg_n = 0;
 	/* Initialize B-tree and allocate memory for global MR cache table. */
 	return mlx5_mr_btree_init(&share_cache->cache,
 				  MLX5_MR_BTREE_CACHE_N * 2, socket);
diff --git a/drivers/common/mlx5/mlx5_common_mr.h b/drivers/common/mlx5/mlx5_common_mr.h
index f774ccbf33..13eb350980 100644
--- a/drivers/common/mlx5/mlx5_common_mr.h
+++ b/drivers/common/mlx5/mlx5_common_mr.h
@@ -81,7 +81,6 @@ struct mlx5_mr_share_cache {
 	uint32_t dev_gen; /* Generation number to flush local caches. */
 	rte_rwlock_t rwlock; /* MR cache Lock. */
 	rte_rwlock_t mprwlock; /* Mempool Registration Lock. */
-	uint32_t mempool_cb_reg_n; /* Mempool event callback registrants. */
 	struct mlx5_mr_btree cache; /* Global MR cache table. */
 	struct mlx5_mr_list mr_list; /* Registered MR list. */
 	struct mlx5_mr_list mr_free_list; /* Freed MR list. */
-- 
2.34.1
