[dpdk-dev,v2] net/mlx4: workaround to verbs wrong error return

Message ID 1501700450-11847-1-git-send-email-matan@mellanox.com (mailing list archive)
State Accepted, archived
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Matan Azrad Aug. 2, 2017, 7 p.m. UTC
The current mlx4 OFED version has a bug: after the device is
plugged out, the ibv destroy functions return an error even
though the resources were destroyed correctly.

As a result, the failsafe PMD aborts, in debug mode only, when
it tries to remove the device during the plug-out process.

This workaround adds an option that replaces all claim_zero
assertions with debugging messages. Note that the option also
affects assertions unrelated to ibv destroy calls.

The DPDK 18.02 release should work with Mellanox OFED-4.2, which
will include the verbs fix for this bug; this patch can be
removed then.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 config/common_base        | 1 +
 doc/guides/nics/mlx4.rst  | 8 ++++++++
 drivers/net/mlx4/Makefile | 4 ++++
 drivers/net/mlx4/mlx4.h   | 6 ++++++
 4 files changed, 19 insertions(+)

This v2 is not perfect, but it is satisfactory.
  

Comments

Thomas Monjalon Aug. 3, 2017, 9:04 p.m. UTC | #1
02/08/2017 21:00, Matan Azrad:
> The current mlx4 OFED version has a bug: after the device is
> plugged out, the ibv destroy functions return an error even
> though the resources were destroyed correctly.
>
> As a result, the failsafe PMD aborts, in debug mode only, when
> it tries to remove the device during the plug-out process.
>
> This workaround adds an option that replaces all claim_zero
> assertions with debugging messages. Note that the option also
> affects assertions unrelated to ibv destroy calls.
>
> The DPDK 18.02 release should work with Mellanox OFED-4.2, which
> will include the verbs fix for this bug; this patch can be
> removed then.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>

Applied, thanks
  

Patch

diff --git a/config/common_base b/config/common_base
index 7805605..5e97a08 100644
--- a/config/common_base
+++ b/config/common_base
@@ -213,6 +213,7 @@  CONFIG_RTE_LIBRTE_FM10K_INC_VECTOR=y
 #
 CONFIG_RTE_LIBRTE_MLX4_PMD=n
 CONFIG_RTE_LIBRTE_MLX4_DEBUG=n
+CONFIG_RTE_LIBRTE_MLX4_DEBUG_BROKEN_VERBS=n
 CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N=4
 CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE=0
 CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst
index d5bf2b3..f8885b2 100644
--- a/doc/guides/nics/mlx4.rst
+++ b/doc/guides/nics/mlx4.rst
@@ -119,6 +119,14 @@  These options can be modified in the ``.config`` file.
   adds additional run-time checks and debugging messages at the cost of
   lower performance.
 
+- ``CONFIG_RTE_LIBRTE_MLX4_DEBUG_BROKEN_VERBS`` (default **n**)
+
+  Mellanox OFED versions earlier than 4.2 may return false errors from
+  Verbs object destruction APIs after the device is plugged out.
+  Enabling this option replaces assertion checks that cause the program
+  to abort with harmless debugging messages as a workaround.
+  Relevant only when CONFIG_RTE_LIBRTE_MLX4_DEBUG is enabled.
+
 - ``CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N`` (default **4**)
 
   Number of scatter/gather elements (SGEs) per work request (WR). Lowering
diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index 755c8a4..c045bd7 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -84,6 +84,10 @@  ifdef CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS
 CFLAGS += -DMLX4_PMD_SOFT_COUNTERS=$(CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS)
 endif
 
+ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DEBUG_BROKEN_VERBS),y)
+CFLAGS += -DMLX4_PMD_DEBUG_BROKEN_VERBS
+endif
+
 include $(RTE_SDK)/mk/rte.lib.mk
 
 # Generate and clean-up mlx4_autoconf.h.
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index a2e0ae7..c0ade4f 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -182,7 +182,13 @@  enum {
 		(DEBUG__(__VA_ARGS__), 0)	\
 	})[0])
 #define DEBUG(...) DEBUG_(__VA_ARGS__, '\n')
+#ifndef MLX4_PMD_DEBUG_BROKEN_VERBS
 #define claim_zero(...) assert((__VA_ARGS__) == 0)
+#else /* MLX4_PMD_DEBUG_BROKEN_VERBS */
+#define claim_zero(...) \
+	(void)(((__VA_ARGS__) == 0) || \
+		DEBUG("Assertion `(" # __VA_ARGS__ ") == 0' failed (IGNORED)."))
+#endif /* MLX4_PMD_DEBUG_BROKEN_VERBS */
 #define claim_nonzero(...) assert((__VA_ARGS__) != 0)
 #define claim_positive(...) assert((__VA_ARGS__) >= 0)
 #else /* NDEBUG */
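
The effect of the alternate claim_zero() definition can be tried outside the driver. The following is a minimal standalone sketch, not the driver code: the DEBUG() macro here is a simplified stand-in for the one in mlx4.h, and fake_destroy(), run_demo() and ignored_failures are hypothetical names introduced only for illustration. MLX4_PMD_DEBUG_BROKEN_VERBS is defined directly in the file, whereas the patch sets it through CFLAGS.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: defined inline to select the workaround branch.
 * In the patch this comes from the Makefile via CFLAGS. */
#define MLX4_PMD_DEBUG_BROKEN_VERBS

/* Hypothetical counter, used only so the demo can report what happened. */
static int ignored_failures;

/* Simplified stand-in for the driver's DEBUG(): logs the message and
 * evaluates to an int so it can appear on the right-hand side of ||. */
#define DEBUG(msg) (fprintf(stderr, "%s\n", (msg)), ++ignored_failures)

#ifndef MLX4_PMD_DEBUG_BROKEN_VERBS
#define claim_zero(...) assert((__VA_ARGS__) == 0)
#else /* MLX4_PMD_DEBUG_BROKEN_VERBS */
/* Same shape as the patch: a nonzero result is logged, not asserted. */
#define claim_zero(...) \
	(void)(((__VA_ARGS__) == 0) || \
		DEBUG("Assertion `(" #__VA_ARGS__ ") == 0' failed (IGNORED)."))
#endif /* MLX4_PMD_DEBUG_BROKEN_VERBS */

/* Hypothetical destroy call mimicking the false error seen after plug-out. */
static int fake_destroy(int device_unplugged)
{
	return device_unplugged ? -1 : 0;
}

/* Returns the number of failed checks that were logged and ignored. */
static int run_demo(void)
{
	ignored_failures = 0;
	claim_zero(fake_destroy(0)); /* returns 0: nothing is logged */
	claim_zero(fake_destroy(1)); /* returns -1: logged, no abort */
	return ignored_failures;
}
```

Calling run_demo() from a test program logs one message for the second call and continues. Removing the #define of MLX4_PMD_DEBUG_BROKEN_VERBS restores the assert() form, so the same call would abort in a build without NDEBUG.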