patch 'eal: fix data race in multi-process support' has been queued to stable release 20.11.7

luca.boccassi at gmail.com
Thu Nov 3 10:27:56 CET 2022


Hi,

FYI, your patch has been queued to stable release 20.11.7

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 11/05/22. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e. not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://github.com/kevintraynor/dpdk-stable

This queued commit can be viewed at:
https://github.com/kevintraynor/dpdk-stable/commit/4226f0bc512acc8f9e718c8f4599e2096febe306

Thanks.

Luca Boccassi

---
From 4226f0bc512acc8f9e718c8f4599e2096febe306 Mon Sep 17 00:00:00 2001
From: Stephen Hemminger <stephen at networkplumber.org>
Date: Tue, 6 Sep 2022 09:45:22 -0700
Subject: [PATCH] eal: fix data race in multi-process support

[ upstream commit 668958f3c1617f18e04ffee099656e7fb2effa94 ]

If DPDK is built with thread sanitizer it reports a race
in setting of multiprocess file descriptor. The fix is to
use atomic operations when updating mp_fd.

Build:
$ meson -Db_sanitize=thread build
$ ninja -C build

Simple example:
$ ./build/app/dpdk-testpmd -l 1-3 --no-huge
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
testpmd: No probed ethernet devices
testpmd: create a new mbuf pool <mb_pool_0>: n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: Error - exiting with code: 1
  Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
==================
WARNING: ThreadSanitizer: data race (pid=87245)
  Write of size 4 at 0x558e04d8ff70 by main thread:
    #0 rte_mp_channel_cleanup <null> (dpdk-testpmd+0x1e7d30c)
    #1 rte_eal_cleanup <null> (dpdk-testpmd+0x1e85929)
    #2 rte_exit <null> (dpdk-testpmd+0x1e5bc0a)
    #3 mbuf_pool_create.cold <null> (dpdk-testpmd+0x274011)
    #4 main <null> (dpdk-testpmd+0x5cc15d)

  Previous read of size 4 at 0x558e04d8ff70 by thread T2:
    #0 mp_handle <null> (dpdk-testpmd+0x1e7c439)
    #1 ctrl_thread_init <null> (dpdk-testpmd+0x1e6ee1e)

  As if synchronized via sleep:
    #0 nanosleep libsanitizer/tsan/tsan_interceptors_posix.cpp:366
    #1 get_tsc_freq <null> (dpdk-testpmd+0x1e92ff9)
    #2 set_tsc_freq <null> (dpdk-testpmd+0x1e6f2fc)
    #3 rte_eal_timer_init <null> (dpdk-testpmd+0x1e931a4)
    #4 rte_eal_init.cold <null> (dpdk-testpmd+0x29e578)
    #5 main <null> (dpdk-testpmd+0x5cbc45)

  Location is global 'mp_fd' of size 4 at 0x558e04d8ff70 (dpdk-testpmd+0x000003122f70)

  Thread T2 'rte_mp_handle' (tid=87248, running) created by main thread at:
    #0 pthread_create libsanitizer/tsan/tsan_interceptors_posix.cpp:969
    #1 rte_ctrl_thread_create <null> (dpdk-testpmd+0x1e6efd0)
    #2 rte_mp_channel_init.cold <null> (dpdk-testpmd+0x29cb7c)
    #3 rte_eal_init <null> (dpdk-testpmd+0x1e8662e)
    #4 main <null> (dpdk-testpmd+0x5cbc45)

SUMMARY: ThreadSanitizer: data race (app/dpdk-testpmd+0x1e7d30c) in rte_mp_channel_cleanup
==================
ThreadSanitizer: reported 1 warnings

Fixes: bacaa2754017 ("eal: add channel for multi-process communication")

Signed-off-by: Stephen Hemminger <stephen at networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov at intel.com>
Reviewed-by: Chengwen Feng <fengchengwen at huawei.com>
---
 lib/librte_eal/common/eal_common_proc.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
index b33d58ea0a..50f0668148 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -262,7 +262,7 @@ rte_mp_action_unregister(const char *name)
 }
 
 static int
-read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
+read_msg(int fd, struct mp_msg_internal *m, struct sockaddr_un *s)
 {
 	int msglen;
 	struct iovec iov;
@@ -283,7 +283,7 @@ read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
 	msgh.msg_controllen = sizeof(control);
 
 retry:
-	msglen = recvmsg(mp_fd, &msgh, 0);
+	msglen = recvmsg(fd, &msgh, 0);
 
 	/* zero length message means socket was closed */
 	if (msglen == 0)
@@ -392,11 +392,12 @@ mp_handle(void *arg __rte_unused)
 {
 	struct mp_msg_internal msg;
 	struct sockaddr_un sa;
+	int fd;
 
-	while (mp_fd >= 0) {
+	while ((fd = __atomic_load_n(&mp_fd, __ATOMIC_RELAXED)) >= 0) {
 		int ret;
 
-		ret = read_msg(&msg, &sa);
+		ret = read_msg(fd, &msg, &sa);
 		if (ret <= 0)
 			break;
 
@@ -640,9 +641,8 @@ rte_mp_channel_init(void)
 			NULL, mp_handle, NULL) < 0) {
 		RTE_LOG(ERR, EAL, "failed to create mp thread: %s\n",
 			strerror(errno));
-		close(mp_fd);
 		close(dir_fd);
-		mp_fd = -1;
+		close(__atomic_exchange_n(&mp_fd, -1, __ATOMIC_RELAXED));
 		return -1;
 	}
 
@@ -658,11 +658,10 @@ rte_mp_channel_cleanup(void)
 {
 	int fd;
 
-	if (mp_fd < 0)
+	fd = __atomic_exchange_n(&mp_fd, -1, __ATOMIC_RELAXED);
+	if (fd < 0)
 		return;
 
-	fd = mp_fd;
-	mp_fd = -1;
 	pthread_cancel(mp_handle_tid);
 	pthread_join(mp_handle_tid, NULL);
 	close_socket_fd(fd);
-- 
2.34.1
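
For readers who want to see the pattern in isolation, here is a minimal
sketch of what the patch does, outside of DPDK. It is not the EAL code:
demo_fd, demo_channel_open(), demo_channel_snapshot(), demo_reader() and
demo_channel_close() are hypothetical names made up for illustration, and a
plain file descriptor (a pipe, for instance) stands in for the multi-process
socket. It assumes GCC or Clang, since it uses the same __atomic builtins as
the patch.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical stand-in for the EAL's mp_fd; -1 means "channel closed". */
static int demo_fd = -1;

/* Publish a freshly opened descriptor for the reader to pick up. */
static void demo_channel_open(int fd)
{
	assert(fd >= 0);
	__atomic_store_n(&demo_fd, fd, __ATOMIC_RELAXED);
}

/* Take one atomic snapshot of the descriptor. */
static int demo_channel_snapshot(void)
{
	return __atomic_load_n(&demo_fd, __ATOMIC_RELAXED);
}

/*
 * Reader loop in the style of mp_handle(): load the descriptor once per
 * iteration and use only that local copy, so the shared variable is never
 * read non-atomically while another thread may be resetting it.
 */
static void *demo_reader(void *arg)
{
	char buf[16];
	int fd;

	(void)arg;
	while ((fd = demo_channel_snapshot()) >= 0) {
		/* 0 means end of file, negative means error: stop either way. */
		if (read(fd, buf, sizeof(buf)) <= 0)
			break;
	}
	return NULL;
}

/*
 * Cleanup in the style of rte_mp_channel_cleanup(): swap in -1 and keep
 * the previous value in a single atomic step, so only one caller ever
 * sees a valid descriptor and closes it.
 * Returns 0 if this call closed the descriptor, -1 if it was already closed.
 */
static int demo_channel_close(void)
{
	int fd = __atomic_exchange_n(&demo_fd, -1, __ATOMIC_RELAXED);

	if (fd < 0)
		return -1;
	close(fd);
	return 0;
}
```

In a threaded program demo_reader() would be the start routine passed to
pthread_create(). Relaxed ordering is used here only because the patch uses
it; the sketch does not try to show anything about ordering with respect to
other memory.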

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2022-11-03 09:27:31.478944906 +0000
+++ 0098-eal-fix-data-race-in-multi-process-support.patch	2022-11-03 09:27:25.569426235 +0000
@@ -1 +1 @@
-From 668958f3c1617f18e04ffee099656e7fb2effa94 Mon Sep 17 00:00:00 2001
+From 4226f0bc512acc8f9e718c8f4599e2096febe306 Mon Sep 17 00:00:00 2001
@@ -5,0 +6,2 @@
+[ upstream commit 668958f3c1617f18e04ffee099656e7fb2effa94 ]
+
@@ -62 +63,0 @@
-Cc: stable at dpdk.org
@@ -68 +69 @@
- lib/eal/common/eal_common_proc.c | 17 ++++++++---------
+ lib/librte_eal/common/eal_common_proc.c | 17 ++++++++---------
@@ -71,5 +72,5 @@
-diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
-index 313060528f..1fc1d6c53b 100644
---- a/lib/eal/common/eal_common_proc.c
-+++ b/lib/eal/common/eal_common_proc.c
-@@ -260,7 +260,7 @@ rte_mp_action_unregister(const char *name)
+diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
+index b33d58ea0a..50f0668148 100644
+--- a/lib/librte_eal/common/eal_common_proc.c
++++ b/lib/librte_eal/common/eal_common_proc.c
+@@ -262,7 +262,7 @@ rte_mp_action_unregister(const char *name)
@@ -84 +85 @@
-@@ -281,7 +281,7 @@ read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
+@@ -283,7 +283,7 @@ read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
@@ -93 +94 @@
-@@ -390,11 +390,12 @@ mp_handle(void *arg __rte_unused)
+@@ -392,11 +392,12 @@ mp_handle(void *arg __rte_unused)
@@ -108 +109 @@
-@@ -638,9 +639,8 @@ rte_mp_channel_init(void)
+@@ -640,9 +641,8 @@ rte_mp_channel_init(void)
@@ -119 +120 @@
-@@ -656,11 +656,10 @@ rte_mp_channel_cleanup(void)
+@@ -658,11 +658,10 @@ rte_mp_channel_cleanup(void)

