1178 – Deadlock with heap lock

Bug 1178 - Deadlock with heap lock

Summary: Deadlock with heap lock

Status:	UNCONFIRMED

Alias:	None

Product:	DPDK
Classification:	Unclassified
Component:	core (show other bugs)
Version:	unspecified
Hardware:	All All

Importance:	Normal normal
Target Milestone:	---
Assignee:	Anatoly Burakov

URL:

Depends on:
Blocks:

Reported:	2023-03-10 18:31 CET by Maxime Coquelin
Modified:	2023-03-13 13:41 CET (History)
CC List:	1 user (show)

Attachments
Add an attachment (proposed patch, testcase, etc.)

Description Maxime Coquelin 2023-03-10 18:31:54 CET

While doing test using testpmd with virtio-user PMD, I got a deadlock I managed to reproduce systematically.
It looks like my system was under hugepages pressure.
After a reboot, I cannot reproduce the deadlock anymore.

This is the backtrace I collected before having to reboot:

(gdb) bt
#0  0x0000000000b80e87 in rte_spinlock_lock (sl=0x100000100) at ../lib/eal/x86/include/rte_spinlock.h:29
#1  0x0000000000b827fb in malloc_heap_free (elem=0x1003a8b00) at ../lib/eal/common/malloc_heap.c:878
#2  0x0000000000b83ba5 in mem_free (addr=0x1003a8b80, trace_ena=true) at ../lib/eal/common/rte_malloc.c:37
#3  0x0000000000b83be9 in rte_free (addr=0x1003a8b80) at ../lib/eal/common/rte_malloc.c:44
#4  0x0000000003a2ce3d in virtio_user_free_vrings (dev=0x1003a8d00) at ../drivers/net/virtio/virtio_user/virtio_user_dev.c:652
#5  0x0000000003a2d5ae in virtio_user_dev_uninit (dev=0x1003a8d00) at ../drivers/net/virtio/virtio_user/virtio_user_dev.c:823
#6  0x0000000003a11a50 in virtio_user_dev_close (hw=0x1003a8d00) at ../drivers/net/virtio/virtio_user_ethdev.c:284
#7  0x00000000039dcbaf in virtio_dev_close (dev=0x98c0580 <rte_eth_devices>) at ../drivers/net/virtio/virtio_ethdev.c:382
#8  0x0000000000aecd0a in rte_eth_dev_close (port_id=0) at ../lib/ethdev/rte_ethdev.c:1720
#9  0x0000000003a129c6 in virtio_user_pmd_remove (vdev=0xa008b30) at ../drivers/net/virtio/virtio_user_ethdev.c:715
#10 0x0000000000c3fefd in vdev_cleanup () at ../drivers/bus/vdev/vdev.c:588
#11 0x0000000000b667eb in eal_bus_cleanup () at ../lib/eal/common/eal_common_bus.c:104
#12 0x0000000000b97431 in rte_eal_cleanup () at ../lib/eal/linux/eal.c:1374
#13 0x0000000000b67557 in rte_exit (exit_code=1, format=0x8b4f3f8 "rxq %d invalid - must be >= 0 && <= %u\n")
    at ../lib/eal/common/eal_common_debug.c:42
#14 0x000000000064822a in launch_args_parse (argc=4, argv=0x7fff45669a60) at ../app/test-pmd/parameters.c:1053
#15 0x000000000065322a in main (argc=4, argv=0x7fff45669a60) at ../app/test-pmd/testpmd.c:4564
(gdb) info threads
  Id   Target Id                                            Frame
* 1    Thread 0x7f7eb7841c00 (LWP 386037) "dpdk-testpmd"    0x0000000000b80e87 in rte_spinlock_lock (sl=0x100000100)
    at ../lib/eal/x86/include/rte_spinlock.h:29
  2    Thread 0x7f7eb6fff400 (LWP 386038) "eal-intr-thread" 0x00007f7eb7111c0e in epoll_wait (epfd=6, events=0x7f7eb6ffc440, maxevents=1,
    timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  3    Thread 0x7f7eb5ffd400 (LWP 386040) "rte-worker-7"    __GI___libc_read (nbytes=1, buf=0x7f7eb5ffa333, fd=10)
    at ../sysdeps/unix/sysv/linux/read.c:26
  4    Thread 0x7f7eb57fc400 (LWP 386041) "telemetry-v2"    0x00007f7eb711323f in __libc_accept (fd=25, addr=..., len=0x0)
    at ../sysdeps/unix/sysv/linux/accept.c:26
(gdb) t 2
[Switching to thread 2 (Thread 0x7f7eb6fff400 (LWP 386038))]
#0  0x00007f7eb7111c0e in epoll_wait (epfd=6, events=0x7f7eb6ffc440, maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
Downloading 0.00 MB source file /usr/src/debug/glibc-2.35-22.fc36.x86_64/misc/../sysdeps/unix/sysv/linux/epoll_wait.c
30	  return SYSCALL_CANCEL (epoll_wait, epfd, events, maxevents, timeout);
(gdb) bt
#0  0x00007f7eb7111c0e in epoll_wait (epfd=6, events=0x7f7eb6ffc440, maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x0000000000b9ce94 in eal_intr_handle_interrupts (pfd=6, totalfds=1) at ../lib/eal/linux/eal_interrupts.c:1077
#2  0x0000000000b9d0c7 in eal_intr_thread_main (arg=0x0) at ../lib/eal/linux/eal_interrupts.c:1163
#3  0x0000000000b7d6a6 in ctrl_thread_start (arg=0xa024ec0) at ../lib/eal/common/eal_common_thread.c:276
#4  0x00007f7eb708cdcd in start_thread (arg=<optimized out>) at pthread_create.c:442
#5  0x00007f7eb7112630 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) t 3
[Switching to thread 3 (Thread 0x7f7eb5ffd400 (LWP 386040))]
#0  __GI___libc_read (nbytes=1, buf=0x7f7eb5ffa333, fd=10) at ../sysdeps/unix/sysv/linux/read.c:26
Downloading 0.00 MB source file /usr/src/debug/glibc-2.35-22.fc36.x86_64/io/../sysdeps/unix/sysv/linux/read.c
26	  return SYSCALL_CANCEL (read, fd, buf, nbytes);
(gdb) bt
#0  __GI___libc_read (nbytes=1, buf=0x7f7eb5ffa333, fd=10) at ../sysdeps/unix/sysv/linux/read.c:26
#1  __GI___libc_read (fd=10, buf=0x7f7eb5ffa333, nbytes=1) at ../sysdeps/unix/sysv/linux/read.c:24
#2  0x0000000000b94220 in eal_thread_wait_command () at ../lib/eal/unix/eal_unix_thread.c:44
#3  0x0000000000b7d497 in eal_thread_loop (arg=0x7) at ../lib/eal/common/eal_common_thread.c:189
#4  0x0000000000b96580 in eal_worker_thread_loop (arg=0x7) at ../lib/eal/linux/eal.c:915
#5  0x00007f7eb708cdcd in start_thread (arg=<optimized out>) at pthread_create.c:442
#6  0x00007f7eb7112630 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) t 4
[Switching to thread 4 (Thread 0x7f7eb57fc400 (LWP 386041))]
#0  0x00007f7eb711323f in __libc_accept (fd=25, addr=..., len=0x0) at ../sysdeps/unix/sysv/linux/accept.c:26
Downloading 0.00 MB source file /usr/src/debug/glibc-2.35-22.fc36.x86_64/socket/../sysdeps/unix/sysv/linux/accept.c
26	  return SYSCALL_CANCEL (accept, fd, addr.__sockaddr__, len);
(gdb) bt
#0  0x00007f7eb711323f in __libc_accept (fd=25, addr=..., len=0x0) at ../sysdeps/unix/sysv/linux/accept.c:26
#1  0x0000000000bac798 in socket_listener (socket=0x99f3220 <v2_socket>) at ../lib/telemetry/telemetry.c:405
#2  0x00007f7eb708cdcd in start_thread (arg=<optimized out>) at pthread_create.c:442
#3  0x00007f7eb7112630 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

So the deadlock happens on heap->lock.
Looking at the usage of this lock I found something... interesting:

int
rte_malloc_heap_destroy(const char *heap_name)
{
	struct malloc_heap *heap = NULL;
	int ret;

	if (heap_name == NULL ||
			strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) == 0 ||
			strnlen(heap_name, RTE_HEAP_NAME_MAX_LEN) ==
				RTE_HEAP_NAME_MAX_LEN) {
		rte_errno = EINVAL;
		return -1;
	}
	rte_mcfg_mem_write_lock();

	/* start from non-socket heaps */
	heap = find_named_heap(heap_name);
	if (heap == NULL) {
		RTE_LOG(ERR, EAL, "Heap %s not found\n", heap_name);
		rte_errno = ENOENT;
		ret = -1;
		goto unlock;
	}
	/* we shouldn't be able to destroy internal heaps */
	if (heap->socket_id < RTE_MAX_NUMA_NODES) {
		rte_errno = EPERM;
		ret = -1;
		goto unlock;
	}
	/* sanity checks done, now we can destroy the heap */
	rte_spinlock_lock(&heap->lock);
	ret = malloc_heap_destroy(heap);

	/* if we failed, lock is still active */
	if (ret < 0)
		rte_spinlock_unlock(&heap->lock);
unlock:
	rte_mcfg_mem_write_unlock();

	return ret;
}

If malloc_heap_destroy() fails, we don't unlock the spinlock.
It does not look sane, why not just unlock and propagate the error?
How can it be preferable to freeze the application without letting the user know something went wrong?

The commit introducing this is not explicit about this choice, so I'd like to understand the reasoning before posting a fix:

commit 15d6dd023c29fc526d16616137844bd91b6afdfe
Author: Anatoly Burakov <anatoly.burakov@intel.com>
Date:   Tue Oct 2 14:34:50 2018 +0100

    malloc: allow destroying heaps

    Add an API to destroy specified heap.

    Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

Thanks in advance,
Maxime

Comment 1 Maxime Coquelin 2023-03-13 10:10:33 CET

Looking more closely at rte_malloc_heap_destroy() with David,
the function is not buggy, the lock is implicitly unlocked in 
malloc_heap_destroy() when doing the memset to 0.
David's patch [0] moving the lock handling in malloc_heap_destroy(),
and using spinlock API to unlock it makes it clearer.

Furthermore, the API is not used with test-pmd, so it is not related
to the issue I faced. Since other heap->lock usage look sane, the issue
faced may be due to a memory corruption.

I let the Bz open for now. In case someone manages to reproduce.

[0]: https://patchwork.dpdk.org/project/dpdk/patch/20230224151143.3274897-2-david.marchand@redhat.com/

Note You need to log in before you can comment on or make changes to this bug.