Bug 1206 - Multiple large memory block allocations using rte_malloc can lead to memory out-of-bounds issues.
Status: UNCONFIRMED
Alias: None
Product: DPDK
Classification: Unclassified
Component: core
Version: 21.11
Hardware: x86 Linux
Importance: Normal major
Target Milestone: ---
Assignee: dev
URL:
Depends on:
Blocks:
 
Reported: 2023-03-31 03:17 CEST by killers
Modified: 2023-06-16 16:58 CEST

Description killers 2023-03-31 03:17:50 CEST
[root@localhost bin]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Model name:            Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
Stepping:              9
CPU MHz:               3700.073
CPU max MHz:           3900.0000
CPU min MHz:           1600.0000
BogoMIPS:              6784.24
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb ssbd ibrs ibpb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
[root@localhost bin]# 

The CPU does not support pdpe1gb (1G pages): the flag is absent from the list above.
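
The absence of the flag can also be checked programmatically. The following is a minimal sketch, assuming the flags are available as a single space-separated string such as the "Flags:" line above; has_cpu_flag() is a hypothetical helper, not part of DPDK or of any system library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the space-separated list `flags` contains `flag`
 * as a whole word. A plain strstr() is not enough, because "pse"
 * would also match inside "pse36".
 */
static bool
has_cpu_flag(const char *flags, const char *flag)
{
	size_t n = strlen(flag);
	const char *p = flags;

	if (n == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		/* The match must be delimited by spaces or string boundaries. */
		bool starts = (p == flags) || (p[-1] == ' ');
		bool ends = (p[n] == '\0') || (p[n] == ' ');

		if (starts && ends)
			return true;
		p += n;
	}
	return false;
}
```

Calling has_cpu_flag(flags, "pdpe1gb") on the flag list above returns false. The sketch assumes the string has no trailing newline or tab separators; a line read directly from /proc/cpuinfo would need trimming first.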



There are many free 2M HugePages.

HugePages_Total:    6656
HugePages_Free:     5682
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      236476 kB
DirectMap2M:    33228800 kB




Test code:

	char *t_mem1;
	char *t_mem2;
	int t_size = 1024*1024*1024;	/* 1 GB per allocation */

	t_mem1 = rte_malloc(NULL, t_size, RTE_CACHE_LINE_SIZE);
	t_mem2 = rte_malloc(NULL, t_size, RTE_CACHE_LINE_SIZE);
	printf("rte_malloc1 t_mem1=%p \n", t_mem1);
	printf("rte_malloc1 t_mem2=%p \n", t_mem2);

	/* Fill the first block with 0 and the second block with 1. */
	memset(t_mem1, 0, t_size);
	memset(t_mem2, 1, t_size);

	int t_i;

	/* A 1 inside t_mem1 can only have been written through t_mem2. */
	for (t_i = 0; t_i < t_size; t_i++)
	{
		if (t_mem1[t_i] == 1)
		{
			printf("rte_malloc find t_mem1=%p error t_i=%d %p=%d\n", t_mem1, t_i, &t_mem1[t_i], t_mem1[t_i]);
			t_mem1[t_i] = 2;	/* mark the byte through t_mem1 */
			break;
		}
	}
	/* If the mark is visible through t_mem2, both pointers alias the same memory. */
	for (t_i = 0; t_i < t_size; t_i++)
	{
		if (t_mem2[t_i] == 2)
		{
			printf("rte_malloc find t_mem2=%p error t_i=%d %p=%d\n", t_mem2, t_i, &t_mem2[t_i], t_mem2[t_i]);
			break;
		}
	}
	
Output:

rte_malloc1 t_mem1=0x107c00000 
rte_malloc1 t_mem2=0x140c00000 
rte_malloc find t_mem1=0x107c00000 error t_i=956301312 0x140c00000=1
rte_malloc find t_mem2=0x140c00000 error t_i=0 0x140c00000=2

The two allocated blocks partially overlap: t_mem2 starts only 956301312 bytes (912 MB) after t_mem1, although each block is 1024 MB long.
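
The size of the overlap follows from the two printed addresses and the requested size. Below is a minimal sketch of that arithmetic; range_overlap() is a hypothetical helper written for illustration, not a DPDK API, and it assumes the address ranges do not wrap around.

```c
#include <stdint.h>

/*
 * Return the number of bytes shared by the half-open ranges
 * [a, a + a_len) and [b, b + b_len), or 0 if they are disjoint.
 */
static uint64_t
range_overlap(uint64_t a, uint64_t a_len, uint64_t b, uint64_t b_len)
{
	uint64_t a_end = a + a_len;
	uint64_t b_end = b + b_len;
	uint64_t start = a > b ? a : b;			/* later start */
	uint64_t end = a_end < b_end ? a_end : b_end;	/* earlier end */

	return end > start ? end - start : 0;
}
```

With the addresses above, range_overlap(0x107c00000, 1 << 30, 0x140c00000, 1 << 30) gives 117440512 bytes, that is, the last 112 MB of t_mem1 are also the first 112 MB of t_mem2.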
Comment 1 Dmitry Kozlyuk 2023-05-09 18:01:30 CEST
Unable to reproduce in 21.11 or main.
It's weird that (t_mem2 - t_mem1) = 912 MB precisely on your system.
It should be, and on my system it is, 1024 MB + 192 B (requested size + overhead).
Tried both with and without -Dc_args=-DRTE_MALLOC_DEBUG=1.
What are the distribution, compiler version, and build flags?
Comment 2 killers 2023-06-16 10:47:39 CEST
The problem occurs only on low-end CPUs that lack hardware support for PDPE1GB. It reproduces when using 2M hugepages. DPDK 20 is also affected!
Comment 3 killers 2023-06-16 10:50:00 CEST
With 2M hugepages, the issue appears when allocating large chunks of memory, such as several hundred megabytes at a time.
Comment 4 Dmitry Kozlyuk 2023-06-16 16:58:43 CEST
I ran my tests on a system with pdpe1gb support, but without 1G hugepages. For the sample code, DPDK allocated memory in two chunks of 513 x 2M hugepages. On v23.03-23-gd034467249:

rte_malloc1 t_mem1=0x11805fffc0 
rte_malloc1 t_mem2=0x11c07fffc0

Computing (t_mem2 - t_mem1) = 0x11c07fffc0 - 0x11805fffc0 = 1073741824 + 2097152 = 1G + 2M exactly, which is OK.

Please provide the distribution (uname -a output), compiler version, and build flags.
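
The figure of 513 hugepages per allocation can be reproduced with round-up division. This is a minimal sketch, not DPDK code: pages_needed() is a hypothetical helper, and the 192 B overhead is the value quoted in Comment 1, assumed here to apply to this build as well.

```c
#include <stdint.h>

#define HUGEPAGE_2M (2ULL * 1024 * 1024)

/* Round-up division: number of pages of page_sz bytes needed to hold len bytes. */
static uint64_t
pages_needed(uint64_t len, uint64_t page_sz)
{
	return (len + page_sz - 1) / page_sz;
}
```

For a 1G request plus 192 B of overhead, pages_needed((1ULL << 30) + 192, HUGEPAGE_2M) is 513, and 513 x 2M = 1G + 2M, which matches the distance between the two pointers above. Any overhead between 1 B and 2M gives the same page count, so the result does not depend on the exact overhead value.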
