[dpdk-dev] [PATCH] mem: balanced allocation of hugepages

Ilya Maximets i.maximets at samsung.com
Thu Feb 16 14:55:57 CET 2017


Hi,

On 16.02.2017 16:26, Tan, Jianfeng wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Ilya Maximets [mailto:i.maximets at samsung.com]
>> Sent: Thursday, February 16, 2017 9:01 PM
>> To: dev at dpdk.org; David Marchand; Gonzalez Monroy, Sergio
>> Cc: Heetae Ahn; Yuanhan Liu; Tan, Jianfeng; Neil Horman; Pei, Yulong; Ilya
>> Maximets; stable at dpdk.org
>> Subject: [PATCH] mem: balanced allocation of hugepages
>>
>> Currently EAL allocates hugepages one by one, without paying
>> attention to which NUMA node each allocation comes from.
>>
>> Such behaviour leads to allocation failure if the number of
>> hugepages available to the application is limited by cgroups
>> or hugetlbfs and memory is requested not only from the first
>> socket.
>>
>> Example:
>> 	# 90 x 1GB hugepages available in a system
>>
>> 	cgcreate -g hugetlb:/test
>> 	# Limit to 32GB of hugepages
>> 	cgset -r hugetlb.1GB.limit_in_bytes=34359738368 test
>> 	# Request 4GB from each of 2 sockets
>> 	cgexec -g hugetlb:test testpmd --socket-mem=4096,4096 ...
>>
>> 	EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
>> 	EAL: 32 not 90 hugepages of size 1024 MB allocated
>> 	EAL: Not enough memory available on socket 1!
>> 	     Requested: 4096MB, available: 0MB
>> 	PANIC in rte_eal_init():
>> 	Cannot init memory
>>
>> 	This happens because all allocated pages are
>> 	on socket 0.
> 
> For such a use case, why not just use "numactl --interleave=0,1 <DPDK app> xxx"?

Unfortunately, the interleave policy doesn't work for me. I suspect the kernel
configuration blocks it, or I'm missing something in the kernel internals.
I'm using the 3.10 RT kernel from RHEL 7.

I tried to set MPOL_INTERLEAVE in the code and it doesn't work for me. Your
example with numactl doesn't work either:

# Limited to 8GB of hugepages
cgexec -g hugetlb:test numactl --interleave=0,1 testpmd --socket-mem=4096,4096

EAL: Setting up physically contiguous memory...
EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
EAL: 8 not 90 hugepages of size 1024 MB allocated
EAL: Hugepage /dev/hugepages/rtemap_0 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_1 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_2 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_3 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_4 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_5 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_6 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_7 is on socket 0
EAL: Not enough memory available on socket 1! Requested: 4096MB, available: 0MB
PANIC in rte_eal_init():
Cannot init memory

Also, using numactl affects all allocations in the application, which may
cause additional unexpected issues.
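For reference, what I mean by setting MPOL_INTERLEAVE in code is roughly the
following (a minimal sketch, not the actual patch; it assumes a two-node system
and a single hugetlbfs file, with most error handling trimmed):

#include <numaif.h>      /* set_mempolicy(), MPOL_INTERLEAVE, MPOL_DEFAULT */
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

static void *map_one_hugepage(const char *path, size_t size)
{
	unsigned long nodemask = 0x3; /* interleave across nodes 0 and 1 */
	void *va;
	int fd;

	/* Ask the kernel to interleave pages faulted in from now on. */
	if (set_mempolicy(MPOL_INTERLEAVE, &nodemask, sizeof(nodemask) * 8) < 0)
		return NULL;

	fd = open(path, O_CREAT | O_RDWR, 0600);
	va = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (va != MAP_FAILED)
		*(volatile char *)va = 0; /* touch the page to fault it in */
	close(fd);

	/* Restore the default policy so the rest of the process is unaffected. */
	set_mempolicy(MPOL_DEFAULT, NULL, 0);

	return va == MAP_FAILED ? NULL : va;
}

The point is to scope the policy to the hugepage mappings only, unlike numactl,
which changes the policy for the whole process. On my kernel, though, the pages
still all come from socket 0.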

> 
> Do you see a use case like --socket-mem=2048,1024 where only three 1GB hugepages are allowed?

This case will work with my patch, but the opposite one, '--socket-mem=1024,2048',
will fail. To handle it properly, we would need to first allocate all of the
required memory from each NUMA node and only then allocate all other available
pages in round-robin fashion (see the sketch below). But such a solution looks
a little ugly.
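Roughly, the planning step would look like this (a rough sketch of that idea;
plan_hugepage_allocation() is a hypothetical helper, not code from the patch):

/*
 * requested[i] - number of hugepages asked for on socket i (--socket-mem),
 * limit        - total number of hugepages the cgroup/hugetlbfs limit allows,
 * plan[i]      - how many hugepages to take from socket i.
 */
static int
plan_hugepage_allocation(const unsigned requested[], unsigned nb_nodes,
			 unsigned limit, unsigned plan[])
{
	unsigned i, total = 0;

	/* Phase 1: reserve exactly what every socket needs. */
	for (i = 0; i < nb_nodes; i++) {
		plan[i] = requested[i];
		total += requested[i];
	}
	if (total > limit)
		return -1; /* the limit cannot satisfy the request at all */

	/* Phase 2: hand out the remaining allowance round-robin so the
	 * spare pages stay balanced across sockets. */
	for (i = 0; total < limit; i = (i + 1) % nb_nodes) {
		plan[i]++;
		total++;
	}
	return 0;
}

With such a plan, '--socket-mem=1024,2048' and a three-page limit would give
plan = {1, 2} instead of failing.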

What do you think?

Best regards, Ilya Maximets.

