[dpdk-stable] [PATCH] mem: fix allocation failure on non-NUMA kernel
Nick Connolly
nick.connolly at mayadata.io
Thu Sep 17 15:05:44 CEST 2020
Hi Anatoly,
Thanks. My recollection is that all of the NUMA configuration flags
were set to 'n'.
Regards,
Nick
On 17/09/2020 13:57, Burakov, Anatoly wrote:
> On 17-Sep-20 1:29 PM, Nick Connolly wrote:
>> Hi Anatoly,
>>
>> Thanks for the response. You are asking a good question - here's
>> what I know:
>>
>> The issue arose on a single socket system, running WSL2 (full Linux
>> kernel running as a lightweight VM under Windows).
>> The default kernel in this environment is built with CONFIG_NUMA=n
>> which means get_mempolicy() returns an error.
>> This causes the check to ensure that the allocated memory is
>> associated with the correct socket to fail.
>>
>> The change is to skip the allocation check if check_numa() indicates
>> that NUMA-aware memory is not supported.
>>
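
As a rough illustration of the two paragraphs above (not the patch itself), here is a minimal C sketch. It assumes that the NUMA probe amounts to calling get_mempolicy() and treating ENOSYS as "not supported". The names kernel_supports_numa() and socket_check_passes() are hypothetical and exist only for this example, and the raw syscall is used so that the sketch does not depend on libnuma.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Probe whether the running kernel implements get_mempolicy().
 * Assumption: a kernel built with CONFIG_NUMA=n fails the call with ENOSYS.
 * Any other outcome (success or a different errno) is treated as "supported".
 */
static bool kernel_supports_numa(void)
{
    long rc = syscall(SYS_get_mempolicy, NULL, NULL, 0UL, NULL, 0UL);

    return !(rc < 0 && errno == ENOSYS);
}

/*
 * Hypothetical helper: decide whether memory that landed on actual_socket
 * satisfies a request for wanted_socket. Without NUMA support there is
 * nothing to verify against, so the check is skipped and reported as passed.
 */
static bool socket_check_passes(bool numa_supported, int wanted_socket,
                                int actual_socket)
{
    assert(wanted_socket >= 0); /* a request must name a real socket */

    if (!numa_supported)
        return true;

    return wanted_socket == actual_socket;
}
```

A caller would evaluate kernel_supports_numa() once and pass the result to socket_check_passes() for each allocation, so that a non-NUMA kernel never fails the socket comparison.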
>> Researching the meaning of CONFIG_NUMA, I found
>> https://cateee.net/lkddb/web-lkddb/NUMA.html which says:
>>> Enable NUMA (Non-Uniform Memory Access) support.
>>> The kernel will try to allocate memory used by a CPU on the local
>>> memory controller of the CPU and add some more NUMA awareness to the
>>> kernel.
>>
>> Clearly CONFIG_NUMA enables memory awareness, but there's no
>> indication in the description whether information about the NUMA
>> physical architecture is 'hidden', or whether it is still exposed
>> through /sys/devices/system/node* (which is used by the rte
>> initialisation code to determine how many sockets there are).
>> Unfortunately, I don't have ready access to a multi-socket Linux
>> system to test this on, so I took the conservative approach: it may
>> be possible to have CONFIG_NUMA disabled while the kernel still
>> reports more than one node, so I coded the change to generate a
>> debug message if this occurs.
>>
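
One way to answer the open question on a given machine is to count the node entries that sysfs exposes. The C sketch below does that under stated assumptions: that nodes appear as directories named node<N> under /sys/devices/system/node, and that counting them is a fair stand-in for what the initialisation code sees. The helpers is_node_entry() and count_numa_nodes() are hypothetical names for this example, not DPDK functions.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if name has the form "node<digits>", otherwise 0. */
static int is_node_entry(const char *name)
{
    assert(name != NULL);

    if (strncmp(name, "node", 4) != 0 || name[4] == '\0')
        return 0;

    for (const char *p = name + 4; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
    }
    return 1;
}

/*
 * Count the node<N> entries under dir.
 * Returns -1 if the directory cannot be opened, which is one plausible
 * outcome on a kernel that hides the topology entirely.
 */
static int count_numa_nodes(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int count = 0;

    if (d == NULL)
        return -1;

    while ((e = readdir(d)) != NULL)
        count += is_node_entry(e->d_name);

    closedir(d);
    return count;
}
```

Calling count_numa_nodes("/sys/devices/system/node") on a multi-socket machine booted with a CONFIG_NUMA=n kernel would show whether more than one node is still reported.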
>> Do you know whether CONFIG_NUMA turns off all knowledge about the
>> hardware architecture? If it does, then I agree that the test for
>> rte_socket_count() serves no purpose and should be removed.
>>
>
> I have a system with a custom-compiled kernel. I can recompile it
> without this flag and test this. I'll report back with results :)
>