[PATCH v5] enhance NUMA affinity heuristic

Thomas Monjalon thomas at monjalon.net
Fri Mar 3 15:07:14 CET 2023


I'm not comfortable with this patch.

First, there is no comment in the code which helps to understand the logic.
Second, I'm afraid changing the value of the per-core variable _socket_id
may have an impact on some applications.


16/02/2023 03:50, You, KaisenX:
> From: Burakov, Anatoly <anatoly.burakov at intel.com>
> > On 2/1/2023 12:20 PM, Kaisen You wrote:
> > > Trying to allocate memory on the first detected numa node has less
> > > chance to find some memory actually available rather than on the main
> > > lcore numa node (especially when the DPDK application is started only
> > > on one numa node).
> > >
> > > Fixes: 705356f0811f ("eal: simplify control thread creation")
> > > Fixes: bb0bd346d5c1 ("eal: suggest using --lcores option")
> > > Cc: stable at dpdk.org
> > >
> > > Signed-off-by: David Marchand <david.marchand at redhat.com>
> > > Signed-off-by: Kaisen You <kaisenx.you at intel.com>
> > > ---
> > > Changes since v4:
> > > - mod the patch title,
> > >
> > > Changes since v3:
> > > - add the assignment of socket_id in thread initialization,
> > >
> > > Changes since v2:
> > > - add uncommitted local change and fix compilation,
> > >
> > > Changes since v1:
> > > - accomodate for configurations with main lcore running on multiples
> > >    physical cores belonging to different numa,
> > > ---
> > >   lib/eal/common/eal_common_thread.c | 1 +
> > >   lib/eal/common/malloc_heap.c       | 4 ++++
> > >   2 files changed, 5 insertions(+)
> > >
> > > diff --git a/lib/eal/common/eal_common_thread.c
> > > b/lib/eal/common/eal_common_thread.c
> > > index 38d83a6885..21bff971f8 100644
> > > --- a/lib/eal/common/eal_common_thread.c
> > > +++ b/lib/eal/common/eal_common_thread.c
> > > @@ -251,6 +251,7 @@ static void *ctrl_thread_init(void *arg)
> > >   	void *routine_arg = params->arg;
> > >
> > >   	__rte_thread_init(rte_lcore_id(), cpuset);
> > > +	RTE_PER_LCORE(_socket_id) = SOCKET_ID_ANY;
> > >   	params->ret = rte_thread_set_affinity_by_id(rte_thread_self(),
> > cpuset);
> > >   	if (params->ret != 0) {
> > >   		__atomic_store_n(&params->ctrl_thread_status,
> > > diff --git a/lib/eal/common/malloc_heap.c
> > > b/lib/eal/common/malloc_heap.c index d7c410b786..3ee19aee15 100644
> > > --- a/lib/eal/common/malloc_heap.c
> > > +++ b/lib/eal/common/malloc_heap.c
> > > @@ -717,6 +717,10 @@ malloc_get_numa_socket(void)
> > >   			return socket_id;
> > >   	}
> > >
> > > +	socket_id = rte_lcore_to_socket_id(rte_get_main_lcore());
> > > +	if (socket_id != (unsigned int)SOCKET_ID_ANY)
> > > +		return socket_id;
> > > +
> > >   	return rte_socket_id_by_idx(0);
> > >   }
> > >
> > 
> > I may be lacking context, but I don't quite get the suggested change.
> >  From what I understand, the original has to do with assigning lcore cpusets in
> > such a way that an lcore ends up having two socket ID's (because it's been
> > assigned to CPU's on different sockets). Why is this allowed in the first place?
> > It seems like a user error to me, as it breaks many of the fundamental
> > assumptions DPDK makes.
> > 
> In a dual socket system, if all used cores are in socket 1 and the NIC is in socket 1, 
> no memory is allocated for socket 0. This is to optimize memory consumption.
> 
> I agree with you. If the startup parameters can ensure that both sockets 
> allocate memory, there will be no problem.
> However, due to the different CPU topologies of different systems, 
> It is difficult for users to ensure that the startup parameter contains two cpu nodes.
> 
> > I'm fine with using main lcore socket for control threads, I just don't think the
> > `socket_id != SOCKET_ID_ANY` thing should be checked here, because it
> > apparently tries to compensate for a problem with cpuset of the main thread,
> > which shouldn't have happened to begin with.
> > 
> This issue has been explained in detail in the discussion of the patch v1 version. 
> I will forward the previous email to you. The content of the email will also better 
> let you know the purpose of submitting this patch.





More information about the stable mailing list