eal: Place EAL thread stack in a reserved per-lcore memzone

Message ID 1586768952-10554-1-git-send-email-ricudis@niometrics.com (mailing list archive)
State Changes Requested, archived
Series eal: Place EAL thread stack in a reserved per-lcore memzone

Checks

Context                       Check    Description
ci/checkpatch                 warning  coding style issues
ci/iol-intel-Performance      success  Performance Testing PASS
ci/iol-mellanox-Performance   success  Performance Testing PASS
ci/iol-testing                success  Testing PASS
ci/Intel-compilation          success  Compilation OK
ci/travis-robot               success  Travis build: passed

Commit Message

Christos Ricudis April 13, 2020, 9:09 a.m. UTC
  Reserve a per-lcore 4MB memzone and allocate thread stack of EAL threads there for better NUMA locality of stack-allocated variables

Signed-off-by: Christos Ricudis <ricudis@niometrics.com>
---
 lib/librte_eal/linux/eal.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
  

Comments

Jerin Jacob April 13, 2020, 9:45 a.m. UTC | #1
On Mon, Apr 13, 2020 at 2:39 PM Christos Ricudis <ricudis@niometrics.com> wrote:
>
> Reserve a per-lcore 4MB memzone and allocate thread stack of EAL threads there for better NUMA locality of stack-allocated variables

It looks like a good idea to me.

Some questions/feedback.

1) It is better to get the stack size from the OS through
pthread_attr_getstack() rather than having DPDK define it.
2) There is a security concern here, as anyone can obtain the lcore
stack using rte_memzone_lookup().
Why do we need a memzone? rte_malloc_socket() should be enough here,
right? That would avoid the security issue.
3) The EAL --no-huge case needs to be handled as well.
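
Point 1 above could be approached roughly as follows. This is a minimal sketch, not part of the patch: default_thread_stack_size() is a hypothetical helper name, and it calls pthread_attr_getstacksize() on a default-initialized attribute object instead of the pthread_attr_getstack() mentioned above, since only the size is needed here. POSIX leaves the default value implementation-defined; on glibc the call reports the size the library would use for a new thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a DPDK API): ask the threading library which
 * stack size a thread created with default attributes would get.
 * Returns 0 on success or a positive error number, as pthread calls do.
 */
static int default_thread_stack_size(size_t *size_out)
{
	pthread_attr_t attr;
	int ret;

	assert(size_out != NULL);

	ret = pthread_attr_init(&attr);
	if (ret != 0)
		return ret;

	/* On a default-initialized attr this yields the library default. */
	ret = pthread_attr_getstacksize(&attr, size_out);
	pthread_attr_destroy(&attr);
	return ret;
}
```

The returned size could then replace a hard-coded constant such as THREAD_STACK_SIZE_DEFAULT when reserving memory for the stack.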



  
Ananyev, Konstantin April 14, 2020, 11:23 a.m. UTC | #2
Hi,

> 
> Reserve a per-lcore 4MB memzone and allocate thread stack of EAL threads there for better NUMA locality of stack-allocated variables

I wonder whether any real performance improvement is seen with this change.
Is there any case (an existing DPDK app/example) that can demonstrate it?
Konstantin

  
Pavan Nikhilesh Bhagavatula April 27, 2020, 4:51 p.m. UTC | #3
> lib/librte_eal/linux/eal.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
>diff --git a/lib/librte_eal/linux/eal.c b/lib/librte_eal/linux/eal.c
>index 9530ee5..e047107 100644
>--- a/lib/librte_eal/linux/eal.c
>+++ b/lib/librte_eal/linux/eal.c
>@@ -68,6 +68,8 @@
>
> #define KERNEL_IOMMU_GROUPS_PATH "/sys/kernel/iommu_groups"
>
>+#define THREAD_STACK_SIZE_DEFAULT (4ULL * 1024ULL * 1024ULL)
>+#include <rte_memzone.h>
> /* Allow the application to print its usage message too if set */
> static rte_usage_hook_t	rte_application_usage_hook = NULL;
>
>@@ -1224,6 +1226,24 @@ static void rte_eal_init_alert(const char *msg)
>
> 		lcore_config[i].state = WAIT;
>
>+		pthread_attr_t attr;
>+		pthread_attr_init(&attr);
>+		size_t thread_stack_size = THREAD_STACK_SIZE_DEFAULT;
>+		char thread_stack_name[64];
>+		snprintf(thread_stack_name, sizeof thread_stack_name, "rte:lcore:%s:%d:threadstack", rte_eal_process_type() == RTE_PROC_PRIMARY ? "p" : "s", i);
>+		const struct rte_memzone *mz = rte_memzone_lookup(thread_stack_name);
>+		if (mz == NULL) {
>+			if ((mz = rte_memzone_reserve(thread_stack_name, thread_stack_size, lcore_config[i].socket_id, 0)) == NULL) {
>+				rte_panic("Cannot allocate memzone for thread stack");
>+			}
>+		}
>+		void *thread_stack = mz->addr;
>+
>+		if (pthread_attr_setstack(&attr, thread_stack, thread_stack_size) < 0) {
>+			rte_panic("Cannot set thread stack\n");
>+		}
>+		RTE_LOG(DEBUG, EAL, "Thread stack for lcore %d on socket %d set to %p\n", i, lcore_config[i].socket_id, thread_stack);
>+
> 		/* create a thread for each lcore */
> 		ret = pthread_create(&lcore_config[i].thread_id, NULL,
> 				     eal_thread_loop, NULL);

Don't we need to pass the attr struct created above to pthread_create() as the second argument?

Also, since there is no way to modify the master_lcore stack space, most of the DPDK test suite wouldn't show any difference.

Pavan.

>--
>1.8.3.1
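
The remark about the unused attr can be illustrated outside DPDK. What follows is a minimal sketch under stated assumptions, not a proposed revision of the patch: it uses plain POSIX threads, page-aligned heap memory from posix_memalign() stands in for the memzone, and the names DEMO_STACK_SIZE, struct stack_probe, probe_thread() and run_on_custom_stack() are hypothetical. It passes the attribute object to pthread_create() and tests return values against zero, since pthread functions report failure with a positive error number, so a `< 0` comparison would never trigger.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_STACK_SIZE (4UL * 1024UL * 1024UL) /* 4 MB, as in the patch */

struct stack_probe {
	void *base;   /* lowest address of the caller-provided stack */
	size_t size;  /* size of that stack in bytes */
	int on_stack; /* set to 1 if a thread-local variable lies inside it */
};

/* Thread body: record whether a local variable lives in the given region. */
static void *probe_thread(void *arg)
{
	struct stack_probe *p = arg;
	volatile char local = 0;
	uintptr_t addr = (uintptr_t)&local;
	uintptr_t lo;

	assert(p != NULL);
	lo = (uintptr_t)p->base;
	p->on_stack = (addr >= lo && addr < lo + p->size);
	return NULL;
}

/* Returns 1 if the thread ran on the supplied stack, 0 if not, -1 on error. */
static int run_on_custom_stack(void)
{
	struct stack_probe probe = { NULL, DEMO_STACK_SIZE, 0 };
	pthread_attr_t attr;
	pthread_t tid;
	int ret = -1;

	/* Stand-in for the memzone: page-aligned heap memory (assumption). */
	if (posix_memalign(&probe.base, 4096, probe.size) != 0)
		return -1;
	if (pthread_attr_init(&attr) != 0)
		goto out_free;
	if (pthread_attr_setstack(&attr, probe.base, probe.size) != 0)
		goto out_attr;
	/* The attribute object must be passed here, otherwise it has no effect. */
	if (pthread_create(&tid, &attr, probe_thread, &probe) != 0)
		goto out_attr;
	pthread_join(tid, NULL);
	ret = probe.on_stack;
out_attr:
	pthread_attr_destroy(&attr);
out_free:
	free(probe.base);
	return ret;
}
```

If NULL were passed as the second argument of pthread_create(), as in the patch, the thread would run on a stack allocated by the threading library and the probe would report 0.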
  

Patch

diff --git a/lib/librte_eal/linux/eal.c b/lib/librte_eal/linux/eal.c
index 9530ee5..e047107 100644
--- a/lib/librte_eal/linux/eal.c
+++ b/lib/librte_eal/linux/eal.c
@@ -68,6 +68,8 @@ 
 
 #define KERNEL_IOMMU_GROUPS_PATH "/sys/kernel/iommu_groups"
 
+#define THREAD_STACK_SIZE_DEFAULT (4ULL * 1024ULL * 1024ULL)
+#include <rte_memzone.h>
 /* Allow the application to print its usage message too if set */
 static rte_usage_hook_t	rte_application_usage_hook = NULL;
 
@@ -1224,6 +1226,24 @@  static void rte_eal_init_alert(const char *msg)
 
 		lcore_config[i].state = WAIT;
 
+		pthread_attr_t attr;
+		pthread_attr_init(&attr);
+		size_t thread_stack_size = THREAD_STACK_SIZE_DEFAULT;
+		char thread_stack_name[64];
+		snprintf(thread_stack_name, sizeof thread_stack_name, "rte:lcore:%s:%d:threadstack", rte_eal_process_type() == RTE_PROC_PRIMARY ? "p" : "s", i);
+		const struct rte_memzone *mz = rte_memzone_lookup(thread_stack_name);
+		if (mz == NULL) {
+			if ((mz = rte_memzone_reserve(thread_stack_name, thread_stack_size, lcore_config[i].socket_id, 0)) == NULL) {
+				rte_panic("Cannot allocate memzone for thread stack");
+			}
+		}
+		void *thread_stack = mz->addr;
+
+		if (pthread_attr_setstack(&attr, thread_stack, thread_stack_size) < 0) {
+			rte_panic("Cannot set thread stack\n");
+		}
+		RTE_LOG(DEBUG, EAL, "Thread stack for lcore %d on socket %d set to %p\n", i, lcore_config[i].socket_id, thread_stack);
+
 		/* create a thread for each lcore */
 		ret = pthread_create(&lcore_config[i].thread_id, NULL,
 				     eal_thread_loop, NULL);