[dpdk-dev] [PATCH v2] add one option memory-only for secondary processes

Bruce Richardson bruce.richardson at intel.com
Tue Dec 16 11:03:44 CET 2014


On Tue, Dec 16, 2014 at 09:26:48AM +0000, Chi, Xiaobo (NSN - CN/Hangzhou) wrote:
> Hi, Bruce,
> How about this patch? Can it be merged into the master branch? Thanks.
> 
> Brgs,
> Chi Xiaobo
> 

At this point, I think we are well past code-freeze for new features for 1.8,
but this looks a good candidate for 2.0 once the merge window for that opens.

/Bruce

> 
> -----Original Message-----
> From: Chi, Xiaobo (NSN - CN/Hangzhou) 
> Sent: Monday, December 15, 2014 5:58 PM
> To: 'ext Hiroshi Shimamoto'; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> 
> Hi, Hiroshi,
> Yes, there should be some performance degradation, not only because of the disabled mempool cache but also because of process scheduling overhead (caused by not pinning the processes to CPUs).
> I have not done any performance testing. In my project's scenarios, the SECONDARY processes only send/receive messages to/from the PRIMARY process via mempool/ring, and the throughput is not very high, so the performance degradation is not critical for us. However, there are dozens of SECONDARY processes in our system, so it would be hard to pin them manually and properly to different CPU cores. What we want is to let the standard Linux scheduler balance the load between CPU cores.
> 
> Brgs,
> Chi Xiaobo
> 
> 
> -----Original Message-----
> From: ext Hiroshi Shimamoto [mailto:h-shimamoto at ct.jp.nec.com] 
> Sent: Thursday, December 11, 2014 11:03 AM
> To: Chi, Xiaobo (NSN - CN/Hangzhou); dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> 
> Hi,
> 
> sorry for the delay.
> 
> > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > 
> > Hi, Hiroshi,
> > Yes, you are right. To avoid this problem, when creating a mempool that is to be shared between the primary
> > process and the secondary processes, we need to set the cache_size parameter to zero. And to make the
> > system more stable, it is better to define RTE_MEMPOOL_CACHE_MAX_SIZE as 0 in rte_config.h.
> 
> Yes, it prevents the data corruption, but it also hurts performance.
> I think that if the PMD uses mbufs without a cache, we will see performance degradation.
> 
> Do you have any numbers?
> 
> thanks,
> Hiroshi
> 
> > 
> > /* create the mempool */
> > struct rte_mempool *
> > rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
> > 		   unsigned cache_size, unsigned private_data_size,
> > 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
> > 		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
> > 		   int socket_id, unsigned flags);
> > 
> > 
> > Brgs,
> > Chi xiaobo
> > 
> > 
> > -----Original Message-----
> > From: ext Hiroshi Shimamoto [mailto:h-shimamoto at ct.jp.nec.com]
> > Sent: Wednesday, December 03, 2014 6:54 PM
> > To: Chi, Xiaobo (NSN - CN/Hangzhou); dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > 
> > Hi,
> > 
> > > Subject: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > >
> > > From: Chi Xiaobo <xiaobo.chi at nsn.com>
> > >
> > > Problem: One common DPDK deployment scenario consists of one primary process and several (even hundreds of) secondary
> > > processes; all external packets/messages are sent/received by the primary process, which then distributes them to the secondary
> > > processes via DPDK's ring/shared-memory mechanism. In such scenarios, the SECONDARY processes need only the hugepage-based
> > > shared-memory mechanism and its upper-layer libraries (such as ring, mempool, etc.); they do not need CPU core pinning, iopl privilege
> > > changes, PCI devices, timers, alarms, interrupts, shared_driver_list, core_info, per-core threads, etc. For
> > > this kind of SECONDARY process, the current rte_eal_init() is too heavy.
> > >
> > > Solution: A new EAL initialization argument, --memory-only, is added. It is only for SECONDARY processes that only
> > > want to share memory with other processes. If this argument is given, users need not supply the otherwise mandatory arguments,
> > > such as -c and -n, because we do not want to pin such processes to any CPUs.
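
The rule this option introduces can be read as a small predicate. The following is a minimal, self-contained sketch of that check, not the EAL code itself: the toy_ names are hypothetical stand-ins for the process type, the memory_only flag and lcores_parsed used in the patch, and the other checks done by eal_check_common_options() are left out.

```c
/* Hypothetical stand-ins for the EAL process types. */
enum toy_proc_type { TOY_PROC_PRIMARY, TOY_PROC_SECONDARY };

struct toy_cfg {
	enum toy_proc_type process_type;
	int memory_only;    /* set when --memory-only was given */
	int lcores_parsed;  /* set when -c or -l was given */
};

/* Returns 0 if the option combination is accepted, -1 otherwise. */
static int
toy_check_options(const struct toy_cfg *cfg)
{
	int mem_only_secondary =
		cfg->process_type == TOY_PROC_SECONDARY && cfg->memory_only;

	/* A core list is mandatory unless this is a memory-only secondary. */
	if (!cfg->lcores_parsed && !mem_only_secondary)
		return -1;

	/* --memory-only is rejected for anything but a secondary process. */
	if (cfg->process_type != TOY_PROC_SECONDARY && cfg->memory_only)
		return -1;

	return 0;
}
```

In this sketch a secondary process started with --memory-only and no -c/-l is accepted, while a primary process that passes --memory-only is rejected even if it supplies a core list.
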
> > 
> > However, we need a per-thread lcore_id to use mempool.
> > If the lcore_id is not initialized, it will be 0, and multiple threads will corrupt
> > the per-lcore mempool caches because of a race condition.
> > We have to either assign an lcore_id to each thread, with no overlap between these ids, or disable
> > mempool handling in the SECONDARY process.
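
To illustrate the collision described above, here is a toy model rather than DPDK code. It assumes, as a simplification, that a pool keeps one unlocked cache per lcore and that each thread selects its cache by lcore_id; all toy_ names and sizes are hypothetical.

```c
#define TOY_MAX_LCORE   4   /* hypothetical; stands in for the lcore limit */
#define TOY_CACHE_SLOTS 8   /* hypothetical cache capacity */

/* Toy stand-in for a per-lcore object cache: no lock protects it. */
struct toy_cache {
	unsigned len;
	void *objs[TOY_CACHE_SLOTS];
};

struct toy_pool {
	struct toy_cache local_cache[TOY_MAX_LCORE];
};

/* Each thread picks its cache purely by its lcore_id. */
static struct toy_cache *
toy_cache_for(struct toy_pool *mp, unsigned lcore_id)
{
	return &mp->local_cache[lcore_id];
}

/* Returns 1 when two threads would operate on the same unlocked cache. */
static int
toy_threads_collide(struct toy_pool *mp, unsigned lcore_a, unsigned lcore_b)
{
	return toy_cache_for(mp, lcore_a) == toy_cache_for(mp, lcore_b);
}
```

Two threads that both report lcore_id 0 resolve to the same cache and would update its len and objs without synchronization; distinct ids select distinct caches, and a pool without caches avoids the question altogether.
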
> > 
> > thanks,
> > Hiroshi
> > 
> > > Signed-off-by: Chi Xiaobo <xiaobo.chi at nsn.com>
> > > ---
> > >  lib/librte_eal/common/eal_common_options.c | 17 ++++++++++++---
> > >  lib/librte_eal/common/eal_internal_cfg.h   |  1 +
> > >  lib/librte_eal/common/eal_options.h        |  2 ++
> > >  lib/librte_eal/linuxapp/eal/eal.c          | 34 +++++++++++++++++-------------
> > >  4 files changed, 36 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> > > index e2810ab..7b18498 100644
> > > --- a/lib/librte_eal/common/eal_common_options.c
> > > +++ b/lib/librte_eal/common/eal_common_options.c
> > > @@ -85,6 +85,7 @@ eal_long_options[] = {
> > >  	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
> > >  	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
> > >  	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> > > +	{OPT_MEMORY_ONLY, 0, NULL, OPT_MEMORY_ONLY_NUM},
> > >  	{0, 0, 0, 0}
> > >  };
> > >
> > > @@ -126,6 +127,7 @@ eal_reset_internal_config(struct internal_config *internal_cfg)
> > >  	internal_cfg->no_hpet = 1;
> > >  #endif
> > >  	internal_cfg->vmware_tsc_map = 0;
> > > +	internal_cfg->memory_only= 0;
> > >  }
> > >
> > >  /*
> > > @@ -454,6 +456,10 @@ eal_parse_common_option(int opt, const char *optarg,
> > >  		conf->process_type = eal_parse_proc_type(optarg);
> > >  		break;
> > >
> > > +	case OPT_MEMORY_ONLY_NUM:
> > > +		conf->memory_only= 1;
> > > +		break;
> > > +
> > >  	case OPT_MASTER_LCORE_NUM:
> > >  		if (eal_parse_master_lcore(optarg) < 0) {
> > >  			RTE_LOG(ERR, EAL, "invalid parameter for --"
> > > @@ -525,9 +531,9 @@ eal_check_common_options(struct internal_config *internal_cfg)
> > >  {
> > >  	struct rte_config *cfg = rte_eal_get_configuration();
> > >
> > > -	if (!lcores_parsed) {
> > > -		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
> > > -			"-c or -l\n");
> > > +	if (!lcores_parsed && !(internal_cfg->process_type == RTE_PROC_SECONDARY&& internal_cfg->memory_only) ) {
> > > +		RTE_LOG(ERR, EAL, "For those processes without memory-only option, CPU cores "
> > > +							"must be enabled with options -c or -l\n");
> > >  		return -1;
> > >  	}
> > >  	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
> > > @@ -545,6 +551,10 @@ eal_check_common_options(struct internal_config *internal_cfg)
> > >  			"specified\n");
> > >  		return -1;
> > >  	}
> > > +	if ( internal_cfg->process_type != RTE_PROC_SECONDARY && internal_cfg->memory_only ) {
> > > +		RTE_LOG(ERR, EAL, "only secondary processes can specify memory-only option.\n");
> > > +		return -1;
> > > +	}
> > >  	if (index(internal_cfg->hugefile_prefix, '%') != NULL) {
> > >  		RTE_LOG(ERR, EAL, "Invalid char, '%%', in --"OPT_FILE_PREFIX" "
> > >  			"option\n");
> > > @@ -590,6 +600,7 @@ eal_common_usage(void)
> > >  	       "  --"OPT_SYSLOG"     : set syslog facility\n"
> > >  	       "  --"OPT_LOG_LEVEL"  : set default log level\n"
> > >  	       "  --"OPT_PROC_TYPE"  : type of this process\n"
> > > +	       "  --"OPT_MEMORY_ONLY": only use shared memory, valid only for secondary process.\n"
> > >  	       "  --"OPT_PCI_BLACKLIST", -b: add a PCI device in black list.\n"
> > >  	       "               Prevent EAL from using this PCI device. The argument\n"
> > >  	       "               format is <domain:bus:devid.func>.\n"
> > > diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
> > > index aac6abf..f51f0a2 100644
> > > --- a/lib/librte_eal/common/eal_internal_cfg.h
> > > +++ b/lib/librte_eal/common/eal_internal_cfg.h
> > > @@ -85,6 +85,7 @@ struct internal_config {
> > >
> > >  	unsigned num_hugepage_sizes;      /**< how many sizes on this system */
> > >  	struct hugepage_info hugepage_info[MAX_HUGEPAGE_SIZES];
> > > +	volatile unsigned memory_only;    /**< whether the secondary process needs only shared memory */
> > >  };
> > >  extern struct internal_config internal_config; /**< Global EAL configuration. */
> > >
> > > diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
> > > index e476f8d..87cc5db 100644
> > > --- a/lib/librte_eal/common/eal_options.h
> > > +++ b/lib/librte_eal/common/eal_options.h
> > > @@ -77,6 +77,8 @@ enum {
> > >  	OPT_CREATE_UIO_DEV_NUM,
> > >  #define OPT_VFIO_INTR    "vfio-intr"
> > >  	OPT_VFIO_INTR_NUM,
> > > +#define OPT_MEMORY_ONLY  "memory-only"
> > > +	OPT_MEMORY_ONLY_NUM,
> > >  	OPT_LONG_MAX_NUM
> > >  };
> > >
> > > diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> > > index 89f3b5e..c160771 100644
> > > --- a/lib/librte_eal/linuxapp/eal/eal.c
> > > +++ b/lib/librte_eal/linuxapp/eal/eal.c
> > > @@ -752,14 +752,6 @@ rte_eal_init(int argc, char **argv)
> > >
> > >  	rte_config_init();
> > >
> > > -	if (rte_eal_pci_init() < 0)
> > > -		rte_panic("Cannot init PCI\n");
> > > -
> > > -#ifdef RTE_LIBRTE_IVSHMEM
> > > -	if (rte_eal_ivshmem_init() < 0)
> > > -		rte_panic("Cannot init IVSHMEM\n");
> > > -#endif
> > > -
> > >  	if (rte_eal_memory_init() < 0)
> > >  		rte_panic("Cannot init memory\n");
> > >
> > > @@ -772,14 +764,30 @@ rte_eal_init(int argc, char **argv)
> > >  	if (rte_eal_tailqs_init() < 0)
> > >  		rte_panic("Cannot init tail queues for objects\n");
> > >
> > > +	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
> > > +		rte_panic("Cannot init logs\n");
> > > +
> > > +	eal_check_mem_on_local_socket();
> > > +
> > > +	rte_eal_mcfg_complete();
> > > +
> > > +    /*with memory-only option, we need not cpu affinity, pci device, alarm, external devices, interrupt, etc. */
> > > +	if( internal_config.memory_only ){
> > > +		RTE_LOG (DEBUG, EAL, "memory-only defined, so only memory being initialized.\n");
> > > +		return 0;
> > > +	}
> > > +
> > > +	if (rte_eal_pci_init() < 0)
> > > +		rte_panic("Cannot init PCI\n");
> > > +
> > >  #ifdef RTE_LIBRTE_IVSHMEM
> > > +	if (rte_eal_ivshmem_init() < 0)
> > > +		rte_panic("Cannot init IVSHMEM\n");
> > > +
> > >  	if (rte_eal_ivshmem_obj_init() < 0)
> > >  		rte_panic("Cannot init IVSHMEM objects\n");
> > >  #endif
> > >
> > > -	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
> > > -		rte_panic("Cannot init logs\n");
> > > -
> > >  	if (rte_eal_alarm_init() < 0)
> > >  		rte_panic("Cannot init interrupt-handling thread\n");
> > >
> > > @@ -789,10 +797,6 @@ rte_eal_init(int argc, char **argv)
> > >  	if (rte_eal_timer_init() < 0)
> > >  		rte_panic("Cannot init HPET or TSC timers\n");
> > >
> > > -	eal_check_mem_on_local_socket();
> > > -
> > > -	rte_eal_mcfg_complete();
> > > -
> > >  	TAILQ_FOREACH(solib, &solib_list, next) {
> > >  		RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
> > >  		solib->lib_handle = dlopen(solib->name, RTLD_NOW);
> > > --
> > > 1.9.4.msysgit.2
> 

