[dpdk-dev] A question about hugepage initialization time

Neil Horman nhorman at tuxdriver.com
Wed Dec 10 15:29:26 CET 2014


On Wed, Dec 10, 2014 at 10:32:25AM +0000, Bruce Richardson wrote:
> On Tue, Dec 09, 2014 at 02:10:32PM -0800, Stephen Hemminger wrote:
> > On Tue, 9 Dec 2014 11:45:07 -0800
> > &rew <andras.kovacs at ericsson.com> wrote:
> > 
> > > > Hey Folks,
> > > >
> > > > Our DPDK application deals with very large in-memory data structures, and
> > > > can potentially use tens or even hundreds of gigabytes of hugepage memory.
> > > > During the course of development, we've noticed that as the number of huge
> > > > pages increases, the memory initialization time during EAL init gets to be
> > > > quite long, lasting several minutes at present.  The growth in init time
> > > > doesn't appear to be linear, which is concerning.
> > > >
> > > > This is a minor inconvenience for us and our customers, as memory
> > > > initialization makes our boot times a lot longer than they would otherwise
> > > > be.  Also, my experience has been that really long operations are often
> > > > hiding errors - what you think is merely a slow operation is actually a
> > > > timeout of some sort, often due to misconfiguration.  This leads to two
> > > > questions:
> > > >
> > > > 1. Does the long initialization time suggest that there's an error
> > > > happening under the covers?
> > > > 2. If not, is there any simple way that we can shorten memory
> > > > initialization time?
> > > >
> > > > Thanks in advance for your insights.
> > > >
> > > > --
> > > > Matt Laswell
> > > > laswell at infiniteio.com
> > > > infinite io, inc.
> > > >
> > > 
> > > Hello,
> > > 
> > > please find some quick comments on the questions:
> > > 1.) In our experience, long initialization times are normal with large
> > > amounts of memory. However, the time depends on a few things:
> > > - the number of hugepages (each page fault handled by the kernel is
> > >   pretty expensive); see the sketch below
> > > - the size of the hugepages (the whole area is memset at initialization)
> > > 
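> > > For illustration, a minimal sketch (not EAL code) of why the page count
> > > dominates: the first write to each page costs one kernel page fault, so
> > > for the same total size, 2M pages take 512 times as many faults as 1G
> > > pages.
> > >
> > > #include <stddef.h>
> > > #include <stdint.h>
> > >
> > > /* fault in a mapped region one page at a time; the store is what
> > >  * forces the kernel to back each page with real memory */
> > > static void touch_pages(volatile uint8_t *base, size_t len, size_t pgsz)
> > > {
> > >     for (size_t off = 0; off < len; off += pgsz)
> > >         base[off] = 0;
> > > }
> > >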
> > > 2.) Using 1G pages instead of 2M pages reduces the initialization time
> > > significantly. Using wmemset instead of memset adds a further 20-30%
> > > speedup by our measurements. Alternatively, you can gain still more by
> > > only touching the pages rather than clearing them, but then your layer
> > > or the applications above need to do the cleanup at allocation time
> > > (e.g. by using rte_zmalloc); see the sketch below.
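> > >
> > > A rough sketch of that last option (the struct is made up for
> > > illustration; rte_zmalloc() zeroes the region it returns, so skipping
> > > the init-time memset stays invisible to callers):
> > >
> > > #include <stdint.h>
> > > #include <rte_malloc.h>
> > >
> > > /* hypothetical application structure, for illustration only */
> > > struct flow_table {
> > >     uint64_t slots[1024];
> > > };
> > >
> > > static struct flow_table *alloc_table(void)
> > > {
> > >     /* cleanup happens here, at allocation time, instead of via
> > >      * one big memset during EAL init */
> > >     return rte_zmalloc("flow_table", sizeof(struct flow_table), 64);
> > > }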
> > > 
> > > Cheers,
> > > &rew
> > 
> > I wonder if the whole rte_malloc code is even worth it with a modern kernel
> > with transparent huge pages? rte_malloc adds very little value and is less safe
> > and slower than glibc or other allocators. Plus you lose the ability to get
> > all the benefit out of valgrind or electric fence.
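> >
> > Roughly what the THP route looks like, as a sketch (madvise() is only
> > a hint; the kernel decides whether to back the region with huge pages):
> >
> > #include <stddef.h>
> > #include <sys/mman.h>
> >
> > /* plain anonymous mapping, asking the kernel for transparent huge
> >  * pages instead of managing hugetlbfs mappings ourselves */
> > static void *thp_alloc(size_t len)
> > {
> >     void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
> >                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> >     if (p == MAP_FAILED)
> >         return NULL;
> >     madvise(p, len, MADV_HUGEPAGE);
> >     return p;
> > }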
> 
> While I'd dearly love to not have our own custom malloc lib to maintain, for DPDK
> multiprocess, rte_malloc will be hard to replace as we would need a replacement
> solution that similarly guarantees that memory mapped in process A is also 
> available at the same address in process B. :-(
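>
> Roughly the contract rte_malloc has to honour - a sketch with a made-up
> path and hint address, not the real EAL code:
>
> #include <stddef.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sys/mman.h>
>
> /* hypothetical fixed address both processes agree on */
> #define SHARED_VA ((void *)0x7f0000000000UL)
>
> static void *map_shared(const char *path, size_t len)
> {
>     int fd = open(path, O_RDWR);
>     if (fd < 0)
>         return NULL;
>     void *va = mmap(SHARED_VA, len, PROT_READ | PROT_WRITE,
>                     MAP_SHARED, fd, 0);
>     close(fd);
>     if (va == MAP_FAILED)
>         return NULL;
>     if (va != SHARED_VA) {  /* address unavailable in this process:
>                              * pointers stored in the region would
>                              * dangle, so give up rather than relocate */
>         munmap(va, len);
>         return NULL;
>     }
>     return va;
> }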
> 
Just out of curiosity, why even bother with multiprocess support?  What you're
talking about above is a multithread model, and you're shoehorning multiple
processes into it.
Neil

> /Bruce
> 