[dpdk-users] Huge pages to be allocated based on number of mbufs

John Boyle jboyle at purestorage.com
Tue Mar 15 02:47:29 CET 2016


Hi Saurabh,

I don't know all the details of your setup, but I'm guessing that you may
have run into the hugepage fragmentation issue.

Try calling rte_malloc_dump_stats(stdout, "dummy") right before the mempool
creation. The output might look like this:

Socket:0
Heap_size:2147472192,
Free_size:2047523584,
Alloc_size:99948608,
Greatest_free_size:130023360,
Alloc_count:82,
Free_count:179,

(That would be after a successful allocation of a ~99MB mbuf pool.)

The mbuf_pool gets allocated with a single giant call to the internal
malloc_heap_alloc function.  If the "Greatest_free_size" is smaller than
the mbuf_pool you're trying to create, then the alloc will fail.  If the
total free size is smaller than, or not much larger than, what you're
trying to allocate, then the fix is simply to reserve more hugepages.

On the other hand, if the total "Free_size" is much larger than what you
need, but the "Greatest_free_size" is considerably smaller (in the above
example, the largest free slab is 130 MB despite nearly 2 GB being
available), then you have a badly fragmented heap.

How do you get a fragmented heap during the initialization phase of the
program?  The heap is created by mmapping a bunch of hugepages, noticing
which ones happen to have adjacent physical addresses, and then the
contiguous chunks become the separate available slabs in the heap.  If the
system has just been booted, then you are likely to end up with a nice
large slab into which you can fit a huge mbuf_pool.  If the system's been
running for a while, it's more likely to be fragmented, in which case you
may get something like the example I pasted above.

At Pure Storage, we ended up solving this by reserving a single 1GB
hugepage, which can't be fragmented.
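For reference, a 1 GB hugepage is normally reserved at boot (a config sketch; the exact mechanism depends on your kernel and bootloader):

```shell
# Kernel command line (e.g. in the GRUB config): reserve one 1 GB hugepage.
#   default_hugepagesz=1G hugepagesz=1G hugepages=1

# Alternatively, after boot on a kernel with 1 GB page support -- though
# this may fail once physical memory is already fragmented, which is why
# boot-time reservation is the safer choice:
echo 1 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
```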

-- John Boyle
*Science is what we understand well enough to explain to a computer. Art is
everything else we do.* --Knuth

On Mon, Mar 14, 2016 at 10:54 AM, Saurabh Mishra <saurabh.globe at gmail.com>
wrote:

> Hi,
>
> We are planning to support virtio, vmxnet3, ixgbe, i40e, bxn2x and SR-IOV
> on some of them with DPDK.
>
> We have seen that even if we give the correct number of mbufs for the
> number of hugepages reserved, rte_eth_tx_queue_setup() may still fail with
> not enough memory (I saw this on i40evf but it worked on virtio and vmxnet3).
>
> We'd like to know the recommended way to determine how many hugepages we
> should allocate, given the number of mbufs, such that the queue setup APIs
> also don't fail.
>
> Since we will be running on low-end systems too we need to be careful about
> reserving hugepages.
>
> Thanks,
> /Saurabh
>
