[dpdk-dev] rte_pktmbuf_clone returning NULL while mempool shouldn't be full

Marinus, Dennis dmarinus at amazon.com
Tue Jul 14 01:41:23 CEST 2015


Hey,

I'm having some trouble calculating the right mempool & ring sizes for my application. I'm getting mempool full errors even though that shouldn't be possible, so I'm missing something. Can you help me figure out if my math is correct?

My application is using a nic -> rx_core -> worker_core -> tx_core -> nic model, with a packet sample core on the side. There are rings between the worker cores and this pcap core, and the worker cores do an rte_pktmbuf_clone when they want to sample an mbuf. The pcap core is responsible for sending these mbufs out of kni interfaces.

Here are some numbers:
number of nics: 2
nic rx queue: 1024

number of rx cores: 2
rx rte_eth_rx_burst size: 64

number of rings between rx_core & worker_core: 20
ring size: 1024

number of worker cores: 10
worker_core rte_ring_sc_dequeue_burst size: 128

number of rings between worker_core & tx_core: 20
ring size: 512

number of tx cores: 2
tx_core rte_ring_sc_dequeue_burst size: 64

number of nics: 2
nic tx queue: 512

number of pcap cores: 1
number of rings between worker_core & pcap_core: 20
ring size: 256
pcap core rte_ring_sc_dequeue_burst: 64

per lcore cache is disabled when creating the mempool.

So adding this all up (working backwards from tx to rx) gives me:

2 full nic tx queues: 1024
2 full local dequeue buffers in the tx cores: 128
20 full rings between worker & tx: 10240
10 full local dequeue buffers in the worker cores: 1280
20 full rings between rx & worker: 20480
2 full local rx burst buffers in the rx cores: 128
2 full nic rx queues: 2048

1 full local dequeue buffer in the pcap core: 64
20 full rings between the worker & pcap core: 5120

Adding this all up gets me to 40512 elements. My mempool is created with ((1 << 16) - 1) or 65535 elements. I only have one mempool and the application is restricted to a single numa node.

My mempool element size is set to something like 2K + headroom and we don't have jumbo packets on the network, so each mbuf should only require one element from the mempool.

What I'm seeing is that rte_pktmbuf_clone on the worker cores sometimes returns NULL, and if I print rte_mempool_count immediately afterwards, I get numbers in the single or double digits.

Where are the rest of the free elements? According to my math I should have on the order of 25K free elements in the mempool. What am I missing? What is the recommended usage of a mempool?

- Dennis

