[dpdk-dev] [PATCH v2] kni: add new mbuf in alloc_q only based on its empty slots

Ferruh Yigit ferruh.yigit at intel.com
Wed May 31 18:21:19 CEST 2017


Hi Gowrishankar,

Sorry for the late response.

On 5/18/2017 6:45 PM, gowrishankar muthukrishnan wrote:
> On Tuesday 16 May 2017 10:45 PM, Ferruh Yigit wrote:
>> On 5/11/2017 12:51 PM, Gowrishankar wrote:
>>> From: Gowrishankar Muthukrishnan <gowrishankar.m at linux.vnet.ibm.com>
>>>
>>> In kni_allocate_mbufs(), we always attempt to add max_burst (32) mbufs
>>> into alloc_q, which leads to excessive rte_pktmbuf_free() calls when
>>> alloc_q is contended at a high packet rate (e.g. 10Gig data).
>>> In a situation where the alloc_q fifo can only accommodate very few
>>> (or zero) mbufs, create only what is needed and add it to the fifo.
>> I remember trying something similar; I also tried allocating only the
>> nb_packets amount read from the kernel. Both produced worse performance.
>> Can you please share your before/after performance numbers?
> Sure Ferruh, please find below a comparison of the call counts I collected
> at two places, along with an additional stat counting KNI egress bursts
> that read more than one packet from the txq, as in the pseudo code below:
> 
>     @@ -589,8 +592,12 @@ rte_kni_rx_burst(struct rte_kni *kni,
>                            struct rte_mbuf **mbufs, unsigned num)
>             unsigned ret = kni_fifo_get(kni->tx_q, (void **)mbufs, num);
>
>             /* If buffers removed, allocate mbufs and put them into alloc_q */
>             if (ret) {
>                     /* count every allocation call */
>                     ++alloc_call;
>                     /* count bursts reading more than one packet from tx_q */
>                     if (ret > 1)
>                             alloc_call_mt1tx += ret;
>                     kni_allocate_mbufs(kni);
>             }
>
>             return ret;
>     }
>
>     @@ -659,6 +666,7 @@ kni_allocate_mbufs(struct rte_kni *kni)
>             if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
>                     int j;
>
>                     /* count mbufs freed because alloc_q could not take them */
>                     freembuf_call += (i - ret);
>                     for (j = ret; j < i; j++)
>                             rte_pktmbuf_free(pkts[j]);
> 
> 
> 
>> kni_allocate_mbufs() is called within rte_kni_rx_burst() if any packet
>> is received from the kernel. If there is heavy traffic, the kernel will
>> always consume the alloc_q before this function is called, and this
>> function will fill it back. So there shouldn't be many cases where the
>> alloc_q fifo is already full.
>> Perhaps this can happen if the application bursts Rx from the kernel in
>> a number less than 32 while the fifo is filled with a fixed 32 mbufs;
>> is this your case?
> 
> I think there is some resemblance to this case based on the stats below.
> W/o the patch, the application would spend most of its processing time
> freeing mbufs, right?
> 
>>
>> Can you measure the number of times rte_pktmbuf_free() is called
>> because alloc_q is full?
> 
> I have sampled the data below on x86_64 for KNI on the ixgbe PMD. The
> iperf server runs on the remote interface connected to the PMD and the
> iperf client runs on the KNI interface, so as to create more egress from
> KNI into DPDK (w/o and with this patch) for 1MB and 100MB of data. The
> rx and tx stats are from the kni app (USR1).
> 
> 100MB w/o patch, 1.28Gbps
> rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
> 3933  72464  51042       42472             1560540

Some math:

alloc is called 51042 times, allocating 32 mbufs each time:
51042 * 32 = 1633344

freed mbufs: 1560540

used mbufs: 1633344 - 1560540 = 72804

72804 =~ 72464, so the accounting looks correct.
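
The same check in a few lines of C, as a quick sanity test (counter values
taken from the table above; variable names are just for illustration):

    #include <stdio.h>

    int
    main(void)
    {
            unsigned long alloc_calls = 51042;   /* kni_allocate_mbufs() invocations */
            unsigned long burst       = 32;      /* mbufs allocated per call, pre-patch */
            unsigned long freed       = 1560540; /* freembuf_call counter */
            unsigned long tx_reported = 72464;   /* tx count from the stats */

            unsigned long allocated = alloc_calls * burst;  /* 1633344 */
            unsigned long used      = allocated - freed;    /* 72804 =~ tx_reported */

            printf("allocated=%lu used=%lu tx=%lu\n", allocated, used, tx_reported);
            return 0;
    }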

This means rte_kni_rx_burst() was called 51042 times and 72464 buffers
were received.

As you already mentioned, on each call the kernel is able to put only 1-2
packets into the fifo. This number is close to 3 in my test with the KNI
PMD.

And for this case, I agree your patch looks reasonable.

But what if KNI has more egress traffic, enough to put >= 32 packets into
the fifo between each rte_kni_rx_burst() call?
For that case this patch introduces the extra cost of computing the
allocq_free count.
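
To illustrate where that cost comes from, here is a minimal sketch of how
such a lockless power-of-two ring fifo reports its free slots. This is
illustrative C only, not the actual rte_kni_fifo definition; the struct
and helper names are assumptions:

    #include <stdio.h>

    #define MAX_MBUF_BURST_NUM 32

    /* Simplified stand-in for the fifo shared with the kernel module. */
    struct kni_fifo_sketch {
            volatile unsigned read;   /* consumer index, advanced by the kernel */
            volatile unsigned write;  /* producer index, advanced by userspace */
            unsigned len;             /* ring size, a power of two */
    };

    /* Free slots, i.e. how many mbufs can still be enqueued. One slot is
     * kept empty so that read == write unambiguously means "empty". */
    static unsigned
    fifo_free_count(const struct kni_fifo_sketch *f)
    {
            return (f->read - f->write - 1) & (f->len - 1);
    }

    int
    main(void)
    {
            struct kni_fifo_sketch alloc_q = { .read = 5, .write = 2, .len = 1024 };

            /* True free count of the ring. */
            unsigned free_slots = fifo_free_count(&alloc_q);

            /* What the patch computes: the same index difference, masked by
             * the burst size, so at most MAX_MBUF_BURST_NUM - 1 mbufs get
             * allocated per call. The load of alloc_q.read, an index the
             * kernel side keeps updating, is the extra cost mentioned above. */
            unsigned allocq_free = (alloc_q.read - alloc_q.write - 1)
                            & (MAX_MBUF_BURST_NUM - 1);

            printf("free slots: %u, mbufs to allocate: %u\n",
                   free_slots, allocq_free);
            return 0;
    }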

Overall I don't disagree with the patch, but I am concerned that it could
cause a performance loss in some cases while improving this one. It would
help a lot if KNI users could test and comment.

For me, applying the patch didn't make any difference in the final
performance numbers, but if there is no objection, I am OK to take this
patch.


> 
> 1MB w/o patch, 204Mbps
> rx  tx   alloc_call  alloc_call_mt1tx  freembuf_call
> 84  734  566         330               17378
> 
> 100MB w/ patch, 1.23Gbps
> rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
> 4258  72466  72466       0                 0
> 
> 1MB w/ patch, 203Mbps
> rx  tx   alloc_call  alloc_call_mt1tx  freembuf_call
> 76  734  733         2                 0
> 
> With the patch, KNI egress on the txq seems to be almost always one
> packet at a time (and in the 1MB test, more than 2 packets were rarely
> seen even though it is a burst read). Also, as one mbuf is consumed by
> the module and added by the lib at a time, rte_pktmbuf_free is not
> called at all, because the right amount (1 or 2) of mbufs is enqueued
> in alloc_q.
> 
> This controlled enqueue on alloc_q avoids the network stall for i40e on
> ppc64le. Could you please check whether i40e is able to handle data on
> the order of 10Gig on your arch, as I see the network stall at some
> random point w/o this patch.
> 
> Thanks,
> Gowrishankar
> 
>>> With this patch, we could stop the random network stall in KNI at higher
>>> packet rates (e.g. 1G or 10G of data between vEth0 and the PMD) that
>>> sufficiently exhaust alloc_q under the above condition. I tested the i40e
>>> PMD for this purpose on ppc64le.
>> If the stall happens from the NIC to the kernel, that is the kernel
>> receive path, whereas alloc_q is in the kernel transmit path.
>>
>>> Changes:
>>>  v2 - alloc_q free count calculation corrected.
>>>       line wrap fixed for commit message.
>>>
>>> Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m at linux.vnet.ibm.com>
>>> ---
>>>  lib/librte_kni/rte_kni.c | 5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
>>> index c3f9208..9c5d485 100644
>>> --- a/lib/librte_kni/rte_kni.c
>>> +++ b/lib/librte_kni/rte_kni.c
>>> @@ -624,6 +624,7 @@ struct rte_kni *
>>>  	int i, ret;
>>>  	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
>>>  	void *phys[MAX_MBUF_BURST_NUM];
>>> +	int allocq_free;
>>>  
>>>  	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
>>>  			 offsetof(struct rte_kni_mbuf, pool));
>>> @@ -646,7 +647,9 @@ struct rte_kni *
>>>  		return;
>>>  	}
>>>  
>>> -	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
>>> +	allocq_free = (kni->alloc_q->read - kni->alloc_q->write - 1) \
>>> +			& (MAX_MBUF_BURST_NUM - 1);
>>> +	for (i = 0; i < allocq_free; i++) {
>>>  		pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
>>>  		if (unlikely(pkts[i] == NULL)) {
>>>  			/* Out of memory */
>>>
> 
> 


