[dpdk-dev] [PATCH v2] kni: add new mbuf in alloc_q only based on its empty slots

gowrishankar muthukrishnan gowrishankar.m at linux.vnet.ibm.com
Thu May 18 19:45:37 CEST 2017


On Tuesday 16 May 2017 10:45 PM, Ferruh Yigit wrote:
> On 5/11/2017 12:51 PM, Gowrishankar wrote:
>> From: Gowrishankar Muthukrishnan <gowrishankar.m at linux.vnet.ibm.com>
>>
>> In kni_allocate_mbufs(), we always attempt to add max_burst (32) mbufs
>> into alloc_q, which leads to excessive rte_pktmbuf_free() calls when
>> alloc_q is contended at high packet rates (for eg 10Gig data). In a
>> situation where the alloc_q fifo can accommodate only a few (or zero)
>> mbufs, create only what is needed and add that to the fifo.
> I remember I have tried similar, also tried allocating amount of
> nb_packets read from kernel, both produced worse performance.
> Can you please share your before/after performance numbers?
Sure Ferruh, please find below a comparison of call counts I collected at two
places, along with an additional stat counting KNI egress reads of more than
one packet per txq burst, as in the pseudo code below:

    @@ -589,8 +592,12 @@ rte_kni_rx_burst(struct rte_kni *kni, struct
    rte_mbuf **mbufs, unsigned num)
            unsigned ret = kni_fifo_get(kni->tx_q, (void **)mbufs, num);

            /* If buffers removed, allocate mbufs and then put them
    into alloc_q */
            if (ret) {
                    ++alloc_call;
                    if (ret > 1)
                            alloc_call_mt1tx += ret;
                    kni_allocate_mbufs(kni);
            }

            return ret;
    }
    @@ -659,6 +666,7 @@ kni_allocate_mbufs(struct rte_kni *kni)
            if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
                    int j;

                    freembuf_call += (i - ret);
                    for (j = ret; j < i; j++)
                            rte_pktmbuf_free(pkts[j]);

> kni_allocate_mbufs() is called within rte_kni_rx_burst() if any packet is
> received from the kernel. If there is heavy traffic, the kernel will always
> consume the alloc_q before this function is called, and this function will
> fill it back. So there shouldn't be many cases where the alloc_q fifo is
> already full.
> Perhaps this can happen if the application burst Rx from the kernel in
> numbers less than 32 while the fifo is filled with a fixed 32 mbufs; is
> this your case?

I think this case somewhat resembles what the stats below show. Without the
patch, the application would spend most of its processing time freeing mbufs,
right?

>
> Can you measure number of times rte_pktmbuf_free() called because of
> alloc_q is full?

I have sampled the data below on x86_64 for KNI on the ixgbe PMD. The iperf
server runs on a remote interface connected to the PMD, and the iperf client
runs on the KNI interface, so as to create more egress from KNI into DPDK
(without and with this patch), for 1MB and 100MB of data. rx and tx stats are
from the kni app (USR1).

100MB w/o patch, 1.28Gbps:
rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
3933  72464  51042       42472             1560540

1MB w/o patch, 204Mbps:
rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
84    734    566         330               17378

100MB w/ patch, 1.23Gbps:
rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
4258  72466  72466       0                 0

1MB w/ patch, 203Mbps:
rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
76    734    733         2                 0

With the patch, KNI egress on the txq is almost always one packet at a time
(in the 1MB test, only a rare instance of two or more packets was seen, even
though it is a burst read). Also, since the kernel module consumes one mbuf at
a time and the library adds one at a time, rte_pktmbuf_free() is not called at
all, because the right amount (1 or 2) of mbufs is enqueued in alloc_q.

This controlled enqueue into alloc_q avoids the network stall for i40e on
ppc64le. Could you please check whether i40e is able to handle data on the
order of 10Gig on your arch? Without this patch, I see the network stall at
some random point.

Thanks,
Gowrishankar

>> With this patch, we could stop the random network stall in KNI at higher
>> packet rates (eg 1G or 10G data between vEth0 and PMD) that sufficiently
>> exhaust alloc_q under the above condition. I tested the i40e PMD for this
>> purpose on ppc64le.
> If stall happens from NIC to kernel, this is kernel receive path, and
> alloc_q is in kernel transmit path.
>
>> Changes:
>>   v2 - alloc_q free count calculation corrected.
>>        line wrap fixed for commit message.
>>
>> Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m at linux.vnet.ibm.com>
>> ---
>>   lib/librte_kni/rte_kni.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
>> index c3f9208..9c5d485 100644
>> --- a/lib/librte_kni/rte_kni.c
>> +++ b/lib/librte_kni/rte_kni.c
>> @@ -624,6 +624,7 @@ struct rte_kni *
>>   	int i, ret;
>>   	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
>>   	void *phys[MAX_MBUF_BURST_NUM];
>> +	int allocq_free;
>>   
>>   	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
>>   			 offsetof(struct rte_kni_mbuf, pool));
>> @@ -646,7 +647,9 @@ struct rte_kni *
>>   		return;
>>   	}
>>   
>> -	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
>> +	allocq_free = (kni->alloc_q->read - kni->alloc_q->write - 1) \
>> +			& (MAX_MBUF_BURST_NUM - 1);
>> +	for (i = 0; i < allocq_free; i++) {
>>   		pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
>>   		if (unlikely(pkts[i] == NULL)) {
>>   			/* Out of memory */
>>
