[dpdk-dev] [PATCH v2] kni: add new mbuf in alloc_q only based on its empty slots
Ferruh Yigit
ferruh.yigit at intel.com
Wed May 31 18:21:19 CEST 2017
Hi Gowrishankar,
Sorry for late response.
On 5/18/2017 6:45 PM, gowrishankar muthukrishnan wrote:
> On Tuesday 16 May 2017 10:45 PM, Ferruh Yigit wrote:
>> On 5/11/2017 12:51 PM, Gowrishankar wrote:
>>> From: Gowrishankar Muthukrishnan <gowrishankar.m at linux.vnet.ibm.com>
>>>
>>> In kni_allocate_mbufs(), we always attempt to add a full burst
>>> (max_burst, 32) of mbufs to alloc_q, which leads to excessive
>>> rte_pktmbuf_free() calls when alloc_q is contended at a high packet
>>> rate (e.g. 10Gig data). In a situation where the alloc_q fifo can
>>> accommodate only a few (or zero) mbufs, create only as many as needed
>>> and add them to the fifo.
>> I remember I have tried similar, also tried allocating amount of
>> nb_packets read from kernel, both produced worse performance.
>> Can you please share your before/after performance numbers?
> Sure Ferruh, please find below comparison of call counts I set at two places
> along with additional stat on kni egress for more than one packet in txq
> burst read,
> as in pseudo code below:
>
> @@ -589,8 +592,12 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf **mbufs, unsigned num)
>     unsigned ret = kni_fifo_get(kni->tx_q, (void **)mbufs, num);
>
>     /* If buffers removed, allocate mbufs and then put them into alloc_q */
>     if (ret) {
>         ++alloc_call;
>         if (ret > 1)
>             alloc_call_mt1tx += ret;
>         kni_allocate_mbufs(kni);
>     }
>
>     return ret;
> }
> @@ -659,6 +666,7 @@ kni_allocate_mbufs(struct rte_kni *kni)
>     if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
>         int j;
>
>         freembuf_call += (i - ret);
>         for (j = ret; j < i; j++)
>             rte_pktmbuf_free(pkts[j]);
>
>
>> kni_allocate_mbufs() is called within rte_kni_rx_burst() if any packet
>> is received from the kernel. Under heavy traffic, the kernel will
>> always consume alloc_q before this function is called, and this
>> function will fill it back. So there shouldn't be many cases where the
>> alloc_q fifo is already full.
>> Perhaps this can happen if the application bursts Rx from the kernel in
>> amounts smaller than 32 while the fifo is refilled with a fixed 32
>> mbufs; is this your case?
>
> I think this case bears some resemblance, based on the stats below. W/o
> the patch, the application would spend most of its processing time
> freeing mbufs, right?
>
>>
>> Can you measure the number of times rte_pktmbuf_free() is called
>> because alloc_q is full?
>
> I have sampled the data below on x86_64 for KNI on the ixgbe PMD. The
> iperf server runs on a remote interface connected to the PMD and the
> iperf client runs on the KNI interface, so as to create more egress
> from KNI into DPDK (w/o and with this patch), for 1MB and 100MB of
> data. rx and tx stats are from the kni app (USR1).
>
> 100MB w/o patch: 1.28Gbps
> rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
> 3933  72464  51042       42472             1560540
Some math:
alloc was called 51042 times, allocating 32 mbufs each time:
51042 * 32 = 1633344
freed mbufs: 1560540
used mbufs: 1633344 - 1560540 = 72804
72804 =~ 72464, so the accounting looks correct.
Which means rte_kni_rx_burst() was called 51042 times and 72464 buffers
were received.
As you already mentioned, on each call the kernel is able to put only
1-2 packets into the fifo. This number is close to 3 in my test with
the KNI PMD.
And for this case, I agree your patch looks reasonable.
But what if kni has more egress traffic, enough to put >= 32 packets
into the fifo between each rte_kni_rx_burst()?
For that case this patch introduces the extra cost of computing the
allocq_free count.
Overall I don't disagree with the patch, but I am concerned it could
cause a performance loss in some cases while improving this one. It
would help a lot if KNI users could test and comment.
For me, applying the patch didn't make any difference in the final
performance numbers, but if there is no objection, I am OK to take this
patch.
>
> 1MB w/o patch: 204Mbps
> rx  tx   alloc_call  alloc_call_mt1tx  freembuf_call
> 84  734  566         330               17378
>
> 100MB w/ patch: 1.23Gbps
> rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
> 4258  72466  72466       0                 0
>
> 1MB w/ patch: 203Mbps
> rx  tx   alloc_call  alloc_call_mt1tx  freembuf_call
> 76  734  733         2                 0
>
> With the patch, KNI egress on the txq is almost always one packet at a
> time (in the 1MB test, more than 2 packets were rarely seen even though
> it is a burst read). Also, since one mbuf is consumed by the module and
> added by the lib at a time, rte_pktmbuf_free() is not called at all,
> because the right number (1 or 2) of mbufs is enqueued in alloc_q.
>
> This controlled enqueue on alloc_q avoids the network stall seen with
> i40e on ppc64le. Could you please check whether i40e is able to handle
> data on the order of 10Gig on your arch? I see the network stall at
> some random point w/o this patch.
>
> Thanks,
> Gowrishankar
>
>>> With this patch, we could stop the random network stall in KNI at
>>> higher packet rates (e.g. 1G or 10G data between vEth0 and the PMD)
>>> that sufficiently exhaust alloc_q under the above condition. I tested
>>> the i40e PMD for this purpose on ppc64le.
>> If the stall happens from NIC to kernel, that is the kernel receive
>> path, while alloc_q is in the kernel transmit path.
>>
>>> Changes:
>>> v2 - alloc_q free count calculation corrected.
>>> line wrap fixed for commit message.
>>>
>>> Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m at linux.vnet.ibm.com>
>>> ---
>>> lib/librte_kni/rte_kni.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
>>> index c3f9208..9c5d485 100644
>>> --- a/lib/librte_kni/rte_kni.c
>>> +++ b/lib/librte_kni/rte_kni.c
>>> @@ -624,6 +624,7 @@ struct rte_kni *
>>>  	int i, ret;
>>>  	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
>>>  	void *phys[MAX_MBUF_BURST_NUM];
>>> +	int allocq_free;
>>>  
>>>  	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
>>>  			 offsetof(struct rte_kni_mbuf, pool));
>>> @@ -646,7 +647,9 @@ struct rte_kni *
>>>  		return;
>>>  	}
>>>  
>>> -	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
>>> +	allocq_free = (kni->alloc_q->read - kni->alloc_q->write - 1) \
>>> +			& (MAX_MBUF_BURST_NUM - 1);
>>> +	for (i = 0; i < allocq_free; i++) {
>>>  		pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
>>>  		if (unlikely(pkts[i] == NULL)) {
>>>  			/* Out of memory */
>>>
>
>