[dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598

Avi Kivity avi at cloudius-systems.com
Tue Aug 25 19:39:28 CEST 2015


On 08/25/2015 08:33 PM, Ananyev, Konstantin wrote:
> Hi Vlad,
>
>> -----Original Message-----
>> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>> Sent: Thursday, August 20, 2015 10:07 AM
>> To: Ananyev, Konstantin; Lu, Wenzhuo
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh above 1 for all NICs but 82598
>>
>>
>>
>> On 08/20/15 12:05, Vlad Zolotarov wrote:
>>>
>>> On 08/20/15 11:56, Vlad Zolotarov wrote:
>>>>
>>>> On 08/20/15 11:41, Ananyev, Konstantin wrote:
>>>>> Hi Vlad,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>>>>>> Sent: Wednesday, August 19, 2015 11:03 AM
>>>>>> To: Ananyev, Konstantin; Lu, Wenzhuo
>>>>>> Cc: dev at dpdk.org
>>>>>> Subject: Re: [dpdk-dev] [PATCH v1] ixgbe_pmd: forbid tx_rs_thresh
>>>>>> above 1 for all NICs but 82598
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 08/19/15 10:43, Ananyev, Konstantin wrote:
>>>>>>> Hi Vlad,
>>>>>>> Sorry for delay with review, I am OOO till next week.
>>>>>>> Meanwhile, few questions/comments from me.
>>>>>> Hi, Konstantin, long time no see... ;)
>>>>>>
>>>>>>>>>>>> This patch fixes the Tx hang we were constantly hitting with a
>>>>>>>> seastar-based
>>>>>>>>>>>> application on x540 NIC.
>>>>>>>>>>> Could you help to share with us how to reproduce the tx hang
>>>>>>>>>>> issue,
>>>>>>>> with using
>>>>>>>>>>> typical DPDK examples?
>>>>>>>>>> Sorry, I'm not familiar enough with the typical DPDK examples to
>>>>>>>>>> help you here. However, this is largely irrelevant: without this
>>>>>>>>>> patch, the ixgbe PMD clearly violates the HW spec, as has been
>>>>>>>>>> explained above.
>>>>>>>>>>
>>>>>>>>>> We saw the issue when we stressed the xmit path with a lot of
>>>>>>>>>> highly fragmented TCP frames (packets with up to 33 fragments,
>>>>>>>>>> with non-header fragments as small as 4 bytes) and all offload
>>>>>>>>>> features enabled.
>>>>>>> Could you provide us with the pcap file to reproduce the issue?
>>>>>> Well, the thing is that it takes some time to reproduce (a few
>>>>>> minutes of heavy load), so the pcap would be quite large.
>>>>> Probably you can upload it to some place, from which we will be able
>>>>> to download it?
>>>> I'll see what I can do but no promises...
>>> On second thought, a pcap file won't help you much: in order to
>>> reproduce the issue you have to reproduce exactly the same structure of
>>> clusters I hand to the HW, and that is not what you see on the wire in
>>> the TSO case.
>> And not only in a TSO case... ;)
> I understand that, but my thought was that you could add some sort of TX callback for rte_eth_tx_burst()
> into your code that writes each packet into a pcap file, and then re-run your hang scenario.
> I know that means extra work for you, but it would be very helpful if we were able to reproduce your hang scenario:
> - if the HW guys confirm that setting the RS bit for every EOP packet is not really required,
>    then we probably have to look at what else could cause it.
> - it could be added to our validation cycle, to prevent hitting a similar problem in the future.
> Thanks
> Konstantin
>


I think if you send packets with random fragment chains of up to 32 mbufs 
you might see this.  TSO was not required to trigger the problem.


