[dpdk-dev] [PATCH] ixgbe: prefetch packet headers in vector PMD receive function

Zoltan Kiss zoltan.kiss at linaro.org
Thu Oct 15 12:32:44 CEST 2015



On 15/10/15 00:23, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>> Sent: Wednesday, October 14, 2015 5:11 PM
>> To: Ananyev, Konstantin; Richardson, Bruce; dev at dpdk.org
>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>
>>
>>
>> On 28/09/15 00:19, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>>>> Sent: Friday, September 25, 2015 7:29 PM
>>>> To: Richardson, Bruce; dev at dpdk.org
>>>> Cc: Ananyev, Konstantin
>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
>>>>
>>>> On 07/09/15 07:41, Richardson, Bruce wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>>>>>> Sent: Monday, September 7, 2015 3:15 PM
>>>>>> To: Richardson, Bruce; dev at dpdk.org
>>>>>> Cc: Ananyev, Konstantin
>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
>>>>>> function
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 07/09/15 13:57, Richardson, Bruce wrote:
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>>>>>>>> Sent: Monday, September 7, 2015 1:26 PM
>>>>>>>> To: dev at dpdk.org
>>>>>>>> Cc: Ananyev, Konstantin; Richardson, Bruce
>>>>>>>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD
>>>>>>>> receive function
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I just realized I've missed the "[PATCH]" tag from the subject. Did
>>>>>>>> anyone had time to review this?
>>>>>>>>
>>>>>>> Hi Zoltan,
>>>>>>>
>>>>>>> the big thing that concerns me with this is the addition of new
>>>>>>> instructions for each packet in the fast path. Ideally, this
>>>>>>> prefetching would be better handled in the application itself, as for
>>>>>>> some apps, e.g. those using pipelining, the core doing the RX from the
>>>>>>> NIC may not touch the packet data at all, and the prefetches will
>>>>>> instead cause a performance slowdown.
>>>>>>> Is it possible to get the same performance increase - or something
>>>>>>> close to it - by making changes in OVS?
>>>>>> OVS already does a prefetch when it's processing the previous packet, but
>>>>>> apparently it's not early enough. At least for my test scenario, where I'm
>>>>>> forwarding UDP packets with the least possible overhead. I guess in tests
>>>>>> where OVS does more complex processing it should be fine.
>>>>>> I'll try to move the prefetch earlier in OVS codebase, but I'm not sure if
>>>>>> it'll help.
>>>>> I would suggest trying to prefetch more than one packet ahead. Prefetching 4 or
>>>>> 8 ahead might work better, depending on the processing being done.
>>>>
>>>> I've moved the prefetch earlier, and it seems to work:
>>>>
>>>> https://patchwork.ozlabs.org/patch/519017/
>>>>
>>>> However it raises the question: should we remove header prefetch from
>>>> all the other drivers, and the CONFIG_RTE_PMD_PACKET_PREFETCH option?
>>>
>>> My vote would be for that.
>>> Konstantin
>>
>> After some further thinking I would rather support the
>> rte_packet_prefetch() macro (prefetch the header in the driver, and
>> configure it through CONFIG_RTE_PMD_PACKET_PREFETCH)
>>
>> - the prefetch can happen earlier, so if an application needs the header
>> right away, this is the fastest
>> - if the application doesn't need header prefetch, it can turn it off
>> compile time. Same as if we wouldn't have this option.
>> - if the application has mixed needs (sometimes it needs the header
>> right away, sometimes it doesn't), it can turn it off and do what it
>> needs. Same as if we wouldn't have this option.
>>
>> A harder decision would be whether it should be turned on or off by
>> default. Currently it's on, and half of the receive functions don't use it.
>
> Yep,  it is sort of a mess right now.
> Another question if we'd like to keep it and standardise it:
> at what moment to prefetch: as soon as we realize that HW is done with that buffer,
> or as late inside rx_burst() as possible?
> I suppose there is no clear answer for that.
I think if the application needs the driver start the prefetching, it 
does it because it's already too late when rte_eth_rx_burst() returns. 
So I think it's best to do it as soon as we learn that the HW is done.

> That's why my thought was to just get rid of it.
> Though if it would be implemented in some standardized way and disabled by default -
> that's probably ok too.
> BTW, couldn't that be implemented just via rte_ethdev rx callbacks mechanism?
> Then we can have the code one place and don't need compile option at all -
> could be ebabled/disabled dynamically on a per nic basis.
> Or would it be too high overhead for that?

I think if we go that way, it's better to pass an option to 
rte_eth_rx_burst() and tell if you want the header prefetched by the 
driver. That would be the most flexible.

> Konstantin
>
>
>
>>
>>>
>>>
>>>>
>>>>>
>>>>> /Bruce
>>>


More information about the dev mailing list