[dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists

Ivan Boule ivan.boule at 6wind.com
Wed Jun 15 17:33:18 CEST 2016


On 06/15/2016 04:27 PM, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: Richardson, Bruce
>> Sent: Wednesday, June 15, 2016 3:22 PM
>> To: Ananyev, Konstantin
>> Cc: Ivan Boule; Thomas Monjalon; Pattan, Reshma; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>
>> On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
>>>> Sent: Wednesday, June 15, 2016 3:07 PM
>>>> To: Richardson, Bruce; Ananyev, Konstantin
>>>> Cc: Thomas Monjalon; Pattan, Reshma; dev at dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>>>
>>>> On 06/15/2016 03:29 PM, Bruce Richardson wrote:
>>>>> On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
>>>>>> Hi Ivan,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
>>>>>>> Sent: Wednesday, June 15, 2016 1:15 PM
>>>>>>> To: Thomas Monjalon; Ananyev, Konstantin
>>>>>>> Cc: Pattan, Reshma; dev at dpdk.org
>>>>>>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
>>>>>>>
>>>>>>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
>>>>>>>
>>>>>>>>>
>>>>>>>>>> I think the read access would need locking but we do not want it
>>>>>>>>>> in fast path.
>>>>>>>>>
>>>>>>>>> I don't think it would be needed.
>>>>>>>>> As I said - read/write interaction didn't change from what we have right now.
>>>>>>>>> But if you have some particular scenario in mind that you believe would cause
>>>>>>>>> a race condition - please speak up.
>>>>>>>>
>>>>>>>> If we add/remove a callback during a burst? Is it possible that the next
>>>>>>>> pointer would have a wrong value leading to a crash?
>>>>>>>> Maybe we need a comment to state that we should not alter burst
>>>>>>>> callbacks while running burst functions.
>>>>>>>>
>>>>>>>
>>>>>>> Hi Reshma,
>>>>>>>
>>>>>>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
>>>>>>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
>>>>>>> of RX callbacks associated with the polled RX queue to safely remove RX
>>>>>>> callback(s) in parallel.
>>>>>>> The problem is not [only] with the setting and the loading of "cb->next"
>>>>>>> that you assume to be atomic operations, which is certainly true on most
>>>>>>> CPUs.
>>>>>>> I see the 2 important following issues:
>>>>>>>
>>>>>>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
>>>>>>> RX callback could still be accessed in the callback parsing loop of the
>>>>>>> function "rte_eth_rx_burst()" after having been freed in parallel.
>>>>>>>
>>>>>>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
>>>>>>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
>>>>>>> does not free the "rte_eth_rxtx_callback" data structure associated with
>>>>>>> the removed callback !
>>>>>>
>>>>>> Yes, though it is documented behaviour, someone can probably
>>>>>> refer it as a feature, not a bug ;)
>>>>>>
>>>>>
>>>>> +1
>>>>> This is definitely not a bug, this is absolutely by design. One may argue with
>>>>> the design, but it was done for a definite reason, so as to avoid paying the
>>>>> penalty of having locks. It pushes more responsibility onto the app, but it
>>>>> does allow the app to choose the best solution for managing the freeing of
>>>>> memory for its situation. The alternative is to force all apps to pay the cost
>>>>> of having locks, even if better options for freeing the memory are available.
>>>>>
>>>>> /Bruce
>>>>>
>>>>
>>>> -1 (not to say 0xFFFFFFFF)
>>>>
>>>> This is definitely an API design bug !
>>>> I would say that if you don't know how to free a resource that you
>>>> allocate, it is very likely that you are wrong allocating it.
>>>> And this is exactly what happens here with RX/TX callback data structures.
>>>> This problem can easily be addressed by just changing the API as follows:
>>>>
>>>> Change
>>>>       void *
>>>>       rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>>>>                               rte_rx_callback_fn fn, void *user_param)
>>>>
>>>> to
>>>>       int
>>>>       rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>>>>                               struct rte_eth_rxtx_callback *cb)
>>>>
>>>> In addition of solving the problem, this approach makes the API
>>>> consistent and let the application allocate "rte_eth_rxtx_callback" data
>>>> structures through any appropriate mean.
>>>
>>> That might make API a bit cleaner, but I don't see how it fixes the generic problem:
>>> there is still no way to know by the upper layer when it is safe to free/re-use
>>> removed callback, but to make sure that all IO on that queue is stopped
>>> (I.E. some external synchronisation around the queue).
>>
>> Actually, it allows other, more creative solutions, like an app having a
>> statically allocated set/pool of callback structures that it doesn't need to
>> allocate or free at all.
>
> I understand that, and as I said I am not against that change.
> But it doesn't solve the problem in general.
> Ok, if you have a static object, you wouldn't need to call free() for it,
> but you still want to re-use it after remove_cb() has finished, right?
> So there is no actual difference - the main question is:
> at what point after remove_cb() it is safe to modify it's contents.
> Konstantin
>
>>
>> /Bruce
>
Hi Bruce/Konstantin

There is no need to use locks to ensure that a RX callback being removed 
is not executed/invoked and will never be invoked again.
This issue can be easily addressed at application level through the 
common synchronization mechanism that consists in having the control 
thread of the application that manages the adding/removal of RX 
callbacks to:
1) Ask the packet processing function of the application to stop 
invoking the function rte_eth_rx_burst() on the target RX queue.
2) Once the packet processing function acked it stopped polling the 
target RX queue, safely remove the RX callback and free whatever 
resource needs to be freed.
3) Enable the packet processing function of the application to invoke 
again the function rte_eth_rx_burst() on the target RX queue.

Enjoy :-)

Ivan



-- 
Ivan Boule
6WIND Development Engineer


More information about the dev mailing list