[dpdk-dev] [PATCH 3/3] app/testpmd: fix callback issue for hot-unplug

Jeff Guo jia.guo at intel.com
Fri Nov 9 07:17:07 CET 2018


On 11/9/2018 1:24 PM, Matan Azrad wrote:
>
>   From: Jeff Guo
>> On 11/8/2018 5:35 PM, Matan Azrad wrote:
>>> From: Jeff Guo
>>>> On 11/8/2018 3:28 PM, Matan Azrad wrote:
>>>>> From: Ananyev, Konstantin
>>>>>>> -----Original Message-----
>>>>>>> From: Guo, Jia
>>>>>>> Sent: Wednesday, November 7, 2018 7:30 AM
>>>>>>> To: Matan Azrad <matan at mellanox.com>; Ananyev, Konstantin
>>>>>>> <konstantin.ananyev at intel.com>; Burakov, Anatoly
>>>>>>> <anatoly.burakov at intel.com>; Thomas Monjalon
>>>>>> <thomas at monjalon.net>;
>>>>>>> Iremonger, Bernard <bernard.iremonger at intel.com>; Wu, Jingjing
>>>>>>> <jingjing.wu at intel.com>; Lu, Wenzhuo <wenzhuo.lu at intel.com>
>>>>>>> Cc: Yigit, Ferruh <ferruh.yigit at intel.com>; dev at dpdk.org; Zhang,
>>>>>>> Helin <helin.zhang at intel.com>; He, Shaopeng
>>>> <shaopeng.he at intel.com>
>>>>>>> Subject: Re: [PATCH 3/3] app/testpmd: fix callback issue for
>>>>>>> hot-unplug
>>>>>>>
>>>>>>> matan
>>>>>>>
>>>>>>> On 11/6/2018 2:36 PM, Matan Azrad wrote:
>>>>>>>> Hi Jeff
>>>>>>>>
>>>>>>>>      From: Jeff Guo <jia.guo at intel.com>
>>>>>>>>> Before detach device when device be hot-unplugged, the failure
>>>>>>>>> process in user space and kernel space both need to be finished,
>>>>>>>>> such as eal interrupt callback need to be inactive before the
>>>>>>>>> callback be unregistered when device is being cleaned. This
>>>>>>>>> patch add rte alarm for device detaching, with that it could
>>>>>>>>> finish interrupt callback soon and give time to let the failure
>>>>>>>>> process done
>>>>>> before detaching.
>>>>>>>>> Fixes: 2049c5113fe8 ("app/testpmd: use hotplug failure handler")
>>>>>>>>> Signed-off-by: Jeff Guo <jia.guo at intel.com>
>>>>>>>>> ---
>>>>>>>>>      app/test-pmd/testpmd.c | 13 ++++++++++++-
>>>>>>>>>      1 file changed, 12 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>>>>>>>>> index 9c0edca..9c673cf 100644
>>>>>>>>> --- a/app/test-pmd/testpmd.c
>>>>>>>>> +++ b/app/test-pmd/testpmd.c
>>>>>>>>> @@ -2620,7 +2620,18 @@ eth_dev_event_callback(const char
>>>>>>>>> *device_name, enum rte_dev_event_type type,
>>>>>>>>>      				device_name);
>>>>>>>>>      			return;
>>>>>>>>>      		}
>>>>>>>>> -		rmv_event_callback((void *)(intptr_t)port_id);
>>>>>>>>> +		/*
>>>>>>>>> +		 * Before detach device, the hot-unplug failure
>>>> process in
>>>>>>>>> +		 * user space and kernel space both need to be
>>>> finished,
>>>>>>>>> +		 * such as eal interrupt callback need to be inactive
>>>> before
>>>>>>>>> +		 * the callback be unregistered when device is being
>>>> cleaned.
>>>>>>>>> +		 * So finished interrupt callback soon here and give
>>>> time to
>>>>>>>>> +		 * let the work done before detaching.
>>>>>>>>> +		 */
>>>>>>>>> +		if (rte_eal_alarm_set(100000,
>>>>>>>>> +				rmv_event_callback, (void
>>>>>>>>> *)(intptr_t)port_id))
>>>>>>>>> +			RTE_LOG(ERR, EAL,
>>>>>>>>> +				"Could not set up deferred device
>>>>>>>> It looks me strange to use callback and alarm to remove a device:
>>>>>>>> Why not to use callback and that is it?
>>>>>>>>
>>>>>>>> I think that it's better to let the EAL to detach the device
>>>>>>>> after all the
>>>>>> callbacks were done and not to do it by the user callback.
>>>>>>>> So the application\callback owners just need to clean its
>>>>>>>> resources with understanding that after the callback the
>>>>>>>> device(and the callback
>>>>>>> itself) will be detached by the EAL.
>>>>>>>
>>>>>>>
>>>>>>> Firstly, at the currently framework and solution, such as callback
>>>>>>> for RTE_ETH_EVENT_INTR_RMV, still need to use the deferred device
>>>>>> removal,
>>>>>>> we tend to give the control of detaching device to the
>>>>>>> application, and the whole process is located on the user's
>>>>>>> callback. Notify app to detach device by callback but make it deferred,
>> i think it is fine.
>>>>> But the device must be detached in remove event, why not to do it in
>> EAL?
>>>> I think it because of before detached the device, application should
>>>> be stop the forwarding at first, then stop the device, then close
>>>>
>>>> the device, finally call eal unplug API to detach device. If eal
>>>> directly detach device at the first step, there will be mountain user
>>>> space error need to handle, so that is one reason why need to
>>>> provider the remove notification to app, and let app to process it.
>>> This is why the EAL need to detach the device only after all the user
>> callbacks were done.
>>
>>
>> If i correctly got your meaning, you suppose to let eal to mandatory detach
>> device but not app, app just need to stop/close port, right?
> Yes, the app should stop,close,clean its own resources of the removed device,
> Then, EAL to detach the device.
>
>> If so, it will need to modify rmv_event_callback by not invoke the detaching
>> func and add some detaching logic to hotplug framework in eal.
>>
> rmv_event_callback is using by other hotplug mechanism (ETHDEV RMV event), so you need to use another one\ add parameter to it.
> And yes, you need to detach the device from EAL, should be simple.


I think rmv_event_callback is original use for other hotplug event 
(ETHDEV RMV event), but it still use the common hotplug mechanism(app 
callback and app detach),

so i think it will still need to face this callback issue and you could 
check that eth_event_callback also use the rte alarm to detach device.

so my suggestion is that, you maybe propose a good idea but let we keep 
on current mechanism until we come to a final good solution agreement, 
before that, just let

it functional.


>> It is hardly say better or worse but this new propose need to discuss more in
>> public. And it might be better to use another specific patch to handler it.
>> What do you or other guys think so?
> Since you are fixing issue here, it can be done by a fix series.
>
> Other feedbacks are welcome all the time 😊
>
>>
>>>>>> It is also unclear to me my we need an alarm here.
>>>>>> First (probably wrong) impression we just try to hide some
>>>>>> synchronization Problem by introducing delay.
>>>>> Looks like, the issue is that the callback function memory will be
>>>>> removed
>>>> from the function itself (by the detach call), no?
>>>>
>>>>
>>>> Answer here for both Konstantin and Matan.
>>>>
>>>> Yes, i think matan is right, the interrupt callback will be destroy
>>>> in the app callback itself, the sequence is that as below
>>>>
>>>> hot-unplug interrupt -> interrupt callback -> app callback(return to
>>>> finish interrupt callback, delay detaching) -> detach
>>>> device(unregister interrupt
>>>> callback)
>>>>
>>>>
>>>>>> Konstantin
>>>>>>
>>>>>>> Secondly, the vfio is different with igb_uio for hot-unplug, it
>>>>>>> register/unregister hotplug interrupt callback for each device, so
>>>>>>> need to make  the callback done before unregister the callback.
>>>>>>>
>>>>>>> So I think it should be considerate as an workaround here, before
>>>>>>> we find a better way.
>>>>>>>
>>>>>>>
>>>>>>>>> removal\n");
>>>>>>>>>      		break;
>>>>>>>>>      	case RTE_DEV_EVENT_ADD:
>>>>>>>>>      		RTE_LOG(ERR, EAL, "The device: %s has been
>>>> added!\n",
>>>>>>>>> --
>>>>>>>>> 2.7.4


More information about the dev mailing list