[dpdk-dev] [PATCH 3/3] app/testpmd: fix callback issue for hot-unplug

Matan Azrad matan at mellanox.com
Thu Nov 8 10:35:16 CET 2018



From: Jeff Guo 
> On 11/8/2018 3:28 PM, Matan Azrad wrote:
> >
> > From: Ananyev, Konstantin
> >>> -----Original Message-----
> >>> From: Guo, Jia
> >>> Sent: Wednesday, November 7, 2018 7:30 AM
> >>> To: Matan Azrad <matan at mellanox.com>; Ananyev, Konstantin
> >>> <konstantin.ananyev at intel.com>; Burakov, Anatoly
> >>> <anatoly.burakov at intel.com>; Thomas Monjalon
> >> <thomas at monjalon.net>;
> >>> Iremonger, Bernard <bernard.iremonger at intel.com>; Wu, Jingjing
> >>> <jingjing.wu at intel.com>; Lu, Wenzhuo <wenzhuo.lu at intel.com>
> >>> Cc: Yigit, Ferruh <ferruh.yigit at intel.com>; dev at dpdk.org; Zhang,
> >>> Helin <helin.zhang at intel.com>; He, Shaopeng
> <shaopeng.he at intel.com>
> >>> Subject: Re: [PATCH 3/3] app/testpmd: fix callback issue for
> >>> hot-unplug
> >>>
> >>> matan
> >>>
> >>> On 11/6/2018 2:36 PM, Matan Azrad wrote:
> >>>> Hi Jeff
> >>>>
> >>>>    From: Jeff Guo <jia.guo at intel.com>
> >>>>> Before detach device when device be hot-unplugged, the failure
> >>>>> process in user space and kernel space both need to be finished,
> >>>>> such as eal interrupt callback need to be inactive before the
> >>>>> callback be unregistered when device is being cleaned. This patch
> >>>>> add rte alarm for device detaching, with that it could finish
> >>>>> interrupt callback soon and give time to let the failure process
> >>>>> done
> >> before detaching.
> >>>>> Fixes: 2049c5113fe8 ("app/testpmd: use hotplug failure handler")
> >>>>> Signed-off-by: Jeff Guo <jia.guo at intel.com>
> >>>>> ---
> >>>>>    app/test-pmd/testpmd.c | 13 ++++++++++++-
> >>>>>    1 file changed, 12 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> >>>>> 9c0edca..9c673cf 100644
> >>>>> --- a/app/test-pmd/testpmd.c
> >>>>> +++ b/app/test-pmd/testpmd.c
> >>>>> @@ -2620,7 +2620,18 @@ eth_dev_event_callback(const char
> >>>>> *device_name, enum rte_dev_event_type type,
> >>>>>    				device_name);
> >>>>>    			return;
> >>>>>    		}
> >>>>> -		rmv_event_callback((void *)(intptr_t)port_id);
> >>>>> +		/*
> >>>>> +		 * Before detach device, the hot-unplug failure
> process in
> >>>>> +		 * user space and kernel space both need to be
> finished,
> >>>>> +		 * such as eal interrupt callback need to be inactive
> before
> >>>>> +		 * the callback be unregistered when device is being
> cleaned.
> >>>>> +		 * So finished interrupt callback soon here and give
> time to
> >>>>> +		 * let the work done before detaching.
> >>>>> +		 */
> >>>>> +		if (rte_eal_alarm_set(100000,
> >>>>> +				rmv_event_callback, (void
> >>>>> *)(intptr_t)port_id))
> >>>>> +			RTE_LOG(ERR, EAL,
> >>>>> +				"Could not set up deferred device
> >>>> It looks me strange to use callback and alarm to remove a device:
> >>>> Why not to use callback and that is it?
> >>>>
> >>>> I think that it's better to let the EAL to detach the device after
> >>>> all the
> >> callbacks were done and not to do it by the user callback.
> >>>> So the application\callback owners just need to clean its resources
> >>>> with understanding that after the callback the device(and the
> >>>> callback
> >>> itself) will be detached by the EAL.
> >>>
> >>>
> >>> Firstly, at the currently framework and solution, such as callback
> >>> for RTE_ETH_EVENT_INTR_RMV, still need to use the deferred device
> >> removal,
> >>> we tend to give the control of detaching device to the application,
> >>> and the whole process is located on the user's callback. Notify app
> >>> to detach device by callback but make it deferred, i think it is fine.
> > But the device must be detached in remove event, why not to do it in EAL?
> 
> 
> I think it because of before detached the device, application should be stop
> the forwarding at first, then stop the device, then close
> 
> the device, finally call eal unplug API to detach device. If eal directly detach
> device at the first step, there will be mountain user space error need to
> handle, so that is one reason why need to provider the remove notification
> to app, and let app to process it.


This is why the EAL need to detach the device only after all the user callbacks were done.

> 
> 
> >> It is also unclear to me my we need an alarm here.
> >> First (probably wrong) impression we just try to hide some
> >> synchronization Problem by introducing delay.
> > Looks like, the issue is that the callback function memory will be removed
> from the function itself (by the detach call), no?
> 
> 
> Answer here for both Konstantin and Matan.
> 
> Yes, i think matan is right, the interrupt callback will be destroy in the app
> callback itself, the sequence is that as below
> 
> hot-unplug interrupt -> interrupt callback -> app callback(return to finish
> interrupt callback, delay detaching) -> detach device(unregister interrupt
> callback)
> 
> 
> >> Konstantin
> >>
> >>> Secondly, the vfio is different with igb_uio for hot-unplug, it
> >>> register/unregister hotplug interrupt callback for each device, so
> >>> need to make  the callback done before unregister the callback.
> >>>
> >>> So I think it should be considerate as an workaround here, before we
> >>> find a better way.
> >>>
> >>>
> >>>>> removal\n");
> >>>>>    		break;
> >>>>>    	case RTE_DEV_EVENT_ADD:
> >>>>>    		RTE_LOG(ERR, EAL, "The device: %s has been
> added!\n",
> >>>>> --
> >>>>> 2.7.4


More information about the dev mailing list