[dpdk-stable] [dpdk-dev] [PATCH v3] dev: don't remove devargs that are still referenced

Stojaczyk, Dariusz dariusz.stojaczyk at intel.com
Tue Nov 27 14:03:02 CET 2018



> -----Original Message-----
> From: Kevin Traynor [mailto:ktraynor at redhat.com]
> Sent: Tuesday, November 27, 2018 12:40 PM
> To: Stojaczyk, Dariusz <dariusz.stojaczyk at intel.com>; Maxime Coquelin
> <maxime.coquelin at redhat.com>
> Cc: gaetan.rivet at 6wind.com; thomas at monjalon.net; stable at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3] dev: don't remove devargs that are still
> referenced
> 
> On 11/23/2018 09:45 PM, Stojaczyk, Dariusz wrote:
> >
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
> >> Sent: Friday, November 23, 2018 6:05 PM
> >> To: Stojaczyk, Dariusz <dariusz.stojaczyk at intel.com>; dev at dpdk.org
> >> Cc: gaetan.rivet at 6wind.com; thomas at monjalon.net
> >> Subject: Re: [dpdk-dev] [PATCH v3] dev: don't remove devargs that are still
> >> referenced
> >>
> >> Hi,
> >>
> >> On 11/23/18 4:43 PM, Darek Stojaczyk wrote:
> >>> Even if a device failed to plug, it's still a device
> >>> object that references the devargs. Those devargs will
> >>> be freed automatically together with the device, but
> >>> freeing them any earlier - like it's done in the hotplug
> >>> error handling path right now - will give us a dangling
> >>> pointer and a segfault scenario.
> >>>
> >>> Consider the following case:
> >>>   * secondary process receives the hotplug request IPC message
> >>>     * devargs are either created or updated
> >>>     * the bus is scanned
> >>>       * a new device object is created with the latest devargs
> >>>     * the device can't be plugged for whatever reason,
> >>>       bus->plug returns error
> >>>       * the devargs are freed, even though they're still referenced
> >>>         by the device object on the bus
> >>>
> >>> For PCI devices, the generic device name comes from
> >>> a buffer within the devargs. Freeing those will make
> >>> EAL segfault whenever the device name is checked.
> >>>
> >>> This patch just prevents the hotplug error handling
> >>> path from removing the devargs when there's a device
> >>> that references them. This is done by simply exiting
> >>> early from the hotplug function. As mentioned in the
> >>> beginning, those devargs will be freed later, together
> >>> with the device itself.
> >>>
> >>> Fixes: 7e8b26650146 ("eal: fix hotplug add / remove")
> >>
> >> Should you also cc stable?
> >> Above commit is in since v17.08.
> >>
> >
> > Hi Maxime,
> >
> > Stable could use a similar patch, but not exactly this one as it is now. I'll
> > resubmit for stable once the one here gets approved.
> >
> 
> Hi Darek, feel free to send patch to stable at dpdk.org with [18.08]
> subject prefix, now that this is applied on master, thanks.

Kevin, I just tried rebasing this patch onto 18.08, but the relevant code
there is too much of a mess, so I'd prefer not to touch it.
Sorry,
D.

> 
> 
> > Thank you,
> > D.
> >
> >>> Cc: gaetan.rivet at 6wind.com
> >>> Cc: thomas at monjalon.net
> >>>
> >>> Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk at intel.com>
> >>> ---
> >>> Changes since v2:
> >>>   * added an extra comment (Gaetan)
> >>>
> >>> Changes since v1:
> >>>   * described the failing scenario in commit msg (Thomas)
> >>>
> >>>   lib/librte_eal/common/eal_common_dev.c | 13 ++++++++-----
> >>>   1 file changed, 8 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
> >>> index 1fdc9ab17..d7950bc9a 100644
> >>> --- a/lib/librte_eal/common/eal_common_dev.c
> >>> +++ b/lib/librte_eal/common/eal_common_dev.c
> >>> @@ -166,14 +166,17 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
> >>>   		ret = -ENODEV;
> >>>   		goto err_devarg;
> >>>   	}
> >>> +	/* Since there is a matching device, it is now its responsibility
> >>> +	 * to manage the devargs we've just inserted. From this point
> >>> +	 * those devargs shouldn't be removed manually anymore.
> >>> +	 */
> >>>
> >>>   	ret = dev->bus->plug(dev);
> >>>   	if (ret) {
> >>> -		if (rte_dev_is_probed(dev)) /* if already succeeded earlier */
> >>> -			return ret; /* no rollback */
> >>> -		RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n",
> >>> -			dev->name);
> >>> -		goto err_devarg;
> >>> +		if (!rte_dev_is_probed(dev)) /* if hasn't succeeded earlier */
> >>> +			RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n",
> >>> +				dev->name);
> >>> +		return ret;
> >>>   	}
> >>>
> >>>   	*new_dev = dev;
> >>>
> >>
> >> Other than that, it looks good to me:
> >> Acked-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> >>
> >> Regards,
> >> Maxime
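
For readers following the thread, the ownership rule described in the commit
message can be sketched in a few lines of standalone C. This is only an
illustration under stated assumptions: the toy_* types and functions below are
hypothetical stand-ins, not the DPDK structures or API, and the sketch hands the
device back to the caller on failure only to model "the device stays on the
bus". The point it shows is that once a device object references the devargs,
the failing plug path returns early and the devargs are freed only together
with the device.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the devargs and device objects (not DPDK types). */
struct toy_devargs {
	char name[64];               /* buffer that owns the device name */
};

struct toy_device {
	const char *name;            /* points into devargs->name */
	struct toy_devargs *devargs; /* reference that must stay valid */
};

/* "Bus scan": create a device object that references the devargs. */
static struct toy_device *
toy_scan(struct toy_devargs *da)
{
	struct toy_device *dev = calloc(1, sizeof(*dev));

	if (dev == NULL)
		return NULL;
	dev->devargs = da;
	dev->name = da->name;
	return dev;
}

/* Always fails, standing in for "bus->plug returns error". */
static int
toy_plug(struct toy_device *dev)
{
	(void)dev;
	return -EIO;
}

/* Device teardown is the only place that frees referenced devargs. */
static void
toy_device_free(struct toy_device *dev)
{
	if (dev == NULL)
		return;
	free(dev->devargs);
	free(dev);
}

static int
toy_probe(const char *name, struct toy_device **out)
{
	struct toy_devargs *da;
	struct toy_device *dev;

	assert(name != NULL && out != NULL);
	da = calloc(1, sizeof(*da));
	if (da == NULL)
		return -ENOMEM;
	strncpy(da->name, name, sizeof(da->name) - 1);

	dev = toy_scan(da);
	if (dev == NULL) {
		free(da);            /* nothing references the devargs yet */
		return -ENODEV;
	}

	/* From here on the device owns the devargs. The device remains
	 * reachable even if plug fails (modelled by setting *out early).
	 */
	*out = dev;

	/* On failure, return early without freeing the devargs. */
	return toy_plug(dev);
}
```

With the early return, dev->name is still readable after toy_probe() fails.
Freeing the devargs in the error path instead would leave dev->name dangling,
which is the segfault scenario the commit message describes.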


