[2/3] devargs: delay freeing previous devargs when overriding them

Message ID: 20181105070447.67700-2-dariusz.stojaczyk@intel.com
State: Rejected, archived
Delegated to: Thomas Monjalon
Series: [1/3] bus/pci: update device devargs on each rescan

Checks

Context               Check    Description
ci/Intel-compilation  success  Compilation OK
ci/checkpatch         success  coding style OK

Commit Message

Stojaczyk, Dariusz Nov. 5, 2018, 7:04 a.m. UTC
  In the EAL hotplug path, the previous devargs may still be
referenced by device structs at the time rte_devargs_insert()
is called. Those references are updated almost immediately
afterwards, but in case something goes wrong and they cannot
be updated, we might still want to keep the old devargs around.

This patch modifies rte_devargs_insert() so that it returns
a pointer to previous devargs that are being overridden. In
case something in the EAL hotplug path goes wrong, we can now
remove the newly inserted devargs and re-insert the old ones.

Note: Functional changes will come in a subsequent patch. This
      one only extends the API.

Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
---
 drivers/bus/vdev/vdev.c                     | 12 +++++++++---
 lib/librte_eal/common/eal_common_dev.c      | 11 +++++++----
 lib/librte_eal/common/eal_common_devargs.c  | 21 ++++++++++++++-------
 lib/librte_eal/common/include/rte_devargs.h |  8 ++------
 4 files changed, 32 insertions(+), 20 deletions(-)
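
The rollback that the commit message describes can be sketched on a toy list. This is a minimal illustration, not DPDK code: `toy_devargs`, `toy_insert()` and `toy_hotplug()` are hypothetical stand-ins, and the `scan_ok` flag stands in for whatever step of the hotplug path may fail. Only the shape of `toy_insert()` (unlink the old entry, hand it back, append the new one) follows the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Toy stand-in for struct rte_devargs: only the fields the sketch needs. */
struct toy_devargs {
	TAILQ_ENTRY(toy_devargs) next;
	const char *bus;   /* bus name, e.g. "pci" */
	const char *name;  /* device name on that bus */
	const char *args;  /* driver arguments */
};

TAILQ_HEAD(toy_devargs_list, toy_devargs);
static struct toy_devargs_list toy_list = TAILQ_HEAD_INITIALIZER(toy_list);

/* Same shape as the patched insert: unlink any entry with the same
 * (bus, name) pair, hand it back through prev, then append da.
 */
static void
toy_insert(struct toy_devargs *da, struct toy_devargs **prev)
{
	struct toy_devargs *d;

	assert(da != NULL && prev != NULL);
	*prev = NULL;
	TAILQ_FOREACH(d, &toy_list, next) {
		if (strcmp(d->bus, da->bus) == 0 &&
		    strcmp(d->name, da->name) == 0) {
			TAILQ_REMOVE(&toy_list, d, next);
			*prev = d;
			break; /* leave the loop at once: d is unlinked */
		}
	}
	TAILQ_INSERT_TAIL(&toy_list, da, next);
}

/* Hypothetical hotplug step. scan_ok stands in for the scan/probe result.
 * Returns 0 on success, -1 after restoring the previous devargs.
 * On failure da is unlinked and stays owned by the caller.
 */
static int
toy_hotplug(struct toy_devargs *da, int scan_ok)
{
	struct toy_devargs *prev, *unused;

	toy_insert(da, &prev);
	if (scan_ok) {
		free(prev); /* old entry no longer needed; free(NULL) is a no-op */
		return 0;
	}
	/* Roll back: drop the new entry and re-insert the old one. */
	TAILQ_REMOVE(&toy_list, da, next);
	if (prev != NULL)
		toy_insert(prev, &unused);
	return -1;
}
```

The sketch assumes list entries are heap-allocated, so that an overridden entry can be released with `free()` once the new one is known to be good.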
  

Comments

Thomas Monjalon Nov. 5, 2018, 7:30 a.m. UTC | #1
Hi,

05/11/2018 08:04, Darek Stojaczyk:
> -int __rte_experimental
> -rte_devargs_insert(struct rte_devargs *da)
> +void __rte_experimental
> +rte_devargs_insert(struct rte_devargs *da, struct rte_devargs **prev_da)

You should update the API section of the release notes.

>  {
> -       int ret;
> +       struct rte_devargs *d;
> +       void *tmp;
> +
> +       *prev_da = NULL;
> +       TAILQ_FOREACH_SAFE(d, &devargs_list, next, tmp) {
> +               if (strcmp(d->bus->name, da->bus->name) == 0 &&
> +                   strcmp(d->name, da->name) == 0) {
> +                       TAILQ_REMOVE(&devargs_list, d, next);
> +                       *prev_da = d;
> +                       break;
> +               }
> +       }
>  
> -       ret = rte_devargs_remove(da);
> -       if (ret < 0)
> -               return ret;

Why not update rte_devargs_remove instead of duplicating its code?
  
Stojaczyk, Dariusz Nov. 5, 2018, 8:25 a.m. UTC | #2
Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Monday, November 5, 2018 8:31 AM
> To: Stojaczyk, Dariusz <dariusz.stojaczyk@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [PATCH 2/3] devargs: delay freeing previous devargs when
> overriding them
> 
> Hi,
> 
> 05/11/2018 08:04, Darek Stojaczyk:
> > -int __rte_experimental
> > -rte_devargs_insert(struct rte_devargs *da)
> > +void __rte_experimental
> > +rte_devargs_insert(struct rte_devargs *da, struct rte_devargs
> **prev_da)
> 
> You should update the API section of the release notes.

Even for experimental API? OK, I didn't know it's needed.

> 
> >  {
> > -       int ret;
> > +       struct rte_devargs *d;
> > +       void *tmp;
> > +
> > +       *prev_da = NULL;
> > +       TAILQ_FOREACH_SAFE(d, &devargs_list, next, tmp) {
> > +               if (strcmp(d->bus->name, da->bus->name) == 0 &&
> > +                   strcmp(d->name, da->name) == 0) {
> > +                       TAILQ_REMOVE(&devargs_list, d, next);
> > +                       *prev_da = d;
> > +                       break;
> > +               }
> > +       }
> >
> > -       ret = rte_devargs_remove(da);
> > -       if (ret < 0)
> > -               return ret;
> 
> Why not update rte_devargs_remove instead of duplicating its code?
> 

We still want to preserve the functionality of rte_devargs_remove.
rte_devargs_remove does TAILQ_REMOVE + free; rte_devargs_insert does just TAILQ_REMOVE. (I think I also forgot to update the rte_devargs_insert documentation; I'll do that in V2.)

Since you've mentioned it:
Eventually I'd like rte_devargs_remove to accept the exact same devargs parameter that was passed to rte_devargs_insert. Then rte_devargs_remove wouldn't need to do any lookup. Maybe an additional rte_devargs_find(const char *name) could be added for existing cases where the original devargs struct is not available. However, I'm not familiar enough with this code to perform the refactor and am just trying to fix the immediate problem. Still, how does it sound?

D.
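
The idea floated in this reply (remove by handle, plus a separate find-by-name for callers that lack the handle) could look roughly like the sketch below. All names here (`h_devargs`, `h_insert()`, `h_remove()`, `h_find()`) are hypothetical and the struct is a toy; none of this is an existing DPDK API, and the thread does not settle on this design.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Toy devargs entry, identified by name only for brevity. */
struct h_devargs {
	TAILQ_ENTRY(h_devargs) next;
	const char *name;
};

TAILQ_HEAD(h_list_head, h_devargs);
static struct h_list_head h_list = TAILQ_HEAD_INITIALIZER(h_list);

static void
h_insert(struct h_devargs *da)
{
	assert(da != NULL);
	TAILQ_INSERT_TAIL(&h_list, da, next);
}

/* Remove by handle: no lookup at all. The caller must pass the very
 * pointer that was inserted and that is still linked in the list.
 */
static void
h_remove(struct h_devargs *da)
{
	assert(da != NULL);
	TAILQ_REMOVE(&h_list, da, next);
	free(da);
}

/* Lookup for callers that do not hold the original handle.
 * The returned entry is still linked in the list.
 */
static struct h_devargs *
h_find(const char *name)
{
	struct h_devargs *d;

	TAILQ_FOREACH(d, &h_list, next) {
		if (strcmp(d->name, name) == 0)
			return d;
	}
	return NULL;
}
```

A caller without the handle would write `h_remove(h_find(name))` after checking the lookup result for NULL.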
  
Thomas Monjalon Nov. 5, 2018, 9:46 a.m. UTC | #3
05/11/2018 09:25, Stojaczyk, Dariusz:
> Hi Thomas,
> 
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 05/11/2018 08:04, Darek Stojaczyk:
> > > -int __rte_experimental
> > > -rte_devargs_insert(struct rte_devargs *da)
> > > +void __rte_experimental
> > > +rte_devargs_insert(struct rte_devargs *da, struct rte_devargs
> > **prev_da)
> > 
> > You should update the API section of the release notes.
> 
> Even for experimental API? OK, I didn't know it's needed.

Yes, even for experimental API, the API changes must be documented.

> > >  {
> > > -       int ret;
> > > +       struct rte_devargs *d;
> > > +       void *tmp;
> > > +
> > > +       *prev_da = NULL;
> > > +       TAILQ_FOREACH_SAFE(d, &devargs_list, next, tmp) {
> > > +               if (strcmp(d->bus->name, da->bus->name) == 0 &&
> > > +                   strcmp(d->name, da->name) == 0) {
> > > +                       TAILQ_REMOVE(&devargs_list, d, next);
> > > +                       *prev_da = d;
> > > +                       break;
> > > +               }
> > > +       }
> > >
> > > -       ret = rte_devargs_remove(da);
> > > -       if (ret < 0)
> > > -               return ret;
> > 
> > Why not update rte_devargs_remove instead of duplicating its code?
> 
> We still want to preserve the functionality of rte_devargs_remove.
> rte_devargs_remove does TAILQ_REMOVE + free; rte_devargs_insert does just TAILQ_REMOVE. (I think I also forgot to update the rte_devargs_insert documentation; I'll do that in V2.)

Yes, because of the rollback, OK.
Please mention in devargs_insert doc that the old devargs
can be used for rollback.

> Since you've mentioned it:
> Eventually I'd like rte_devargs_remove to accept the exact same devargs parameter that was passed to rte_devargs_insert. Then rte_devargs_remove wouldn't need to do any lookup. Maybe an additional rte_devargs_find(const char *name) could be added for existing cases where the original devargs struct is not available. However, I'm not familiar enough with this code to perform the refactor and am just trying to fix the immediate problem. Still, how does it sound?

I think we can keep it as is.

We can re-think the whole thing in the next release.
I think we should not play with devargs list as we do.
There should be only lists for scanned devices of each bus and that's all.
  
Gaëtan Rivet Nov. 5, 2018, 4:24 p.m. UTC | #4
On Mon, Nov 05, 2018 at 10:46:39AM +0100, Thomas Monjalon wrote:
> 05/11/2018 09:25, Stojaczyk, Dariusz:
> > Hi Thomas,
> > 
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > 05/11/2018 08:04, Darek Stojaczyk:
> > > > -int __rte_experimental
> > > > -rte_devargs_insert(struct rte_devargs *da)
> > > > +void __rte_experimental
> > > > +rte_devargs_insert(struct rte_devargs *da, struct rte_devargs
> > > **prev_da)
> > > 
> > > You should update the API section of the release notes.
> > 
> > Even for experimental API? OK, I didn't know it's needed.
> 
> Yes, even for experimental API, the API changes must be documented.
> 
> > > >  {
> > > > -       int ret;
> > > > +       struct rte_devargs *d;
> > > > +       void *tmp;
> > > > +
> > > > +       *prev_da = NULL;
> > > > +       TAILQ_FOREACH_SAFE(d, &devargs_list, next, tmp) {
> > > > +               if (strcmp(d->bus->name, da->bus->name) == 0 &&
> > > > +                   strcmp(d->name, da->name) == 0) {
> > > > +                       TAILQ_REMOVE(&devargs_list, d, next);
> > > > +                       *prev_da = d;
> > > > +                       break;
> > > > +               }
> > > > +       }
> > > >
> > > > -       ret = rte_devargs_remove(da);
> > > > -       if (ret < 0)
> > > > -               return ret;
> > > 
> > > Why not update rte_devargs_remove instead of duplicating its code?
> > 
> > We still want to preserve the functionality of rte_devargs_remove.
> > rte_devargs_remove does TAILQ_REMOVE + free; rte_devargs_insert does just TAILQ_REMOVE. (I think I also forgot to update the rte_devargs_insert documentation; I'll do that in V2.)
> 
> Yes, because of the rollback, OK.
> Please mention in devargs_insert doc that the old devargs
> can be used for rollback.
> 

This is not ok. You are leaking the devargs that are not freed.
Once a devargs is inserted, the current API considers it to be managed
by EAL, i.e. no one else should manipulate it (take it out of the
list or free it). Devargs are meant to be linked to afterward by
rte_device, so the only moment they can be removed is right before a
bus scan (which will rebuild the devargs -> device mapping).

Returning the prev_devargs from devargs_insert() is a bad construct.
Insert only does insertion, not search+insert.

Either devargs_insert() finds a previous occurrence and returns an
EEXIST error before aborting, or it forces the insertion. But it should
not do something else to avoid a function call.

> > Since you've mentioned it:
> > Eventually I'd like rte_devargs_remove to accept the exact same devargs parameter that was passed to rte_devargs_insert. Then rte_devargs_remove wouldn't need to do any lookup. Maybe an additional rte_devargs_find(const char *name) could be added for existing cases where the original devargs struct is not available. However, I'm not familiar enough with this code to perform the refactor and am just trying to fix the immediate problem. Still, how does it sound?
> 
> I think we can keep it as is.
> 
> We can re-think the whole thing in the next release.
> I think we should not play with devargs list as we do.
> There should be only lists for scanned devices of each bus and that's all.
> 
> 
> 

devargs_{insert,remove} currently identify a devargs by its
(bus, name) pair. This pair is unique in the list. Accepting the exact
same devargs parameter is not possible, because sometimes you won't have
the devargs handle; you can only identify it by the pair above.

I think devargs_find() would give the wrong idea (returning a devargs
handle that is still in the devargs list: a user should not think it can
be manipulated this way).

Instead I'd propose:

devargs_insert() returns -EEXIST if the devargs identifying pair is found
                 in the list.
devargs_extract() will remove this devargs from the list and return it
                  if found. devargs_insert() afterward would work.
                  Once extracted, the devargs is the responsibility of
                  the caller (so should be freed if not needed anymore).
devargs_remove() stays the same, finding the devargs and freeing it, or
                 returning >0 if the devargs was not within the list.

The remaining issue is that extract() would invalidate a bunch of rte_device
pointers, which is not acceptable. Either a back-reference is made from
devargs to device, so that rte_devices are updated accordingly, or we must
ensure each bus scan is called right afterward so that the mapping
is fixed.
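
The three-function split proposed above can be sketched on a toy list as follows. This is an illustration under assumptions, not DPDK code: `sk_devargs` and the `sk_*` functions are hypothetical, they take the (bus, name) pair as two strings for brevity, and the sketch does nothing about the stale rte_device pointers mentioned as the remaining issue.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Toy devargs entry identified by its (bus, name) pair. */
struct sk_devargs {
	TAILQ_ENTRY(sk_devargs) next;
	const char *bus;
	const char *name;
};

TAILQ_HEAD(sk_list_head, sk_devargs);
static struct sk_list_head sk_list = TAILQ_HEAD_INITIALIZER(sk_list);

/* Internal lookup by the identifying pair; deliberately not public. */
static struct sk_devargs *
sk_lookup(const char *bus, const char *name)
{
	struct sk_devargs *d;

	TAILQ_FOREACH(d, &sk_list, next) {
		if (strcmp(d->bus, bus) == 0 && strcmp(d->name, name) == 0)
			return d;
	}
	return NULL;
}

/* insert: refuse to override, return -EEXIST if the pair is present. */
static int
sk_insert(struct sk_devargs *da)
{
	assert(da != NULL);
	if (sk_lookup(da->bus, da->name) != NULL)
		return -EEXIST;
	TAILQ_INSERT_TAIL(&sk_list, da, next);
	return 0;
}

/* extract: unlink and return the entry, or NULL if absent.
 * From here on the caller owns it and must free it when done.
 */
static struct sk_devargs *
sk_extract(const char *bus, const char *name)
{
	struct sk_devargs *d = sk_lookup(bus, name);

	if (d != NULL)
		TAILQ_REMOVE(&sk_list, d, next);
	return d;
}

/* remove: unlink and free; returns 0, or 1 if the pair was not listed. */
static int
sk_remove(const char *bus, const char *name)
{
	struct sk_devargs *d = sk_extract(bus, name);

	if (d == NULL)
		return 1;
	free(d);
	return 0;
}
```

With this split, a caller that wants to override an entry would call `sk_extract()` first, keep the returned entry for a possible rollback, and only then call `sk_insert()` with the new one.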
  

Patch

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 9c66bdc78..bbdae2314 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -9,6 +9,7 @@ 
 #include <stdint.h>
 #include <stdbool.h>
 #include <sys/queue.h>
+#include <assert.h>
 
 #include <rte_eal.h>
 #include <rte_dev.h>
@@ -207,7 +208,7 @@  insert_vdev(const char *name, const char *args,
 		bool init)
 {
 	struct rte_vdev_device *dev;
-	struct rte_devargs *devargs;
+	struct rte_devargs *devargs, *prev_devargs;
 	int ret;
 
 	if (name == NULL)
@@ -239,8 +240,13 @@  insert_vdev(const char *name, const char *args,
 	}
 
 	TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
-	if (init)
-		rte_devargs_insert(devargs);
+	if (init) {
+		rte_devargs_insert(devargs, &prev_devargs);
+		 /* any previous devargs should have been caught by the above
+		  * find_vdev(name) check
+		  */
+		assert(prev_devargs == NULL);
+	}
 
 	if (p_dev)
 		*p_dev = dev;
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 62e9ed477..4cb424df1 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -131,7 +131,7 @@  int
 local_dev_probe(const char *devargs, struct rte_device **new_dev)
 {
 	struct rte_device *dev;
-	struct rte_devargs *da;
+	struct rte_devargs *da, *prev_da;
 	int ret;
 
 	*new_dev = NULL;
@@ -150,9 +150,12 @@  local_dev_probe(const char *devargs, struct rte_device **new_dev)
 		goto err_devarg;
 	}
 
-	ret = rte_devargs_insert(da);
-	if (ret)
-		goto err_devarg;
+	rte_devargs_insert(da, &prev_da);
+
+	if (prev_da != NULL) {
+		free(prev_da->args);
+		free(prev_da);
+	}
 
 	ret = da->bus->scan();
 	if (ret)
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index b7b9cb69e..c46365b69 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -262,16 +262,23 @@  rte_devargs_parsef(struct rte_devargs *da, const char *format, ...)
 	return ret;
 }
 
-int __rte_experimental
-rte_devargs_insert(struct rte_devargs *da)
+void __rte_experimental
+rte_devargs_insert(struct rte_devargs *da, struct rte_devargs **prev_da)
 {
-	int ret;
+	struct rte_devargs *d;
+	void *tmp;
+
+	*prev_da = NULL;
+	TAILQ_FOREACH_SAFE(d, &devargs_list, next, tmp) {
+		if (strcmp(d->bus->name, da->bus->name) == 0 &&
+		    strcmp(d->name, da->name) == 0) {
+			TAILQ_REMOVE(&devargs_list, d, next);
+			*prev_da = d;
+			break;
+		}
+	}
 
-	ret = rte_devargs_remove(da);
-	if (ret < 0)
-		return ret;
 	TAILQ_INSERT_TAIL(&devargs_list, da, next);
-	return 0;
 }
 
 /* store a whitelist parameter for later parsing */
diff --git a/lib/librte_eal/common/include/rte_devargs.h b/lib/librte_eal/common/include/rte_devargs.h
index b1f121f83..753cf7f54 100644
--- a/lib/librte_eal/common/include/rte_devargs.h
+++ b/lib/librte_eal/common/include/rte_devargs.h
@@ -146,14 +146,10 @@  __attribute__((format(printf, 2, 0)));
  *
  * @param da
  *  The devargs structure to insert.
- *
- * @return
- *   - 0 on success
- *   - Negative on error.
  */
 __rte_experimental
-int
-rte_devargs_insert(struct rte_devargs *da);
+void
+rte_devargs_insert(struct rte_devargs *da, struct rte_devargs **prev_da);
 
 /**
  * Add a device to the user device list