[dpdk-dev,v2,1/6] ethdev: fix port data mismatched in multiple process model

Message ID 1482922962-21036-2-git-send-email-yuanhan.liu@linux.intel.com (mailing list archive)
State Superseded, archived
Delegated to: Yuanhan Liu
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel compilation success Compilation OK

Commit Message

Yuanhan Liu Dec. 28, 2016, 11:02 a.m. UTC
  Assume we have two virtio ports, 00:03.0 and 00:04.0. The first one is
managed by the kernel driver, while the later one is managed by DPDK.

Now we start the primary process. 00:03.0 will be skipped by DPDK virtio
PMD driver (since it's being used by the kernel). 00:04.0 would be
successfully initiated by DPDK virtio PMD (if nothing abnormal happens).
After that, we would get a port id 0, and all the related info needed
by virtio (virtio_hw) is stored at rte_eth_dev_data[0].

Then we start the secondary process. As usual, 00:03.0 will be firstly
probed. It firstly tries to get a local eth_dev structure for it (by
rte_eth_dev_allocate):

        port_id = rte_eth_dev_find_free_port();
        ...

        eth_dev = &rte_eth_devices[port_id];
        eth_dev->data = &rte_eth_dev_data[port_id];
        ...

        return eth_dev;

Since it's a first PCI device, port_id will be 0. eth_dev->data would
then point to rte_eth_dev_data[0]. And here things start going wrong,
as rte_eth_dev_data[0] actually stores the virtio_hw for 00:04.0.

That said, in the secondary process, DPDK will continue to drive PCI
device 00.03.0 (despite the fact it's been managed by kernel), with
the info from PCI device 00:04.0. Which is wrong.

The fix is to attach the port already registered by the primary process:
iterate the rte_eth_dev_data[], and get the port id who's PCI ID matches
the current PCI device.

This would let us maintain same port ID for the same PCI device, keeping
the chance of referencing to wrong data minimal.

Fixes: af75078fece3 ("first public release")

Cc: stable@dpdk.org
Cc: Thomas Monjalon <thomas.monjalon@6wind.com>
Cc: Bruce Richardson <bruce.richardson@intel.com>
Cc: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
---
 lib/librte_ether/rte_ethdev.c | 58 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 52 insertions(+), 6 deletions(-)
  

Comments

Thomas Monjalon Jan. 4, 2017, 5:34 p.m. UTC | #1
+Cc Sergio (maintainer of the secondary process thing)

2016-12-28 19:02, Yuanhan Liu:
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -201,9 +201,6 @@ rte_eth_dev_allocate(const char *name)
>  		return NULL;
>  	}
>  
> -	if (rte_eth_dev_data == NULL)
> -		rte_eth_dev_data_alloc();
> -

It is dangerous to move this to rte_eth_dev_pci_probe.
Please keep it here and duplicate it in eth_dev_attach.

[...]
> +/*
> + * Attach to a port already registered by the primary process, which
> + * makes sure that the same device would both have the same port id
> + * in the primary and secondary process.
> + */
> +static struct rte_eth_dev *
> +eth_dev_attach(const char *name)

Maybe that the word "secondary" could help to differentiate of
the function rte_eth_dev_attach().

> +{
> +	uint8_t i;
> +	struct rte_eth_dev *eth_dev;
> +
> +	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> +		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
> +			break;
> +	}
> +	if (i == RTE_MAX_ETHPORTS) {
> +		RTE_PMD_DEBUG_TRACE(
> +			"device %s is not driven by the primary process\n",
> +			name);
> +		return NULL;
> +	}
> +
> +	RTE_ASSERT(eth_dev->data->port_id == i);
> +
> +	eth_dev = &rte_eth_devices[i];
> +	eth_dev->data = &rte_eth_dev_data[i];
> +	eth_dev->attached = DEV_ATTACHED;
> +	nb_ports++;

I am a bit nervous when I see these lines duplicated from rte_eth_dev_allocate.
Not sure whether it deserves a common function or not.

[...]
> @@ -246,9 +275,26 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
> -	eth_dev = rte_eth_dev_allocate(ethdev_name);
> -	if (eth_dev == NULL)
> -		return -ENOMEM;
> +	if (rte_eth_dev_data == NULL)
> +		rte_eth_dev_data_alloc();
> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		eth_dev = rte_eth_dev_allocate(ethdev_name);
> +		if (eth_dev == NULL)
> +			return -ENOMEM;
> +	} else {
> +		/*
> +		 * if we failed to attach a device, it means that
> +		 * device is skipped, due to some errors. Take
> +		 * virtio-net device as example, it could be the
> +		 * device is managed by virtio-net kernel driver.
> +		 * For such case, we return a positive value, to
> +		 * let EAL skip it as well.
> +		 */

This comment (a bit too long) should be placed between "if" and "return".

> +		eth_dev = eth_dev_attach(ethdev_name);
> +		if (eth_dev == NULL)
> +			return 1;
> +	}
  
Yuanhan Liu Jan. 5, 2017, 6:25 a.m. UTC | #2
On Wed, Jan 04, 2017 at 06:34:40PM +0100, Thomas Monjalon wrote:
> +Cc Sergio (maintainer of the secondary process thing)
> 
> 2016-12-28 19:02, Yuanhan Liu:
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -201,9 +201,6 @@ rte_eth_dev_allocate(const char *name)
> >  		return NULL;
> >  	}
> >  
> > -	if (rte_eth_dev_data == NULL)
> > -		rte_eth_dev_data_alloc();
> > -
> 
> It is dangerous to move this to rte_eth_dev_pci_probe.
> Please keep it here and duplicate it in eth_dev_attach.

Oh, right, I missed the fact that it could be invoked from other places.

> [...]
> > +/*
> > + * Attach to a port already registered by the primary process, which
> > + * makes sure that the same device would both have the same port id
> > + * in the primary and secondary process.
> > + */
> > +static struct rte_eth_dev *
> > +eth_dev_attach(const char *name)
> 
> Maybe that the word "secondary" could help to differentiate of
> the function rte_eth_dev_attach().

Yes, it's better. How about "_attach_secondary", or "_attach_to_primary"?

> > +{
> > +	uint8_t i;
> > +	struct rte_eth_dev *eth_dev;
> > +
> > +	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > +		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
> > +			break;
> > +	}
> > +	if (i == RTE_MAX_ETHPORTS) {
> > +		RTE_PMD_DEBUG_TRACE(
> > +			"device %s is not driven by the primary process\n",
> > +			name);
> > +		return NULL;
> > +	}
> > +
> > +	RTE_ASSERT(eth_dev->data->port_id == i);
> > +
> > +	eth_dev = &rte_eth_devices[i];
> > +	eth_dev->data = &rte_eth_dev_data[i];
> > +	eth_dev->attached = DEV_ATTACHED;
> > +	nb_ports++;
> 
> I am a bit nervous when I see these lines duplicated from rte_eth_dev_allocate.
> Not sure whether it deserves a common function or not.

I don't think so, they do share some common assignments, but the assignments
are actually not the same. The primary one has few more: notably, they are:

- eth_dev->data
- eth_dev->data->name
- eth_dev->data->port_id

> 
> [...]
> > @@ -246,9 +275,26 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
> > -	eth_dev = rte_eth_dev_allocate(ethdev_name);
> > -	if (eth_dev == NULL)
> > -		return -ENOMEM;
> > +	if (rte_eth_dev_data == NULL)
> > +		rte_eth_dev_data_alloc();
> > +
> > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > +		eth_dev = rte_eth_dev_allocate(ethdev_name);
> > +		if (eth_dev == NULL)
> > +			return -ENOMEM;
> > +	} else {
> > +		/*
> > +		 * if we failed to attach a device, it means that
> > +		 * device is skipped, due to some errors. Take
> > +		 * virtio-net device as example, it could be the
> > +		 * device is managed by virtio-net kernel driver.
> > +		 * For such case, we return a positive value, to
> > +		 * let EAL skip it as well.
> > +		 */
> 
> This comment (a bit too long) should be placed between "if" and "return".

Okay.

	--yliu
  

Patch

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fde8112..c10eb9c 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -201,9 +201,6 @@  rte_eth_dev_allocate(const char *name)
 		return NULL;
 	}
 
-	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_data_alloc();
-
 	if (rte_eth_dev_allocated(name) != NULL) {
 		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
 				name);
@@ -231,6 +228,38 @@  rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
+/*
+ * Attach to a port already registered by the primary process, which
+ * makes sure that the same device would both have the same port id
+ * in the primary and secondary process.
+ */
+static struct rte_eth_dev *
+eth_dev_attach(const char *name)
+{
+	uint8_t i;
+	struct rte_eth_dev *eth_dev;
+
+	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
+			break;
+	}
+	if (i == RTE_MAX_ETHPORTS) {
+		RTE_PMD_DEBUG_TRACE(
+			"device %s is not driven by the primary process\n",
+			name);
+		return NULL;
+	}
+
+	RTE_ASSERT(eth_dev->data->port_id == i);
+
+	eth_dev = &rte_eth_devices[i];
+	eth_dev->data = &rte_eth_dev_data[i];
+	eth_dev->attached = DEV_ATTACHED;
+	nb_ports++;
+
+	return eth_dev;
+}
+
 int
 rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 		      struct rte_pci_device *pci_dev)
@@ -246,9 +275,26 @@  rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 	rte_eal_pci_device_name(&pci_dev->addr, ethdev_name,
 			sizeof(ethdev_name));
 
-	eth_dev = rte_eth_dev_allocate(ethdev_name);
-	if (eth_dev == NULL)
-		return -ENOMEM;
+	if (rte_eth_dev_data == NULL)
+		rte_eth_dev_data_alloc();
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		eth_dev = rte_eth_dev_allocate(ethdev_name);
+		if (eth_dev == NULL)
+			return -ENOMEM;
+	} else {
+		/*
+		 * if we failed to attach a device, it means that
+		 * device is skipped, due to some errors. Take
+		 * virtio-net device as example, it could be the
+		 * device is managed by virtio-net kernel driver.
+		 * For such case, we return a positive value, to
+		 * let EAL skip it as well.
+		 */
+		eth_dev = eth_dev_attach(ethdev_name);
+		if (eth_dev == NULL)
+			return 1;
+	}
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
 		eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",