[dpdk-dev] net/tap: driver closing tx interface on queue setup

Message ID 20170129021205.36860-1-keith.wiles@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel compilation  success  Compilation OK

Commit Message

Wiles, Keith Jan. 29, 2017, 2:12 a.m. UTC
  The tap driver set up both rx and tx file descriptors when
rte_eth_rx_queue_setup() was called, causing the tx to be closed when
tx setup was called.

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
---
 drivers/net/tap/rte_eth_tap.c | 48 ++++++++++++++++++++++++++++++-------------
 1 file changed, 34 insertions(+), 14 deletions(-)
  

Comments

Ferruh Yigit Jan. 30, 2017, 11 a.m. UTC | #1
On 1/29/2017 2:12 AM, Keith Wiles wrote:
> The tap driver setup both rx and tx file descriptors when the
> rte_eth_rx_queue_setup() causing the tx to be closed when tx setup
> was called.

Can you please describe the problem in more detail?
Without this patch rx->fd == tx->fd; with this patch rx and tx have
different file descriptors.

What was wrong with rx and tx having the same fd?

As far as I can see, rte_eth_rx_queue_setup() won't close tx->fd; that
function will do nothing if rx or tx has a valid fd.

> 
> Signed-off-by: Keith Wiles <keith.wiles@intel.com>

<...>
  
Wiles, Keith Jan. 30, 2017, 2:34 p.m. UTC | #2
> On Jan 30, 2017, at 5:00 AM, Yigit, Ferruh <ferruh.yigit@intel.com> wrote:
> 
> On 1/29/2017 2:12 AM, Keith Wiles wrote:
>> The tap driver setup both rx and tx file descriptors when the
>> rte_eth_rx_queue_setup() causing the tx to be closed when tx setup
>> was called.
> 
> Can you please describe the problem more.
> Without this patch rx->fd == tx->fd, with this patch rx and tx has
> different file descriptors.

Let me look at this more, I am getting the same FD for both. Must be something else going on.

> 
> What was the wrong with rx and tx having same fd?
> 
> As far as I can see, rte_eth_rx_queue_setup() won't close tx->fd, that
> function will do nothing if rx or tx has valid fd.

In rte_eth_rx_queue_setup(), look at line 1146: if rxq has a value, it is released. The same happens in both the Rx and Tx code.

	rxq = dev->data->rx_queues;
	if (rxq[rx_queue_id]) {
		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
					-ENOTSUP);
		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
		rxq[rx_queue_id] = NULL;
	}

	if (rx_conf == NULL)
		rx_conf = &dev_info.default_rxconf;

	ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc,
					      socket_id, rx_conf, mp);

> 
>> 
>> Signed-off-by: Keith Wiles <keith.wiles@intel.com>
> 
> <...>

Regards,
Keith
  
Pascal Mazon Jan. 30, 2017, 2:38 p.m. UTC | #3
On 01/30/2017 12:00 PM, Ferruh Yigit wrote:
> On 1/29/2017 2:12 AM, Keith Wiles wrote:
>> The tap driver setup both rx and tx file descriptors when the
>> rte_eth_rx_queue_setup() causing the tx to be closed when tx setup
>> was called.
>
> Can you please describe the problem more.
> Without this patch rx->fd == tx->fd, with this patch rx and tx has
> different file descriptors.
>
> What was the wrong with rx and tx having same fd?
>
> As far as I can see, rte_eth_rx_queue_setup() won't close tx->fd, that
> function will do nothing if rx or tx has valid fd.
>
>>
>> Signed-off-by: Keith Wiles <keith.wiles@intel.com>
>
> <...>
>

Hi,

The tap PMD recently broke for me because of this patch [1].

During init (eth_dev_tap_create()), the tap PMD allocates a shared RX/TX 
queue through tun_alloc().
The recent patch now releases existing queues in rx_queue_setup(), 
before adding new ones.

When rx_queue_setup() is called, it uses close() calls on all shared 
queues, effectively deleting the netdevice.
That's the main issue here.
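
The failure mode described above can be reproduced outside DPDK. The following is a minimal sketch, not the PMD's code: `struct shared_queue`, `shared_queue_release()`, `shared_fd_is_open()` and `shared_fd_survives_rx_release()` are hypothetical names, and /dev/null stands in for /dev/net/tun so that no privileges are needed. It shows that when an Rx and a Tx queue hold the same descriptor, releasing the Rx queue leaves the Tx queue with a closed descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the PMD's per-queue structures. */
struct shared_queue {
	int fd;
};

/* Mimics a queue release callback that closes the queue's descriptor. */
static void
shared_queue_release(struct shared_queue *q)
{
	assert(q != NULL);
	if (q->fd >= 0) {
		close(q->fd);
		q->fd = -1;
	}
}

/* Returns 1 if fd refers to an open file description, 0 otherwise. */
static int
shared_fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}

/*
 * Store one descriptor in both an Rx and a Tx queue, release the Rx
 * queue, then report whether the Tx queue's descriptor is still open.
 * Returns -1 if the initial open() fails.
 */
static int
shared_fd_survives_rx_release(void)
{
	struct shared_queue rx, tx;
	int fd = open("/dev/null", O_RDWR);	/* stand-in for tun_alloc() */

	if (fd < 0)
		return -1;
	rx.fd = fd;
	tx.fd = fd;	/* same descriptor in both queues */
	shared_queue_release(&rx);
	return shared_fd_is_open(tx.fd);
}
```

Calling `shared_fd_survives_rx_release()` in a single-threaded program returns 0, since the descriptor still stored in the Tx queue was closed by the Rx release.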

I tested Keith's patch [2], and it fixes that issue, using separate queues.

There are, however, a couple of other queue-related issues in the tap PMD,
but I'm not sure how to address them properly:

1. internals->fds[] gets filled only with RX queues (apart from index 0,
   which is common to both RX and TX).
   It means that only RX queues will be deleted when calling
   rte_pmd_tap_remove() or tap_tx_queue_release().

2. tap_dev_stop() is not symmetrical with tap_dev_start(): queues won't 
get re-created after a stop.

It may be best to keep the very first fd (created with tun_alloc() in 
eth_dev_tap_create() during probe) apart.
And then add separate TX/RX queues in internals->txq[] and 
internals->rxq[] respectively.
What do you think?

[1] d00d7cc88335 ("ethdev: release queue before setting up")
[2] http://dpdk.org/ml/archives/dev/2017-January/056470.html
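
One way to picture the layout proposed above is the sketch below. It is only an illustration under stated assumptions, not the tap PMD's implementation: `struct split_internals`, `SPLIT_MAX_QUEUES` and all `split_*()` helpers are hypothetical, and `split_alloc_fd()` opens /dev/null as a stand-in for tun_alloc(). The first descriptor is kept apart, and each Rx and Tx queue owns its own descriptor, so releasing one queue does not affect the others.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define SPLIT_MAX_QUEUES 4	/* hypothetical limit for this sketch */

struct split_internals {
	int probe_fd;			/* first fd, kept apart from queues */
	int rxq_fd[SPLIT_MAX_QUEUES];	/* one fd per Rx queue */
	int txq_fd[SPLIT_MAX_QUEUES];	/* one fd per Tx queue */
};

/* Stand-in for tun_alloc(): opens /dev/null, not /dev/net/tun. */
static int
split_alloc_fd(void)
{
	return open("/dev/null", O_RDWR);
}

/* Mark every queue slot unused and allocate the probe-time descriptor. */
static int
split_probe(struct split_internals *in)
{
	int i;

	assert(in != NULL);
	for (i = 0; i < SPLIT_MAX_QUEUES; i++) {
		in->rxq_fd[i] = -1;
		in->txq_fd[i] = -1;
	}
	in->probe_fd = split_alloc_fd();
	return in->probe_fd < 0 ? -1 : 0;
}

/* Allocate a descriptor for the slot only if it does not have one. */
static int
split_queue_setup(int *slot)
{
	if (*slot < 0)
		*slot = split_alloc_fd();
	return *slot;
}

/* Close the slot's descriptor, if any, and mark the slot unused. */
static void
split_queue_release(int *slot)
{
	if (*slot >= 0) {
		close(*slot);
		*slot = -1;
	}
}

/* Returns 1 if fd refers to an open file description, 0 otherwise. */
static int
split_fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}
```

With this arrangement, calling `split_queue_release(&in.rxq_fd[0])` after setting up Rx and Tx queue 0 leaves both `in.txq_fd[0]` and `in.probe_fd` open.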
  
Wiles, Keith Jan. 30, 2017, 3:04 p.m. UTC | #4
> On Jan 30, 2017, at 8:38 AM, Pascal Mazon <pascal.mazon@6wind.com> wrote:
> 
> On 01/30/2017 12:00 PM, Ferruh Yigit wrote:
>> On 1/29/2017 2:12 AM, Keith Wiles wrote:
>>> The tap driver setup both rx and tx file descriptors when the
>>> rte_eth_rx_queue_setup() causing the tx to be closed when tx setup
>>> was called.
>> 
>> Can you please describe the problem more.
>> Without this patch rx->fd == tx->fd, with this patch rx and tx has
>> different file descriptors.
>> 
>> What was the wrong with rx and tx having same fd?
>> 
>> As far as I can see, rte_eth_rx_queue_setup() won't close tx->fd, that
>> function will do nothing if rx or tx has valid fd.
>> 
>>> 
>>> Signed-off-by: Keith Wiles <keith.wiles@intel.com>
>> 
>> <...>
>> 
> 
> Hi,
> 
> The tap PMD recently broke for me because of this patch [1].
> 
> During init (eth_dev_tap_create()), the tap PMD allocates a shared RX/TX queue through tun_alloc().
> The recent patch now releases existing queues in rx_queue_setup(), before adding new ones.
> 
> When rx_queue_setup() is called, it uses close() calls on all shared queues, effectively deleting the netdevice.
> That's the main issue here.
> 
> I tested Keith's patch [2], and it fixes that issue, using separate queues.
> 
> There is however a couple of other queues-related issues in the tap PMD, but I'm not sure how to address them properly:
> 
> 1. internals->fds[] gets filled only with RX queues (appart from index 0 that is common to both RX and TX).
>   It means that RX queues only will be deleted when calling rte_pmd_tap_remove() or tap_tx_queue_release().
> 
> 2. tap_dev_stop() is not symmetrical with tap_dev_start(): queues won't get re-created after a stop.
> 
> It may be best to keep the very first fd (created with tun_alloc() in eth_dev_tap_create() during probe) apart.
> And then add separate TX/RX queues in internals->txq[] and internals->rxq[] respectively.
> What do you think?
> 
> [1] d00d7cc88335 ("ethdev: release queue before setting up")
> [2] http://dpdk.org/ml/archives/dev/2017-January/056470.html

Let's keep the current patch just to get over the current problem, if everyone agrees. I will address the comments Pascal brings up in a later update to the TAP PMD, or I can try to get the other issues cleaned up.

> 
> 
> -- 
> Pascal Mazon
> www.6wind.com

Regards,
Keith
  
Ferruh Yigit Jan. 30, 2017, 5:19 p.m. UTC | #5
On 1/30/2017 2:38 PM, Pascal Mazon wrote:
> On 01/30/2017 12:00 PM, Ferruh Yigit wrote:
>> On 1/29/2017 2:12 AM, Keith Wiles wrote:
>>> The tap driver setup both rx and tx file descriptors when the
>>> rte_eth_rx_queue_setup() causing the tx to be closed when tx setup
>>> was called.
>>
>> Can you please describe the problem more.
>> Without this patch rx->fd == tx->fd, with this patch rx and tx has
>> different file descriptors.
>>
>> What was the wrong with rx and tx having same fd?
>>
>> As far as I can see, rte_eth_rx_queue_setup() won't close tx->fd, that
>> function will do nothing if rx or tx has valid fd.
>>
>>>
>>> Signed-off-by: Keith Wiles <keith.wiles@intel.com>
>>
>> <...>
>>
> 
> Hi,
> 
> The tap PMD recently broke for me because of this patch [1].
> 
> During init (eth_dev_tap_create()), the tap PMD allocates a shared RX/TX 
> queue through tun_alloc().
> The recent patch now releases existing queues in rx_queue_setup(), 
> before adding new ones.
> 
> When rx_queue_setup() is called, it uses close() calls on all shared 
> queues, effectively deleting the netdevice.
> That's the main issue here.
> 
> I tested Keith's patch [2], and it fixes that issue, using separate queues.

Thanks for the clarification. I am adding the following tag to patch [2]:

Tested-by: Pascal Mazon <pascal.mazon@6wind.com>

<...>

> 
> [1] d00d7cc88335 ("ethdev: release queue before setting up")
> [2] http://dpdk.org/ml/archives/dev/2017-January/056470.html
> 
>
  
Ferruh Yigit Jan. 30, 2017, 5:42 p.m. UTC | #6
On 1/30/2017 2:34 PM, Wiles, Keith wrote:
> 
>> On Jan 30, 2017, at 5:00 AM, Yigit, Ferruh <ferruh.yigit@intel.com> wrote:
>>
>> On 1/29/2017 2:12 AM, Keith Wiles wrote:
>>> The tap driver setup both rx and tx file descriptors when the
>>> rte_eth_rx_queue_setup() causing the tx to be closed when tx setup
>>> was called.
>>
>> Can you please describe the problem more.
>> Without this patch rx->fd == tx->fd, with this patch rx and tx has
>> different file descriptors.
> 
> Let me look at this more, I am getting the same FD for both. Must be something else going on.

After the patch, tun_alloc() is called twice, once for Rx_q and once for Tx_q.
Since tun_alloc() does an open() on "/dev/net/tun", I expect they get different
file descriptors.

And if they have the same FD, won't this cause the same problem?
rx_queue_setup() will close the FD, and if Tx_q has the same FD it will have
an invalid descriptor.
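
The expectation that two open() calls yield different descriptors can be checked with a small sketch. The helper name `two_opens_are_distinct()` is hypothetical, and the usage note below assumes /dev/null as a stand-in for "/dev/net/tun", which usually requires privileges to open.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Open the same path twice and report whether the two descriptors
 * differ. Returns 1 if they are distinct, 0 if they are equal, and
 * -1 if either open() fails.
 */
static int
two_opens_are_distinct(const char *path)
{
	int fd_a, fd_b, distinct;

	assert(path != NULL);
	fd_a = open(path, O_RDWR);
	if (fd_a < 0)
		return -1;
	fd_b = open(path, O_RDWR);
	if (fd_b < 0) {
		close(fd_a);
		return -1;
	}
	/* Both are open at once, so the kernel must hand out two numbers. */
	distinct = (fd_a != fd_b);
	close(fd_a);
	close(fd_b);
	return distinct;
}
```

`two_opens_are_distinct("/dev/null")` returns 1: while the first descriptor is still open, a second open() cannot return the same number.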

> 
>>
>> What was the wrong with rx and tx having same fd?
>>
>> As far as I can see, rte_eth_rx_queue_setup() won't close tx->fd, that
>> function will do nothing if rx or tx has valid fd.
> 
> The rte_eth_rx_queue_setup() look at line 1146 if rxq has a value then release it, which happens on both Rx/Tx code
> 
> 	rxq = dev->data->rx_queues;
> 	if (rxq[rx_queue_id]) {
> 		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
> 					-ENOTSUP);
> 		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
> 		rxq[rx_queue_id] = NULL;
> 	}

Got it, thanks. I missed the (relatively new) code above.

> 
> 	if (rx_conf == NULL)
> 		rx_conf = &dev_info.default_rxconf;
> 
> 	ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc,
> 					      socket_id, rx_conf, mp);
> 
>>
>>>
>>> Signed-off-by: Keith Wiles <keith.wiles@intel.com>
>>
>> <...>
> 
> Regards,
> Keith
>
  
Wiles, Keith Jan. 30, 2017, 6:20 p.m. UTC | #7
> On Jan 30, 2017, at 11:42 AM, Yigit, Ferruh <ferruh.yigit@intel.com> wrote:
> 
> On 1/30/2017 2:34 PM, Wiles, Keith wrote:
>> 
>>> On Jan 30, 2017, at 5:00 AM, Yigit, Ferruh <ferruh.yigit@intel.com> wrote:
>>> 
>>> On 1/29/2017 2:12 AM, Keith Wiles wrote:
>>>> The tap driver setup both rx and tx file descriptors when the
>>>> rte_eth_rx_queue_setup() causing the tx to be closed when tx setup
>>>> was called.
>>> 
>>> Can you please describe the problem more.
>>> Without this patch rx->fd == tx->fd, with this patch rx and tx has
>>> different file descriptors.
>> 
>> Let me look at this more, I am getting the same FD for both. Must be something else going on.
> 
> After patch, tun_alloc() called twice, one for Rx_q and other for Tx_q.
> And tun_alloc does open() to "/dev/net/tun", I expect they get different
> file descriptors.

It is not called twice; it is only called once, in the eth_dev_tap_create() routine, and the same fd is placed in both the rxq and txq. The rx/tx_setup_queue routines then only update the fd, and call tun_alloc() only if the fd is -1. Looking at this code now, it seems a bit silly, but it was trying to deal with setting up the new queue. It seems to me this logic is not going to work with multiple queues in the same device and needs to be reworked.

I need to rework the code and do some cleanup. The current patch should work for a single queue per device.

Thanks

> 
> And if they have same FD, won't this cause same problem,
> rx_queue_setup() will close the FD, if Tx_q has same FD it will have
> invalid descriptor.
> 
>> 
>>> 
>>> What was the wrong with rx and tx having same fd?
>>> 
>>> As far as I can see, rte_eth_rx_queue_setup() won't close tx->fd, that
>>> function will do nothing if rx or tx has valid fd.
>> 
>> The rte_eth_rx_queue_setup() look at line 1146 if rxq has a value then release it, which happens on both Rx/Tx code
>> 
>> 	rxq = dev->data->rx_queues;
>> 	if (rxq[rx_queue_id]) {
>> 		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
>> 					-ENOTSUP);
>> 		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
>> 		rxq[rx_queue_id] = NULL;
>> 	}
> 
> Got it thanks, I missed (relatively new) above code piece.
> 
>> 
>> 	if (rx_conf == NULL)
>> 		rx_conf = &dev_info.default_rxconf;
>> 
>> 	ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc,
>> 					      socket_id, rx_conf, mp);
>> 
>>> 
>>>> 
>>>> Signed-off-by: Keith Wiles <keith.wiles@intel.com>
>>> 
>>> <...>
>> 
>> Regards,
>> Keith

Regards,
Keith
  

Patch

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index c0afc2d..267b421 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -406,32 +406,52 @@  tap_link_update(struct rte_eth_dev *dev __rte_unused,
 }
 
 static int
-tap_setup_queue(struct rte_eth_dev *dev,
+rx_setup_queue(struct rte_eth_dev *dev,
 		struct pmd_internals *internals,
 		uint16_t qid)
 {
 	struct rx_queue *rx = &internals->rxq[qid];
-	struct tx_queue *tx = &internals->txq[qid];
 	int fd;
 
 	fd = rx->fd;
 	if (fd < 0) {
-		fd = tx->fd;
+		RTE_LOG(INFO, PMD, "Add queue to TAP %s for qid %d\n",
+			dev->data->name, qid);
+		fd = tun_alloc(dev->data->name);
 		if (fd < 0) {
-			RTE_LOG(INFO, PMD, "Add queue to TAP %s for qid %d\n",
-				dev->data->name, qid);
-			fd = tun_alloc(dev->data->name);
-			if (fd < 0) {
-				RTE_LOG(ERR, PMD, "tun_alloc(%s) failed\n",
-					dev->data->name);
-				return -1;
-			}
+			RTE_LOG(ERR, PMD, "tun_alloc(%s) failed\n",
+				dev->data->name);
+			return -1;
 		}
 	}
 	dev->data->rx_queues[qid] = rx;
-	dev->data->tx_queues[qid] = tx;
 
 	rx->fd = fd;
+
+	return fd;
+}
+
+static int
+tx_setup_queue(struct rte_eth_dev *dev,
+		struct pmd_internals *internals,
+		uint16_t qid)
+{
+	struct tx_queue *tx = &internals->txq[qid];
+	int fd;
+
+	fd = tx->fd;
+	if (fd < 0) {
+		RTE_LOG(INFO, PMD, "Add queue to TAP %s for qid %d\n",
+			dev->data->name, qid);
+		fd = tun_alloc(dev->data->name);
+		if (fd < 0) {
+			RTE_LOG(ERR, PMD, "tun_alloc(%s) failed\n",
+				dev->data->name);
+			return -1;
+		}
+	}
+	dev->data->tx_queues[qid] = tx;
+
 	tx->fd = fd;
 
 	return fd;
@@ -469,7 +489,7 @@  tap_rx_queue_setup(struct rte_eth_dev *dev,
 		return -ENOMEM;
 	}
 
-	fd = tap_setup_queue(dev, internals, rx_queue_id);
+	fd = rx_setup_queue(dev, internals, rx_queue_id);
 	if (fd == -1)
 		return -1;
 
@@ -493,7 +513,7 @@  tap_tx_queue_setup(struct rte_eth_dev *dev,
 	if (tx_queue_id >= internals->nb_queues)
 		return -1;
 
-	ret = tap_setup_queue(dev, internals, tx_queue_id);
+	ret = tx_setup_queue(dev, internals, tx_queue_id);
 	if (ret == -1)
 		return -1;