When calling tap_mp_attach_queues(), secondary assumes that if a response has arrived from the primary, the call was successful. When the primary encounters an error in tap_mp_sync_queues(), it will simply return -1 without sending a reply. There is no way for a primary to indicate that it has received a processed the request, but have encountered an error while doing so. This leads to a situation where the secondary will timeout on waiting for primary to respond if the primary has encountered failure. The correct usage of IPC is for primary to respond with a status code, and for secondary to check whether the status code matches success or failure.
Did not look is this a failure of the TAP driver or the rte_mp_XYZ APIs?
This is TAP driver using the rte_mp_* API's wrongly.
To provide some more context, rte_mp_sync_request will expect a response from the peer process. Simply returning -1 from a callback without sending a response will not, in an of itself, send an "error" message to the other side - instead, it will simply not send a response and cause the other side to timeout while waiting for said response. The response needs to be sent explicitly even in cases of failure. Perhaps this should be better documented in the EAL guide, but it's still a bug in the TAP driver :)
Here is a patch for this problem not able to test, so compile only ATM. If you see something wrong let me know. From 2c99a5f60e99a0a4c117877a680caa160d8b77a1 Mon Sep 17 00:00:00 2001 From: Keith Wiles <keith.wiles@intel.com> Date: Fri, 26 Apr 2019 10:59:40 -0500 Subject: [PATCH] tap:fix mp reply on error bug-258 To: dpdk.org Signed-off-by: Keith Wiles <keith.wiles@intel.com> --- drivers/net/tap/rte_eth_tap.c | 60 ++++++++++++++++++++--------------- 1 file changed, 34 insertions(+), 26 deletions(-) diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c index 7f74b5dc9..5dec69172 100644 --- a/drivers/net/tap/rte_eth_tap.c +++ b/drivers/net/tap/rte_eth_tap.c @@ -2099,6 +2099,7 @@ tap_mp_attach_queues(const char *port_name, struct rte_eth_dev *dev) strlcpy(request_param->port_name, port_name, sizeof(request_param->port_name)); request.len_param = sizeof(*request_param); + /* Send request and receive reply */ ret = rte_mp_request_sync(&request, &replies, &timeout); if (ret < 0 || replies.nb_received != 1) { @@ -2108,7 +2109,9 @@ tap_mp_attach_queues(const char *port_name, struct rte_eth_dev *dev) } reply = &replies.msgs[0]; reply_param = (struct ipc_queues *)reply->param; - TAP_LOG(DEBUG, "Received IPC reply for %s", reply_param->port_name); + + TAP_LOG(DEBUG, "Received IPC reply for %s num FDs %d", + reply_param->port_name, reply->num_fds); /* Attach the queues from received file descriptors */ dev->data->nb_rx_queues = reply_param->rxq_count; @@ -2134,47 +2137,52 @@ tap_mp_sync_queues(const struct rte_mp_msg *request, const void *peer) struct ipc_queues *reply_param = (struct ipc_queues *)reply.param; uint16_t port_id; - int queue; int ret; + /* Reply with 0 fds and queues on error */ + reply.num_fds = 0; + reply_param->rxq_count = 0; + reply_param->txq_count = 0; + /* Get requested port */ TAP_LOG(DEBUG, "Received IPC request for %s", request_param->port_name); + ret = rte_eth_dev_get_port_by_name(request_param->port_name, &port_id); - if (ret) { - TAP_LOG(ERR, "Failed to get port id for %s", - request_param->port_name); - return -1; - } - dev = &rte_eth_devices[port_id]; - process_private = dev->process_private; + if (ret == 0) { + dev = &rte_eth_devices[port_id]; - /* Fill file descriptors for all queues */ - reply.num_fds = 0; - reply_param->rxq_count = 0; - for (queue = 0; queue < dev->data->nb_rx_queues; queue++) { - reply.fds[reply.num_fds++] = process_private->rxq_fds[queue]; - reply_param->rxq_count++; - } - RTE_ASSERT(reply_param->rxq_count == dev->data->nb_rx_queues); - RTE_ASSERT(reply_param->txq_count == dev->data->nb_tx_queues); - RTE_ASSERT(reply.num_fds <= RTE_MP_MAX_FD_NUM); + if ((dev->data->nb_rx_queues + dev->data->nb_tx_queues) <= RTE_MP_MAX_FD_NUM) { + int queue; - reply_param->txq_count = 0; - for (queue = 0; queue < dev->data->nb_tx_queues; queue++) { - reply.fds[reply.num_fds++] = process_private->txq_fds[queue]; - reply_param->txq_count++; - } + process_private = dev->process_private; + + /* Fill file descriptors for all queues */ + for (queue = 0; queue < dev->data->nb_rx_queues; queue++) { + reply.fds[reply.num_fds++] = process_private->rxq_fds[queue]; + reply_param->rxq_count++; + } + + for (queue = 0; queue < dev->data->nb_tx_queues; queue++) { + reply.fds[reply.num_fds++] = process_private->txq_fds[queue]; + reply_param->txq_count++; + } + } else + TAP_LOG(ERR, "Num RX/TX queues greater then RTE_MP_MAX_FD_NUM\n"); + } else + TAP_LOG(ERR, "Failed to get port id for %s", + request_param->port_name); /* Send reply */ strlcpy(reply.name, request->name, sizeof(reply.name)); strlcpy(reply_param->port_name, request_param->port_name, sizeof(reply_param->port_name)); reply.len_param = sizeof(*reply_param); + if (rte_mp_reply(&reply, peer) < 0) { TAP_LOG(ERR, "Failed to reply an IPC request to sync queues"); - return -1; + ret = -1; } - return 0; + return ret; } /* Open a TAP interface device. -- 2.19.1
I don't think reviewing patches on a bug tracker is a good idea :) However, i should note that there appears to be a number of unrelated changes (removing asserts, changing how port structure is received, etc.), which makes it harder to review than it needs to be. If there needs to be some cleanup, it would be good to split it over multiple patches.
Just putting the patch here as it is untested is the only reason for the code being here as it is pretty short simple patch. As for the asserts they were not required anymore and some of the asserts would not be triggered anyway unless some fatal corruption. The section needed to be restructured to handle the different cases leaving the code would have required to do this over multiple patches. Then the patch set would have needed to be applied as a series anyway and at some point it would have looked similar to this patch IMO. I do not know what port structure you are talking about I did not change any structures? For the last patch before shipping to the list I will look at teh code and determine if it really requires more than one patch.