[dpdk-dev,v2] eal: don't fail secondary if primary is missing tailqs

Message ID 20161005174734.GC12182@labs.hpe.com (mailing list archive)
State Rejected, archived
Delegated to: Thomas Monjalon

Commit Message

Jean Tourrilhes Oct. 5, 2016, 5:47 p.m. UTC
  If the primary and secondary processes were built using different build
systems, the list of constructors included by the linker in each
binary might be different. Tailqs are registered via constructors, so
the linker magic will directly impact which tailqs are registered with
the primary and the secondary.

DPDK currently assumes that the secondary has a subset of the tailqs
registered at the primary. In some build scenarios, the secondary might
register a tailq that the primary did not register. In this case,
instead of exiting with a panic, just unregister the offending tailq
and allow the secondary to run.

Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
---
 lib/librte_eal/common/eal_common_tailqs.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)
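
To illustrate the behaviour described in the commit message, here is a
minimal standalone sketch, not DPDK code. Every demo_* name, the
DEMO_REGISTER macro and the primary_names table are hypothetical: the
table stands in for the tailqs the primary actually created in shared
memory, and primary_has() stands in for the lookup a real secondary
would perform. It assumes a GCC or Clang toolchain (for the constructor
attribute) and supplies TAILQ_FOREACH_SAFE where <sys/queue.h> does not
define it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/queue.h>

/* Some <sys/queue.h> versions (e.g. glibc) lack the _SAFE iterator. */
#ifndef TAILQ_FOREACH_SAFE
#define TAILQ_FOREACH_SAFE(var, head, field, tvar)              \
	for ((var) = TAILQ_FIRST(head);                         \
	     (var) && ((tvar) = TAILQ_NEXT(var, field), 1);     \
	     (var) = (tvar))
#endif

/* Hypothetical stand-in for struct rte_tailq_elem. */
struct demo_elem {
	const char *name;
	TAILQ_ENTRY(demo_elem) next;
};

TAILQ_HEAD(demo_list, demo_elem);
struct demo_list demo_head = TAILQ_HEAD_INITIALIZER(demo_head);

/* Each use emits a constructor that runs before main() and links one
 * element into the local list. Which constructors end up in a binary
 * depends on what the linker pulled in. */
#define DEMO_REGISTER(sym, str)                                     \
	static struct demo_elem sym = { .name = str };              \
	__attribute__((constructor)) static void sym##_ctor(void)   \
	{ TAILQ_INSERT_TAIL(&demo_head, &sym, next); }

DEMO_REGISTER(elem_ring, "RING")
DEMO_REGISTER(elem_mempool, "MEMPOOL")
DEMO_REGISTER(elem_extra, "EXTRA") /* not known to the "primary" */

/* Assumption: names the primary registered; a fixed table here. */
static const char *primary_names[] = { "RING", "MEMPOOL" };

static int primary_has(const char *name)
{
	size_t n = sizeof(primary_names) / sizeof(primary_names[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(primary_names[i], name) == 0)
			return 1;
	return 0;
}

/* Secondary-side init: keep elements the primary knows, unlink the
 * rest instead of failing. Returns the number of elements kept. */
int demo_init_secondary(void)
{
	struct demo_elem *t, *tmp;
	int kept = 0;

	/* The _SAFE variant saves the successor before the loop body
	 * runs, so removing t does not break the iteration. */
	TAILQ_FOREACH_SAFE(t, &demo_head, next, tmp) {
		assert(t->name != NULL);
		if (primary_has(t->name)) {
			kept++;
		} else {
			fprintf(stderr, "dropping tailq: %s\n", t->name);
			TAILQ_REMOVE(&demo_head, t, next);
		}
	}
	return kept;
}
```

Calling demo_init_secondary() once from main() leaves only the elements
whose names appear in primary_names on demo_head. The order in which
constructors run is not something this sketch relies on.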
  

Comments

Ferruh Yigit Dec. 21, 2018, 3:50 p.m. UTC | #1
On 10/5/2016 6:47 PM, jt at labs.hpe.com (Jean Tourrilhes) wrote:
> If the primary and secondary processes were built using different build
> systems, the list of constructors included by the linker in each
> binary might be different. Tailqs are registered via constructors, so
> the linker magic will directly impact which tailqs are registered with
> the primary and the secondary.
> 
> DPDK currently assumes that the secondary has a subset of the tailqs
> registered at the primary. In some build scenarios, the secondary might
> register a tailq that the primary did not register. In this case,
> instead of exiting with a panic, just unregister the offending tailq
> and allow the secondary to run.
> 
> Signed-off-by: Jean Tourrilhes <jt at labs.hpe.com>

A lot has changed in multiprocess support in the last two years, so I am
updating the status of this patch to 'Rejected'. If the issue is still valid,
can you please either send a new version or report the issue in Bugzilla?

Thanks,
ferruh
  

Patch

diff --git a/lib/librte_eal/common/eal_common_tailqs.c b/lib/librte_eal/common/eal_common_tailqs.c
index bb08ec8..cf5a771 100644
--- a/lib/librte_eal/common/eal_common_tailqs.c
+++ b/lib/librte_eal/common/eal_common_tailqs.c
@@ -143,6 +143,8 @@  rte_eal_tailq_update(struct rte_tailq_elem *t)
 		t->head = rte_eal_tailq_create(t->name);
 	} else {
 		t->head = rte_eal_tailq_lookup(t->name);
+		if (t->head != NULL)
+			rte_tailqs_count++;
 	}
 }
 
@@ -178,19 +180,27 @@  int
 rte_eal_tailqs_init(void)
 {
 	struct rte_tailq_elem *t;
+	void *tmp_te;
 
 	rte_tailqs_count = 0;
 
-	TAILQ_FOREACH(t, &rte_tailq_elem_head, next) {
+	TAILQ_FOREACH_SAFE(t, &rte_tailq_elem_head, next, tmp_te) {
 		/* second part of register job for "early" tailqs, see
 		 * rte_eal_tailq_register and EAL_REGISTER_TAILQ */
 		rte_eal_tailq_update(t);
 		if (t->head == NULL) {
 			RTE_LOG(ERR, EAL,
 				"Cannot initialize tailq: %s\n", t->name);
-			/* no need to TAILQ_REMOVE, we are going to panic in
-			 * rte_eal_init() */
-			goto fail;
+			if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+				/* no need to TAILQ_REMOVE, we are going
+				 * to panic in rte_eal_init() */
+				goto fail;
+			} else {
+				/* This means our list of constructors is
+				 * not the same as the primary's. Just remove
+				 * the missing tailq and continue */
+				TAILQ_REMOVE(&rte_tailq_elem_head, t, next);
+			}
 		}
 	}