[dpdk-dev] [PATCH v3 06/15] eal/soc: implement probing of drivers

Jan Viktorin viktorin at rehivetech.com
Mon Sep 19 13:34:04 CEST 2016


On Mon, 19 Sep 2016 12:17:53 +0530
Shreyansh Jain <shreyansh.jain at nxp.com> wrote:

> Hi Jan,
> 
> On Friday 16 September 2016 05:57 PM, Jan Viktorin wrote:
> > On Fri, 9 Sep 2016 14:13:50 +0530
> > Shreyansh Jain <shreyansh.jain at nxp.com> wrote:
> >  
> >> Each SoC PMD registers a set of callbacks for scanning its own bus/infra and
> >> matching devices to drivers when probe is called.
> >> This patch introduces the infra for calls to SoC scan on rte_eal_soc_init()
> >> and match on rte_eal_soc_probe().
> >>
> >> Patch also adds test case for scan and probe.
> >>
> >> Signed-off-by: Jan Viktorin <viktorin at rehivetech.com>
> >> Signed-off-by: Shreyansh Jain <shreyansh.jain at nxp.com>
> >> Signed-off-by: Hemant Agrawal <hemant.agrawal at nxp.com>
> >> ---
> >>  app/test/test_soc.c                             | 138 ++++++++++++++-
> >>  lib/librte_eal/bsdapp/eal/rte_eal_version.map   |   4 +
> >>  lib/librte_eal/common/eal_common_soc.c          | 215 ++++++++++++++++++++++++
> >>  lib/librte_eal/common/include/rte_soc.h         |  51 ++++++
> >>  lib/librte_eal/linuxapp/eal/eal.c               |   5 +
> >>  lib/librte_eal/linuxapp/eal/eal_soc.c           |  16 ++
> >>  lib/librte_eal/linuxapp/eal/rte_eal_version.map |   4 +
> >>  7 files changed, 432 insertions(+), 1 deletion(-)
> >>

[...]

> 
> >  
> >> +static void test_soc_scan_dev0_cb(void);  
> >
> > Similar here, something like "match_by_name".
> >  
> >> +static int test_soc_match_dev0_cb(struct rte_soc_driver *drv,
> >> +				  struct rte_soc_device *dev);  
> >
> > I prefer an empty line here.  
> 
> Do we really place newlines in function declarations? That doesn't 
> really help anything, until and unless some comments are added to those. 
> Anyways, rather than adding blank lines, I will add some comments - those 
> are indeed missing.

It took me a while to parse those lines... If they are logically grouped,
it'd be ok. Comments might be helpful. However, here these are forward
declarations so it's a question whether to put comments here or to the
implementations below.

> 
> >
> >
> > ditto...  
> 
> Will add comments.
> 
> >  
> >> +static void test_soc_scan_dev1_cb(void);  
> >
> > ditto...  
> 
> Same here, I prefer comment rather than blank line.
> 
> >  

[...]

> >>
> >> +/* Test Probe (scan and match) functionality */
> >> +static int
> >> +test_soc_init_and_probe(void)  
> >
> > You say to test scan and match. I'd prefer to reflect this in the name
> > of the test. Otherwise, it seems you are testing init and probe which
> > is not true, I think.  
> 
> I agree. I will update the name of the function.
> 
> >
> > Do you test that "match principle works" or that "match functions are OK"
> > or "match functions are called as expected", ...?  
> 
> "match functions are called as expected"

OK, but there is no assert that says "yes, the match function has been called".
In other words, it is not an automatic test and it does not help to verify
that the code is working.

I think that you should test that a particular match function succeeds or not.
So again, I don't consider this to be a test. It does not verify anything.

> The model for the patchset was to allow PMDs to write their own match 
> and hence, verifying a particular match is not definitive. Rather, the 

If you want to verify a particular match implementation then there should
be a particular test verifying that implementation, e.g. test_match_compatible(),
test_match_proprietary(), test_match_by_name().

However, this is testing rte_eal_soc_probe (at least, I understand it that way).
The probe iterates over devices and drivers and matches them. Thus, the argument
"a particular match is not definitive" seems to be irrelevant here. You should build
a testing match function like "match_always" that verifies that the probe is working,
not that the "match" is working.
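
A minimal sketch of such a test follows. It is not DPDK code: the mock_*
types, match_always(), match_never() and mock_probe() are hypothetical
stand-ins for rte_soc_driver, rte_soc_device and rte_eal_soc_probe, assuming
the "0 means match" convention of the quoted rte_eal_soc_match(). The point
is only that the test counts the callback invocations and checks the binding,
so it can actually fail.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct mock_drv;
struct mock_dev;

/* Assumed convention: 0 = match, non-zero = no match. */
typedef int (*mock_match_t)(struct mock_drv *drv, struct mock_dev *dev);

struct mock_drv {
	TAILQ_ENTRY(mock_drv) next;
	mock_match_t match_fn;
	int match_calls;		/* number of times match_fn was invoked */
};

struct mock_dev {
	TAILQ_ENTRY(mock_dev) next;
	struct mock_drv *driver;	/* driver bound by the probe, or NULL */
};

TAILQ_HEAD(mock_drv_list, mock_drv);
TAILQ_HEAD(mock_dev_list, mock_dev);

static int
match_always(struct mock_drv *drv, struct mock_dev *dev)
{
	(void)dev;
	drv->match_calls++;
	return 0;
}

static int
match_never(struct mock_drv *drv, struct mock_dev *dev)
{
	(void)dev;
	drv->match_calls++;
	return 1;
}

/* Simplified probe loop: bind each device to the first driver that matches. */
static void
mock_probe(struct mock_drv_list *drvs, struct mock_dev_list *devs)
{
	struct mock_dev *dev;
	struct mock_drv *drv;

	TAILQ_FOREACH(dev, devs, next) {
		TAILQ_FOREACH(drv, drvs, next) {
			assert(drv->match_fn != NULL);
			if (drv->match_fn(drv, dev) == 0) {
				dev->driver = drv;
				break;
			}
		}
	}
}

/* Returns 0 when every expectation holds and -1 otherwise. */
int
test_probe_with_match_always(void)
{
	struct mock_drv_list drvs;
	struct mock_dev_list devs;
	struct mock_drv never = { .match_fn = match_never };
	struct mock_drv always = { .match_fn = match_always };
	struct mock_dev dev0 = { .driver = NULL };
	struct mock_dev dev1 = { .driver = NULL };

	TAILQ_INIT(&drvs);
	TAILQ_INIT(&devs);
	TAILQ_INSERT_TAIL(&drvs, &never, next);
	TAILQ_INSERT_TAIL(&drvs, &always, next);
	TAILQ_INSERT_TAIL(&devs, &dev0, next);
	TAILQ_INSERT_TAIL(&devs, &dev1, next);

	mock_probe(&drvs, &devs);

	/* Both match functions must have been consulted once per device. */
	if (never.match_calls != 2 || always.match_calls != 2)
		return -1;
	/* Every device must end up bound to the driver that matched. */
	if (dev0.driver != &always || dev1.driver != &always)
		return -1;
	return 0;
}
```

A real test in app/test would register such drivers through the EAL lists
instead of local ones; the local lists just keep the sketch self-contained.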

> test case simply confirms that a SoC based PMD would be able to 

It does not confirm anything from my point of view. You *always* print "successful"
at the end of this test (see below).

> implement its own match/scan and these would be called from EAL as expected.
> 
> >  
> >> +{
> >> +	struct rte_soc_driver *drv;
> >> +
> >> +	/* Registering dummy drivers */
> >> +	rte_eal_soc_register(&empty_pmd0.soc_drv);
> >> +	rte_eal_soc_register(&empty_pmd1.soc_drv);
> >> +	/* Assuming that test_register_unregister is working, not verifying
> >> +	 * that drivers are indeed registered
> >> +	*/
> >> +
> >> +	/* rte_eal_soc_init is called by rte_eal_init, which in turn calls the
> >> +	 * scan_fn of each driver.

So, I'd comment this as something like:

"mimic rte_eal_soc_init to prepare for the rte_eal_soc_probe"

> >> +	 */
> >> +	TAILQ_FOREACH(drv, &soc_driver_list, next) {
> >> +		if (drv && drv->scan_fn)
> >> +			drv->scan_fn();
> >> +	}  
> >
> > Here, I suppose you mimic the rte_eal_soc_init?  
> 
> Yes.
> 
> >  
> >> +
> >> +	/* rte_eal_init() would perform other inits here */
> >> +
> >> +	/* Probe would link the SoC devices<=>drivers */
> >> +	rte_eal_soc_probe();
> >> +
> >> +	/* Unregistering dummy drivers */
> >> +	rte_eal_soc_unregister(&empty_pmd0.soc_drv);
> >> +	rte_eal_soc_unregister(&empty_pmd1.soc_drv);
> >> +
> >> +	free(empty_pmd0.soc_dev.addr.name);
> >> +
> >> +	printf("%s has been successful\n", __func__);  
> >
> > How do you detect that it is unsuccessful? Is it possible to fail in this test?
> > A test that can never fail is in fact not a test :).  
> 
> The design assumption for SoC patchset was: A PMD's scan is called to 
> find devices on its bus (PMD ~ bus). Whether devices are found or not, 
> is irrelevant to EAL - whether that is because of error or actually no 
> devices were available.
> With the above logic, no 'success/failure' is checked in the test. It is 
> simply a verification of EAL's ability to link the PMD with it 
> (scan/match function pointers).

I am sorry, I disagree. You always print "successful". The only way to fail
here is a SIGSEGV or other very serious system failure. But we test rte_eal_soc_probe
and not system failures.

> 
> >  
> >> +	return 0;
> >> +}
> >> +
> >>  /* save real devices and drivers until the tests finishes */

[...]

> >> diff --git a/lib/librte_eal/common/eal_common_soc.c b/lib/librte_eal/common/eal_common_soc.c
> >> index 5dcddc5..bb87a67 100644
> >> --- a/lib/librte_eal/common/eal_common_soc.c
> >> +++ b/lib/librte_eal/common/eal_common_soc.c
> >> @@ -36,6 +36,8 @@
> >>  #include <sys/queue.h>
> >>
> >>  #include <rte_log.h>
> >> +#include <rte_common.h>
> >> +#include <rte_soc.h>
> >>
> >>  #include "eal_private.h"
> >>
> >> @@ -45,6 +47,213 @@ struct soc_driver_list soc_driver_list =
> >>  struct soc_device_list soc_device_list =
> >>  	TAILQ_HEAD_INITIALIZER(soc_device_list);
> >>
> >> +/* Default SoC device<->Driver match handler function */  
> >
> > I think this comment is redundant. All this is already said in the rte_soc.h.  
> 
> Ok. I will remove it from here and if need be, update the rte_soc.h to 
> have elaborate comments.
> 
> >  
> >> +int
> >> +rte_eal_soc_match(struct rte_soc_driver *drv, struct rte_soc_device *dev)
> >> +{
> >> +	int i, j;
> >> +
> >> +	RTE_VERIFY(drv != NULL && drv->id_table != NULL);
> >> +	RTE_VERIFY(dev != NULL && dev->id != NULL);
> >> +
> >> +	for (i = 0; drv->id_table[i].compatible; ++i) {
> >> +		const char *drv_compat = drv->id_table[i].compatible;
> >> +
> >> +		for (j = 0; dev->id[j].compatible; ++j) {
> >> +			const char *dev_compat = dev->id[j].compatible;
> >> +
> >> +			if (!strcmp(drv_compat, dev_compat))
> >> +				return 0;
> >> +		}
> >> +	}
> >> +
> >> +	return 1;
> >> +}
> >> +

A redundant empty line here...

> >> +
> >> +static int
> >> +rte_eal_soc_probe_one_driver(struct rte_soc_driver *drv,
> >> +			     struct rte_soc_device *dev)
> >> +{
> >> +	int ret = 1;
> >> +  
> >
> > I think, the RTE_VERIFY(dev->match_fn) might be good here.
> > It avoids any doubts about the validity of the pointer.  
> 
> That has already been done in rte_eal_soc_register which is called when 
> PMDs are registering themselves through DRIVER_REGISTER_SOC. That would 
> prevent any PMD leaking through to this stage without a proper 
> match_fn/scan_fn.

Well, yes. It seems to be redundant. However, it would emphasize the fact
that this function expects that match_fn is set.

In the rte_eal_soc_register, the RTE_VERIFY says "The API requires those".

But when I review I do not always see all the context. It is not safe for
me to assume that there was probably some RTE_VERIFY in the path... It is
not a fast path so it does not hurt the performance in any way.
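
A small sketch of what is meant, using mock types rather than the real
rte_soc structures. MOCK_VERIFY, match_by_name(), mock_probe_one_driver() and
demo_probe() are hypothetical names, and assert() is only a stand-in for
RTE_VERIFY here. The check restates the contract right where the pointer is
used, and a no-match is returned as a normal result rather than treated as a
failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mock_drv;
struct mock_dev;

/* Assumed convention: 0 = match, non-zero = no match. */
typedef int (*mock_match_t)(struct mock_drv *drv, struct mock_dev *dev);

struct mock_drv {
	const char *name;
	mock_match_t match_fn;
};

struct mock_dev {
	const char *name;
};

/* Stand-in for RTE_VERIFY; assert() is used only to keep this standalone. */
#define MOCK_VERIFY(exp) assert(exp)

static int
match_by_name(struct mock_drv *drv, struct mock_dev *dev)
{
	return strcmp(drv->name, dev->name) != 0;
}

/* Returns 0 when the driver matches the device, 1 when it does not. */
static int
mock_probe_one_driver(struct mock_drv *drv, struct mock_dev *dev)
{
	/* Redundant with the check at registration, but documents the contract. */
	MOCK_VERIFY(drv != NULL && dev != NULL);
	MOCK_VERIFY(drv->match_fn != NULL);

	if (drv->match_fn(drv, dev) != 0)
		return 1;	/* no match: not an error, just skip this driver */

	/* ... the driver's init callback would be called here ... */
	return 0;
}

/* Helper for trying the function with a name-based match. */
int
demo_probe(const char *drv_name, const char *dev_name)
{
	struct mock_drv drv = { drv_name, match_by_name };
	struct mock_dev dev = { dev_name };

	return mock_probe_one_driver(&drv, &dev);
}
```

With these definitions, demo_probe("soc0", "soc0") reports a match and
demo_probe("soc0", "soc1") reports a no-match.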

> 
> >  
> >> +	ret = drv->match_fn(drv, dev);
> >> +	if (ret) {
> >> +		RTE_LOG(DEBUG, EAL,
> >> +			" match function failed, skipping\n");  
> >
> > Is this a failure? I think it is not. Failure would be if the match
> > function cannot execute correctly. This is more like "no-match".  
> 
> The log message is misleading. This is _not_ a failure but simply a 
> 'no-match'. I will update this.
> 
> >
> > When debugging, I'd rather see a message like "driver <name> does not match".
> 
> Problem would be about '<name>' of a driver. There is already another 
> discussion about SoC capability/platform bus definitions - probably I 
> will wait for that so as to define what a '<name>' for a driver and 
> device is.
> In this case, the key reason for not adding such a message was because 
> it was assumed PMDs are black boxes with EAL not even assuming what 
> '<name>' means. Anyways, it is better to discuss these things in that 
> other email.

I am not sure which thread you mean... Can you point me there, please?

> 
> >  
> >> +		return ret;

[...]

> >> +
> >> +int
> >> +rte_eal_soc_probe_one(const struct rte_soc_addr *addr)
> >> +{
> >> +	struct rte_soc_device *dev = NULL;
> >> +	int ret = 0;
> >> +
> >> +	if (addr == NULL)
> >> +		return -1;
> >> +
> >> +	/* unlike pci, in case of soc, it is the responsibility of the soc driver
> >> +	 * to check during init whether device has been updated since last add.  
> >
> > Why? Can you give a more detailed explanation?  
> 
> For this patch, I have _not_ assumed anything for a SoC's 
> bus/driver/device model. In absence of a proper standard, each SoC is 
> unique - categorizing all SoC under a platform bus, for example, would 
> only mean assuming platform bus is a standard.
> Best judge for the layout of SoC devices is the SoC PMD (which is also 
> like a bus driver, other than being a device driver).
> 
> Once again, if the discussion in other thread comes to a logical 
> conclusion, this would get updated.

Again, I am not sure which thread discusses this topic.

I just don't like the idea of leaving the update responsibility to the PMDs.
Maybe there could be an update callback in rte_soc_device (set or left unset
by the custom scan function) that would be called here.
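
A rough sketch of that idea, with mock types only. The update_fn member,
update_ok(), update_fail(), mock_probe_one() and demo_probe_with() are all
hypothetical; nothing like this exists in the patch. The scan function may
set the callback per device, and the probe path calls it when present.

```c
#include <assert.h>
#include <stddef.h>

struct mock_dev {
	const char *name;
	/* Optional callback set by the custom scan function; NULL when unused. */
	int (*update_fn)(struct mock_dev *dev);
	int refreshed;		/* number of successful refreshes */
};

/* Example callback: pretend to re-read the device state. */
int
update_ok(struct mock_dev *dev)
{
	dev->refreshed++;
	return 0;
}

/* Example callback: the device could not be refreshed. */
int
update_fail(struct mock_dev *dev)
{
	(void)dev;
	return -1;
}

/* Returns 0 on success, negative when the update callback fails. */
static int
mock_probe_one(struct mock_dev *dev)
{
	assert(dev != NULL);

	/* Devices without a callback are probed as they were scanned. */
	if (dev->update_fn != NULL && dev->update_fn(dev) < 0)
		return -1;

	/* ... matching against the registered drivers would follow ... */
	return 0;
}

/*
 * Probe a fresh device with the given callback. Returns the number of
 * refreshes performed on success, or the negative error from the probe.
 */
int
demo_probe_with(int (*update_fn)(struct mock_dev *dev))
{
	struct mock_dev dev = { "soc0", update_fn, 0 };
	int ret = mock_probe_one(&dev);

	return ret < 0 ? ret : dev.refreshed;
}
```

This keeps the decision in the common code: a PMD that needs no refresh
simply leaves the callback unset.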

> 
> >  
> >> +	 */
> >> +
> >> +	TAILQ_FOREACH(dev, &soc_device_list, next) {
> >> +		if (rte_eal_compare_soc_addr(&dev->addr, addr))
> >> +			continue;
> >> +
> >> +		ret = soc_probe_all_drivers(dev);
> >> +		if (ret < 0)
> >> +			goto err_return;
> >> +		return 0;
> >> +	}
> >> +	return -1;
> >> +
> >> +err_return:
> >> +	RTE_LOG(WARNING, EAL,
> >> +		"Requested device %s cannot be used\n", addr->name);
> >> +	return -1;
> >> +}
> >> +
> >> +/*
> >> + * Scan the SoC devices and call the devinit() function for all registered
> >> + * drivers that have a matching entry in its id_table for discovered devices.
> >> + */  
> >
> > Should be in header. Here it is redundant.  
> 
> Ok. I will move to rte_soc.h.
> 
> >  
> >> +int
> >> +rte_eal_soc_probe(void)
> >> +{
> >> +	struct rte_soc_device *dev = NULL;
> >> +	int ret = 0;
> >> +
> >> +	TAILQ_FOREACH(dev, &soc_device_list, next) {
> >> +		ret = soc_probe_all_drivers(dev);
> >> +		if (ret < 0)
> >> +			rte_exit(EXIT_FAILURE, "Requested device %s"
> >> +				 " cannot be used\n", dev->addr.name);
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>  /* dump one device */
> >>  static int
> >>  soc_dump_one_device(FILE *f, struct rte_soc_device *dev)
> >> @@ -79,6 +288,12 @@ rte_eal_soc_dump(FILE *f)
> >>  void
> >>  rte_eal_soc_register(struct rte_soc_driver *driver)
> >>  {
> >> +	/* For a valid soc driver, match and scan function
> >> +	 * should be provided.
> >> +	 */  
> >
> > This comment should be in the header file.  
> 
> Actually there is no valuable addition made by this comment. RTE_VERIFY 
> is self explanatory. I will remove the comment altogether.

No, the comment must be present for the rte_eal_soc_register function as
its documentation. The RTE_VERIFY is not an excuse, it just verifies the
fact that the caller understands the documentation and that she didn't make
a mistake.

> 
> >  
> >> +	RTE_VERIFY(driver != NULL);
> >> +	RTE_VERIFY(driver->match_fn != NULL);
> >> +	RTE_VERIFY(driver->scan_fn != NULL);
> >>  	TAILQ_INSERT_TAIL(&soc_driver_list, driver, next);
> >>  }
> >>
> >> diff --git a/lib/librte_eal/common/include/rte_soc.h b/lib/librte_eal/common/include/rte_soc.h
> >> index c6f98eb..bfb49a2 100644
> >> --- a/lib/librte_eal/common/include/rte_soc.h
> >> +++ b/lib/librte_eal/common/include/rte_soc.h
> >> @@ -97,6 +97,16 @@ typedef int (soc_devinit_t)(struct rte_soc_driver *, struct rte_soc_device *);
> >>  typedef int (soc_devuninit_t)(struct rte_soc_device *);
> >>
> >>  /**
> >> + * SoC device scan callback, called from rte_eal_soc_init.  
> >
> > Can you explain what is the goal of the callback?
> > What is the expected behaviour.  
> 
> EAL would call the scan of each registered SoC PMD 
> (DRIVER_REGISTER_SOC). This scan is responsible for finding devices on 
> SoC's specific bus and add them to SoC device_list. This is a callback 
> because SoC don't have a generalization like PCI. A SoC is not 
> necessarily a platform bus either (what original patch series assumed).

In doc comment...

> 
> >
> > It returns void so it seems it can never fail. Is this correct?
> > I can imagine that to scan for devices, I need to check some file-system
> > structure which can be unavailable...  
> 
> This is what I had in mind:
> That is true, it never fails. It is expected that scan function simply 
> ignores (logs error) and moves ahead. A local error for a particular SoC 
> (I agree, there might not be more than one SoC) doesn't necessarily mean 
> that complete DPDK Application should quit. It only means that 
> application user should get some error/warning/message about failure.

I understand, this is OK then.

> 
> >  
> >> + */
> >> +typedef void (soc_scan_t)(void);  
> >
> > You are missing the '*' in (*soc_scan_t).  
> 
> That was put in the definition in the rte_soc_driver - but, I see you 
> have already commented there. I will add the '*' here and remove from there.
> 
> >  
> >> +
> >> +/**
> >> + * Custom device<=>driver match callback for SoC  
> >
> > Can you explain the semantics (return values), please?  
> 
> rte_soc.h already has explanation on the expected semantics over 
> rte_eal_soc_match - the default implementation. But, I agree, it should 
> be above this declaration.

True ;).

> 
> >  
> >> + */
> >> +typedef int (soc_match_t)(struct rte_soc_driver *, struct rte_soc_device *);  
> >

[...]

> 
> >
> > I think, we should tell the users that scan_fn and match_fn must be always set
> > to something.  
> 
> How? I think it would be part of documentation, isn't it?

Yes. It should be documented in the comment for rte_eal_soc_init.
My comment was misplaced a bit...

> Also, rte_eal_soc_init() already enforces this check with RTE_VERIFY.
> 
> >  
> >>  	const struct rte_soc_id *id_table; /**< ID table, NULL terminated */
> >>  };
> >>
> >> @@ -146,6 +158,45 @@ rte_eal_compare_soc_addr(const struct rte_soc_addr *a0,
> >>  }
> >>
> >>  /**
> >> + * Default function for matching the Soc driver with device. Each driver can
> >> + * either use this function or define their own soc matching function.
> >> + * This function relies on the compatible string extracted from sysfs. But,
> >> + * a SoC might have different way of identifying its devices. Such SoC can
> >> + * override match_fn.
> >> + *
> >> + * @return
> >> + * 	 0 on success
> >> + *	-1 when no match found
> >> +  */
> >> +int
> >> +rte_eal_soc_match(struct rte_soc_driver *drv, struct rte_soc_device *dev);  
> >
> > What about naming it
> >
> > 	rte_eal_soc_match_default  
> 
> Ok.
> 
> >
> > or maybe better
> >
> > 	rte_eal_soc_match_compatible
> >
> > what do you think?  
> 
>  From what I had in mind - the discussion about SoC not necessarily 
> being a Platform bus - 'compatible' doesn't look fine to me. But again, 
> it is still open debate so - I will wait until that is concluded.

Why? The current implementation works this way:

int
rte_eal_soc_match(struct rte_soc_driver *drv, struct rte_soc_device *dev)
{
	int i, j;

	RTE_VERIFY(drv != NULL && drv->id_table != NULL);
	RTE_VERIFY(dev != NULL && dev->id != NULL);

	for (i = 0; drv->id_table[i].compatible; ++i) {
		const char *drv_compat = drv->id_table[i].compatible;

		for (j = 0; dev->id[j].compatible; ++j) {
			const char *dev_compat = dev->id[j].compatible;

			if (!strcmp(drv_compat, dev_compat))
				return 0;
		}
	}

	return 1;
}

It checks for compatible. So why not name it that way? If you provide
a match testing the name of devices then it can be named *_match_name.
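
To illustrate the naming, here is a standalone sketch with mock types (not the
real rte_soc structures). mock_match_compatible() follows the loop quoted
above, mock_match_name() is a hypothetical name-based variant, and the demo_*
objects and their strings are made up for the example. Both assume the same
"0 means match" convention as the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mock_id {
	const char *compatible;		/* table ends with a NULL entry */
};

struct mock_drv {
	const char *name;
	const struct mock_id *id_table;
};

struct mock_dev {
	const char *name;
	const struct mock_id *id;
};

/* Returns 0 when any compatible string is shared, 1 otherwise. */
int
mock_match_compatible(const struct mock_drv *drv, const struct mock_dev *dev)
{
	int i, j;

	assert(drv != NULL && drv->id_table != NULL);
	assert(dev != NULL && dev->id != NULL);

	for (i = 0; drv->id_table[i].compatible != NULL; ++i) {
		for (j = 0; dev->id[j].compatible != NULL; ++j) {
			if (strcmp(drv->id_table[i].compatible,
				   dev->id[j].compatible) == 0)
				return 0;
		}
	}

	return 1;
}

/* Returns 0 when driver and device carry the same name, 1 otherwise. */
int
mock_match_name(const struct mock_drv *drv, const struct mock_dev *dev)
{
	assert(drv != NULL && drv->name != NULL);
	assert(dev != NULL && dev->name != NULL);

	return strcmp(drv->name, dev->name) != 0;
}

/* Made-up example data: the device shares one compatible string. */
const struct mock_id demo_drv_ids[] = {
	{ "vendor,dev-a" }, { "vendor,dev-b" }, { NULL }
};
const struct mock_id demo_dev_ids[] = {
	{ "vendor,dev-b" }, { NULL }
};
const struct mock_drv demo_drv = { "net_demo", demo_drv_ids };
const struct mock_dev demo_dev = { "soc0", demo_dev_ids };
```

For the example data, the compatible-based match succeeds while the
name-based one does not, which is why the function name should say which
criterion is used.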

> 
> >  
> >> +
> >> +/**
> >> + * Probe SoC devices for registered drivers.
> >> + */
> >> +int rte_eal_soc_probe(void);
> >> +
> >> +/**
> >> + * Probe the single SoC device.
> >> + */
> >> +int rte_eal_soc_probe_one(const struct rte_soc_addr *addr);
> >> +
> >> +/**
> >> + * Close the single SoC device.
> >> + *
> >> + * Scan the SoC devices and find the SoC device specified by the SoC
> >> + * address, then call the devuninit() function for registered driver
> >> + * that has a matching entry in its id_table for discovered device.
> >> + *
> >> + * @param addr
> >> + *	The SoC address to close.
> >> + * @return
> >> + *   - 0 on success.
> >> + *   - Negative on error.
> >> + */
> >> +int rte_eal_soc_detach(const struct rte_soc_addr *addr);
> >> +
> >> +/**
> >>   * Dump discovered SoC devices.
> >>   */
> >>  void rte_eal_soc_dump(FILE *f);
> >> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> >> index 15c8c3d..147b601 100644
> >> --- a/lib/librte_eal/linuxapp/eal/eal.c
> >> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> >> @@ -70,6 +70,7 @@
> >>  #include <rte_cpuflags.h>
> >>  #include <rte_interrupts.h>
> >>  #include <rte_pci.h>
> >> +#include <rte_soc.h>
> >>  #include <rte_dev.h>
> >>  #include <rte_devargs.h>
> >>  #include <rte_common.h>
> >> @@ -881,6 +882,10 @@ rte_eal_init(int argc, char **argv)
> >>  	if (rte_eal_pci_probe())
> >>  		rte_panic("Cannot probe PCI\n");
> >>
> >> +	/* Probe & Initialize SoC devices */
> >> +	if (rte_eal_soc_probe())
> >> +		rte_panic("Cannot probe SoC\n");
> >> +
> >>  	rte_eal_mcfg_complete();
> >>
> >>  	return fctret;
> >> diff --git a/lib/librte_eal/linuxapp/eal/eal_soc.c b/lib/librte_eal/linuxapp/eal/eal_soc.c
> >> index 04848b9..5f961c4 100644
> >> --- a/lib/librte_eal/linuxapp/eal/eal_soc.c
> >> +++ b/lib/librte_eal/linuxapp/eal/eal_soc.c
> >> @@ -52,5 +52,21 @@
> >>  int
> >>  rte_eal_soc_init(void)
> >>  {
> >> +	struct rte_soc_driver *drv;
> >> +
> >> +	/* for debug purposes, SoC can be disabled */
> >> +	if (internal_config.no_soc)
> >> +		return 0;
> >> +
> >> +	/* For each registered driver, call their scan routine to perform any
> >> +	 * custom scan for devices (for example, custom buses)
> >> +	 */
> >> +	TAILQ_FOREACH(drv, &soc_driver_list, next) {  
> >
> > Is it possible to have drv->scan_fn == NULL? I suppose, this is invalid.
> > I'd prefer to have RTE_VERIFY for this check.  
> 
> rte_eal_soc_init() has this check already. Driver wouldn't even be 
> registered in case scan/match are not implemented.

True, but when reviewing or refactoring, you cannot always see all this context.
It is more defensive to explain here "don't worry, the scan_fn is always set here".

> 
> >  
> >> +		if (drv && drv->scan_fn) {
> >> +			drv->scan_fn();
> >> +			/* Ignore all errors from this */
> >> +		}  
> >  
> >> +	}
> >> +
> >>  	return 0;
> >>  }
> >> diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> >> index b9d1932..adcfe7d 100644
> >> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> >> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> >> @@ -179,5 +179,9 @@ DPDK_16.11 {
> >>  	rte_eal_soc_register;
> >>  	rte_eal_soc_unregister;
> >>  	rte_eal_soc_dump;
> >> +	rte_eal_soc_match;
> >> +	rte_eal_soc_detach;
> >> +	rte_eal_soc_probe;
> >> +	rte_eal_soc_probe_one;
> >>
> >>  } DPDK_16.07;  
> >
> > Regards
> > Jan
> >  
> 
> I hope I have covered all your comments. That was an exhaustive review. 
> Thanks a lot for your time.
> 
> Let's work to resolve the architectural issues revolving around SoC 
> scan/match.

;)

> 
> -
> Shreyansh



-- 
   Jan Viktorin                  E-mail: Viktorin at RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

