[dpdk-dev] [PATCH 2/5] ethdev: add port ownership

Neil Horman nhorman at tuxdriver.com
Fri Dec 29 17:56:53 CET 2017


On Sat, Dec 23, 2017 at 10:36:34PM +0000, Matan Azrad wrote:
> Hi 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Friday, December 22, 2017 4:27 PM
> > To: Matan Azrad <matan at mellanox.com>
> > Cc: Thomas Monjalon <thomas at monjalon.net>; dev at dpdk.org; Bruce
> > Richardson <bruce.richardson at intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev at intel.com>; Gaëtan Rivet <gaetan.rivet at 6wind.com>;
> > Wu, Jingjing <jingjing.wu at intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Thu, Dec 21, 2017 at 09:57:43PM +0000, Matan Azrad wrote:
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > Sent: Thursday, December 21, 2017 10:14 PM
> > > > To: Matan Azrad <matan at mellanox.com>
> > > > Cc: Thomas Monjalon <thomas at monjalon.net>; dev at dpdk.org; Bruce
> > > > Richardson <bruce.richardson at intel.com>; Ananyev, Konstantin
> > > > <konstantin.ananyev at intel.com>; Gaëtan Rivet
> > > > <gaetan.rivet at 6wind.com>; Wu, Jingjing <jingjing.wu at intel.com>
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > > > > Hi
> > > > >
> > > <snip>
> > > > > > > > > I think we need to clearly describe what is the
> > > > > > > > > tread-safety policy in DPDK (especially in ethdev as a first
> > example).
> > > > > > > > > Let's start with obvious things:
> > > > > > > > >
> > > > > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > > > > 			- no planned change because of performance
> > > > > > purpose
> > > > > > > > > 	2/ The list of devices is racy
> > > > > > > > > 			- to be fixed with atomics
> > > > > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > > > > 			- the configurations are different per-device
> > > > > > > > > 	4/ The configuration of a given device is racy
> > > > > > > > > 			- can be managed by the owner of the device
> > > > > > > > > 	5/ The device ownership is racy
> > > > > > > > > 			- to be fixed with atomics
> > > > > > > > >
> > > > > > > > > What am I missing?
> > > > > > > > >
> > > > >
> > > > > Thank you Thomas for this order.
> > > > > Actually the port ownership is a good opportunity to redefine the
> > > > > synchronization rules in ethdev :)
> > > > >
> > > > > > > > There is fan out to consider here:
> > > > > > > >
> > > > > > > > 1) Is device configuration racy with ownership?  That is to
> > > > > > > > say, can I change ownership of a device safely while another
> > > > > > > > thread that currently owns it modifies its configuration?
> > > > > > >
> > > > > > > If an entity steals ownership to another one, either it is
> > > > > > > agreed earlier, or it is done by a central authority.
> > > > > > > When it is acked that ownership can be moved, there should not
> > > > > > > be any configuration in progress.
> > > > > > > So it is more a communication issue than a race.
> > > > > > >
> > > > > > But if thats the case (specifically that mutual exclusion
> > > > > > between port ownership and configuration is an exercize left to
> > > > > > an application developer), then port ownership itself is largely
> > > > > > meaningless within the dpdk, because the notion of who owns the
> > > > > > port needs to be codified within the application anyway.
> > > > > >
> > > > >
> > > > > Bruce, As I understand it, only the dpdk entity who took ownership
> > > > > of a
> > > > port successfully can configure the device by default, if other dpdk
> > > > entities want to configure it too they must to be synchronized with
> > > > the port owner while it is not recommended after the port ownership
> > integration.
> > > > >
> > > > Can you clarify what you mean by "it is not recommended after the
> > > > port ownership integration"?
> > >
> > > Sure,
> > > The new defining of ethdev synchronization doesn't recommend to
> > manage a port by 2 different dpdk entities, it can be done but not
> > recommended.
> > >
> > Ok, thats just not what you said above.  Your suggestion made it sound like
> > you thought that  after the integration of a port ownership model, that
> > multiple dpdk entries should not synchronize with one another, which made
> > no sense.
> > 
> Ok, I can see a dual meaning in my sentence, sorry for that, I think we agree here.
> 
No need to apologize, just trying to make sure we're on the same page, and yes,
I agree that we are in consensus here.

> > >
> > > > Can you elaborate on what you mean by level here?  Are you
> > > > envisioning a scheme in which multiple execution contexts might own
> > > > a port for various non-conflicting purposes?
> > >
> > > Sure,
> > > 1) Application with 2 threads wanting to configure the same port:
> > > 	level = application code.
> > >
> > > 	a. The main thread should create owner
> > identifier(rte_eth_dev_owner_new).
> > > 	b. The main thread should take the port
> > ownership(rte_eth_dev_owner_set).
> > > 	c. Synchronization between the two threads should be done for the
> > conflicted 		configurations by the application.
> > > 	d. when the application finishes the port usage it should release the
> > owner(rte_eth_dev_owner_remove).
> > >
> > > 2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses alarm for
> > hotplug detections which can configure the 2 ports(by the host thread).
> > > 	Level = fail-safe code.
> > > 	a. Application starts the eal and the fail-safe driver probing function is
> > called.
> > > 	b. Fail-safe probes the 2 sub-devices(2 ports are created) and takes
> > ownership for them.
> > > 	c. Failsafe creates itself port and leaves it ownerless.
> > > 	d. Failsafe starts the hotplug alarm mechanism.
> > > 	e. Application tries to take ownership for all ports and success only
> > for failsafe port.
> > > 	f. Application start to configure the failsafe port asynchronously to
> > failsafe hotplug alarm.
> > > 	g. Failsafe must use synchronization between failsafe alarm callback
> > code and failsafe configuration APIs called by the application because they
> > both try to configure the same sub-devices ports.
> > > 	h. When fail-safe finishes with the two sub devices it should release
> > the ports owner.
> > >
> > Ok, this I would describe as different use cases rather than parallel
> > ownership, in that in both cases there is still a single execution context which
> > is responsible for all aspects of a given port (which is fine with me, I'm just
> > trying to be clear in our description of the model).
> > 
> Agree.
> Can you find a realistic scenario that a non-single execution entity must to manage a port and have problems with the port races synchronization management? 
No, nor do I think one exists, just trying to make sure that you weren't trying
to allow for that, as it would be very difficutl to maintain I think.

>  
> > 
> > > > >
> > > > > >
> > > > > > > > 2) Is device configuration racy with device addition/removal?
> > > > > > > > That is to say, can one thread remove a device while another
> > > > configures it.
> > > > > > >
> > > > > > > I think it is the same as two threads configuring the same
> > > > > > > device (item 4/ above). It can be managed with port ownership.
> > > > > > >
> > > > > > Only if you assert that application is required to have the
> > > > > > owning port be responsible for the ports deletion, which we can
> > > > > > say, but that leads to the issue above again.
> > > > > >
> > > > > >
> > > > > As Thomas said in item 2 the port creation must be synchronized by
> > > > > ethdev
> > > > and we need to add it there.
> > > > > I think it is obvious that port removal must to be done only by
> > > > > the port
> > > > owner.
> > > > >
> > > > You say that, but its obvious to you as a developer who has looked
> > > > extensively at the code.  It may well be less so to a consumer who
> > > > is not an active member of the community, for instance one who
> > > > obtains the dpdk via pre-built package.
> > > >
> > >
> > > Yes I can understand, but new rules should be documented and be
> > adjusted easy easy by the customers, no?
> > Ostensibly, it should be easy, yes, but in practice its a bit murkier.  For
> > instance, What if an application wants to enable packet capture on an
> > interface via rte_pdump_enable?  Does preforming that action require that
> > the execution context which calls that function own the port before doing
> > so?  Digging through the code suggests to me that it (suprisingly) does not,
> > because all that function does is set a socket to record packets too, but I
> > would have intuitively thought that enabling packet capture would require
> > turning off the mac filter table in the hardware, and so would have required
> > ownership
> > 
> > Conversely, using the same example, calling rte_pdump_init, using the
> > model from your last patch, would require that the calling execution context
> > ensured that , at the time the cli application issued the monnitor request,
> > that the port be unowned, because the pdump main thread needs to set
> > rx_tx callbacks on the requested port, which I belive constitutes a
> > configuration change needing port ownership.
> > 
> > My point being, I think saying that ownership is easy and obvious isn't
> > accurate.
> 
> Agree, as a finger rule all the port relation APIs should require ownership taking, but it will take time to learn when we don't need to take ownership.
It likely will, I agree, this is the difficulty in maintaining mutual exclusion
outside of the library,  its (potentially) an every chaning model, for which
documentation needs to keep up, and I'm not sure if we will ever get there
(hence my constant bemoaning of the desire to codify multual exclusion within
the library.  I wonder if it would be worth while to explore a lock registration
mechanism in which sole access is implemented outside the library, but
communicated within the library to avoid this (i.e. a model in which the
application manipulates a mutual exclusion condition, but that can be checked by
the library to ensure proper usage.

> 
> >  If we are to leave proper synchrnization of access to devices up to
> > the application, we either need to:
> > 
> > 1) Assume downstream users are intimately familiar with the code or
> > 2) Exhaustively document the conditions under which ownership needs to be
> > held
> > 
> > (1) is a non starter, and 2 I think is a fairly large undertaking, but unless we
> > are willing to codify synchronization in the code explicitly (via locking), (2) is
> > what we have to do.
> > 
> Agree.
> Maybe it will be good to document each relevant API if it requires ownership taking or not in .h files, what do you think?  
> 
If you think thats a surmountable task, yes.  I think its necessecary if we are
to expect applications to do any sort of real locking (beyond just guaranteeing
one task is executing in the dpdk at any one time)

Best
Neil

> > Neil
> 
> 


More information about the dev mailing list