[dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership

Matan Azrad matan at mellanox.com
Thu Jan 18 16:00:13 CET 2018



From: Ananyev, Konstantin, Thursday, January 18, 2018 4:52 PM
> 
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan at mellanox.com]
> > Sent: Thursday, January 18, 2018 2:45 PM
> > To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Thomas
> > Monjalon <thomas at monjalon.net>; Gaetan Rivet
> <gaetan.rivet at 6wind.com>;
> > Wu, Jingjing <jingjing.wu at intel.com>
> > Cc: dev at dpdk.org; Neil Horman <nhorman at tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson at intel.com>
> > Subject: RE: [PATCH v2 2/6] ethdev: add port ownership
> >
> > HI
> >
> > From: Ananyev, Konstantin, Thursday, January 18, 2018 4:42 PM
> > > > Hi Konstantine
> > > >
> > > > > Hi Matan,
> > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Another thing - you'll
> > > > > > > > > > > > > > > > > > > > > > > > probably need to
> > > > > > > > > > grab/release
> > > > > > > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > > > > > > It is a public function
> > > > > > > > > > > > > > > > > > > > > > > > used by drivers, so need
> > > > > > > > > > > > > > > > > > > > > > > > to be protected
> > > > > > > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Yes, I thought about it, but
> > > > > > > > > > > > > > > > > > > > > > > decided not to use lock in
> > > > > > > > > > > > next:
> > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > As I can see in patch #3 you
> > > > > > > > > > > > > > > > > > > > > > protect by lock access to
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name (which
> > > > > > > > > > > > > > > > > > > > > > seems like a good
> > > > > > > > > > > > thing).
> > > > > > > > > > > > > > > > > > > > > > So I think any other public
> > > > > > > > > > > > > > > > > > > > > > function that access
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name should
> > > > > > > > > > > > > > > > > > > > > > be protected by the
> > > > > > > > > > > > same
> > > > > > > > > > > > > > lock.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I don't think so, I can
> > > > > > > > > > > > > > > > > > > > > understand to use the ownership
> > > > > > > > > > > > > > > > > > > > > lock here(as in port
> > > > > > > > > > > > > > > > > > > > creation) but I don't think it is necessary
> too.
> > > > > > > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > > > > > > Don't you think it is just
> > > > > > > > > > > > > > > > > > > > > timing?(ask in the next moment
> > > > > > > > > > > > > > > > > > > > > and you may get another
> > > > > > > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > > > > > > crash.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > > > > > > As I understand
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name unique
> > > > > > > > > > identifies
> > > > > > > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > > > > > > allocation/release/find
> > > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > > > > > > "1. The port allocation and port
> > > > > > > > > > > > > > > > > > > > release synchronization will be
> > > > > > > > > > > > > > > > > > > > managed by
> > > ethdev."
> > > > > > > > > > > > > > > > > > > > To me it means that ethdev layer
> > > > > > > > > > > > > > > > > > > > has to make sure that all accesses
> > > > > > > > > > > > > > > > > > > > to rte_eth_dev_data[].name are
> > > > > > > > atomic.
> > > > > > > > > > > > > > > > > > > > Otherwise what would prevent the
> > > > > > > > > > > > > > > > > > > > situation when one
> > > > > > > > > > > > process
> > > > > > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].
> > > > > > > > > > > > > > > > > > name,
> > > > > ...) ?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The second will get True or False and that is
> it.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Under race condition - in the worst
> > > > > > > > > > > > > > > > > > case it might crash, though for that
> > > > > > > > > > > > > > > > > > you'll have to be really
> > > > > unlucky.
> > > > > > > > > > > > > > > > > > Though in most cases as you said it
> > > > > > > > > > > > > > > > > > would just not operate
> > > > > > > > > > > > correctly.
> > > > > > > > > > > > > > > > > > I think if we start to protect
> > > > > > > > > > > > > > > > > > dev->name by lock we need to do it for
> > > > > > > > > > > > > > > > > > all instances (both read and
> > > > > write).
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Since under the ownership rules, the
> > > > > > > > > > > > > > > > > user must take ownership
> > > > > > > > > > of a
> > > > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > So? The user still should take ownership of
> > > > > > > > > > > > > > > a device before using it
> > > > > > > > > > (by
> > > > > > > > > > > > > > name or by port id).
> > > > > > > > > > > > > > > It can just read it without owning it, but no
> managing it.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Please, Can you describe specific crash
> > > > > > > > > > > > > > > > > scenario and explain how could the
> > > > > > > > > > > > > > > > locking fix it?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Let say thread 0 doing
> > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...),
> > > > > > > > > > > > > > > > >thread 1 doing
> > > > > > > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocat
> > > > > > > > > > > > > > > > ed()
> > > > > > > > > > > > > > > > -
> > > > > > > > >strcmp().
> > > > > > > > > > > > > > > > And because of race condition -
> > > > > > > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > > > > > > return
> > > > > > > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > > > > > > Which wrong device do you mean? I guess it
> > > > > > > > > > > > > > > is the device which
> > > > > > > > > > > > currently is
> > > > > > > > > > > > > > being created by thread 0.
> > > > > > > > > > > > > > > > Then rte_pmd_ring_remove() will call
> > > > > > > > > > > > > > > > rte_free() for related resources, while It
> > > > > > > > > > > > > > > > can still be in use by someone
> > > > > > > else.
> > > > > > > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK
> > > > > > > > > > > > > > > entity) must take
> > > > > > > > > > > > ownership
> > > > > > > > > > > > > > > (or validate that he is the owner) of a port
> > > > > > > > > > > > > > > before doing it(free,
> > > > > > > > > > > > release), so
> > > > > > > > > > > > > > no issue here.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Forget about ownership for a second.
> > > > > > > > > > > > > > Suppose we have a process it created ring port
> > > > > > > > > > > > > > for itself (without
> > > > > > > > > > setting
> > > > > > > > > > > > any
> > > > > > > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > > > > > > rte_pmd_ring_remove()
> > > > > > > > for it.
> > > > > > > > > > > > > > At the same time second process decides to
> > > > > > > > > > > > > > call
> > > > > > > > > > rte_eth_dev_allocate()
> > > > > > > > > > > > (let
> > > > > > > > > > > > > > say for anither ring port).
> > > > > > > > > > > > > > They could collide trying to read (process 0)
> > > > > > > > > > > > > > and modify (process 1)
> > > > > > > > > > same
> > > > > > > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > Do you mean that process 0 will compare
> > > > > > > > > > > > > successfully the process 1
> > > > > > > > > > new
> > > > > > > > > > > > port name?
> > > > > > > > > > > >
> > > > > > > > > > > > Yes.
> > > > > > > > > > > >
> > > > > > > > > > > > > The state are in local process memory - so
> > > > > > > > > > > > > process 0 will not compare
> > > > > > > > > > the
> > > > > > > > > > > > process 1 port, from its point of view this port
> > > > > > > > > > > > is in UNUSED
> > > > > > > > > > > > > state.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > > > > > > >
> > > > > > > > > > > Someone in process 0 should attach it using
> > > > > > > > > > > protected attach_secondary
> > > > > > > > > > somewhere in your scenario.
> > > > > > > > > >
> > > > > > > > > > Yes, process 0 can have this port attached too, why not?
> > > > > > > > > See the function with inline comments:
> > > > > > > > >
> > > > > > > > > struct rte_eth_dev *
> > > > > > > > > rte_eth_dev_allocated(const char *name) {
> > > > > > > > > 	unsigned i;
> > > > > > > > >
> > > > > > > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > > > > > >
> > > > > > > > > 	    	The below state are in local process memory,
> > > > > > > > > 		So, if here process 1 will allocate a new port (the
> > > > > > > > > current i),
> > > > > > > > update its local state to ATTACHED and write the name,
> > > > > > > > > 		the state is not visible by process 0 until someone in
> > > > > > > > > process
> > > > > > > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > > > > > > 		So, to use rte_eth_dev_attach_secondary process 0
> > > must
> > > > > > > > take the lock
> > > > > > > > > and it can't, because it is currently locked by process 1.
> > > > > > > >
> > > > > > > > Ok I see.
> > > > > > > > Thanks for your patience.
> > > > > > > > BTW, that means that if let say process 0 will call
> > > > > > > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > > > > > > rte_eth_dev_allocate("yyy") we can endup with same port_id
> > > > > > > > be used for different devices and 2 processes will
> > > > > > > > overwrite the same
> > > > > > > rte_eth_dev_data[port_id]?
> > > > > > >
> > > > > > > No, contrary to the state, the lock itself is in shared
> > > > > > > memory, so 2 processes cannot allocate port in the same
> > > > > > > time.(you can see it in the next patch of this series).
> > > > >
> > > > > I am not talking about racing here.
> > > > > Let say process 0 calls rte_pmd_ring_probe()->....-
> > > > > >rte_eth_dev_allocate("xxx")
> > > > > rte_eth_dev_allocate() finds that port N is 'free', i.e.
> > > > > local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED so it
> > > > > assigns new dev ("xxx") to port N.
> > > > > Then after some time process 1 calls rte_pmd_ring_probe()->....-
> > > > > >rte_eth_dev_allocate("yyy").
> > > > > From its perspective port N is still free:
> > > > > rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED, so it will
> > > > > assign new dev ("yyy") to the same
> > > port.
> > > > >
> > > >
> > > > Yes you right, this is a problem(not related actually to port
> > > > ownership)
> > >
> > > Yep that's true - it was there before your patches.
> > >
> > > > but look:
> > > > As I understand the secondary processes are not allowed to create
> > > > a ports and they must to use attach_secondary API, but there is
> > > > not
> > > hardcoded check which prevent them to do it.
> > >
> > > Secondary processes ae the ability to allocate their own vdevs and
> > > probably it should stay like that.
> > > I just thought it is a good opportunity to fix it while you are on
> > > these changes anyway, but ok we can leave it for now.
> > >
> > Looks like the fix should break ABI(moving the state to the shared
> > memory), let's try to fix it in the next version :)
> 
> Not necessarily - I think we can just  add a check inside
> te_eth_dev_find_free_port() that rte_eth_dev_data[port_id].name is an
> empty string.

Good idea, I will add it (actually the first patch in this series allows it).

Thanks.


> Konstantin
> 
> 
> >
> > > Konstantin


More information about the dev mailing list