[dpdk-dev] Service lcores and Application lcores

Van Haaren, Harry harry.van.haaren at intel.com
Fri Jun 30 15:16:44 CEST 2017


> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Friday, June 30, 2017 2:04 PM
> To: Van Haaren, Harry <harry.van.haaren at intel.com>
> Cc: Thomas Monjalon <thomas at monjalon.net>; dev at dpdk.org; Wiles, Keith
> <keith.wiles at intel.com>; Richardson, Bruce <bruce.richardson at intel.com>
> Subject: Re: Service lcores and Application lcores
> 
> -----Original Message-----
> > Date: Fri, 30 Jun 2017 11:14:39 +0000
> > From: "Van Haaren, Harry" <harry.van.haaren at intel.com>
> > To: Thomas Monjalon <thomas at monjalon.net>
> > CC: "dev at dpdk.org" <dev at dpdk.org>, 'Jerin Jacob'
> >  <jerin.jacob at caviumnetworks.com>, "Wiles, Keith" <keith.wiles at intel.com>,
> >  "Richardson, Bruce" <bruce.richardson at intel.com>
> > Subject: RE: Service lcores and Application lcores
> >
> > > From: Thomas Monjalon [mailto:thomas at monjalon.net]
> > > Sent: Friday, June 30, 2017 11:39 AM
> > > To: Van Haaren, Harry <harry.van.haaren at intel.com>
> > > Cc: dev at dpdk.org; 'Jerin Jacob' <jerin.jacob at caviumnetworks.com>; Wiles, Keith
> > > <keith.wiles at intel.com>; Richardson, Bruce <bruce.richardson at intel.com>
> > > Subject: Re: Service lcores and Application lcores
> > >
> > > 30/06/2017 12:18, Van Haaren, Harry:
> > > > From: Thomas Monjalon [mailto:thomas at monjalon.net]
> > > > > 30/06/2017 10:52, Van Haaren, Harry:
> > > > > > From: Thomas Monjalon [mailto:thomas at monjalon.net]
> > > > > > > 29/06/2017 18:35, Van Haaren, Harry:
> > > > > > > > 3) The problem;
> > > > > > > >    If a service core runs the SW PMD schedule() function (option 2) *AND*
> > > > > > > >    the application lcore runs schedule() func (option 1), the result is that
> > > > > > > >    two threads are concurrently running a multi-thread unsafe function.
> > > > > > >
> > > > > > > Which function is multi-thread unsafe?
> > > > > >
> > > > > > With the current design, the service-callback does not have to be multi-thread safe.
> > > > > > For example, the eventdev SW PMD is not multi-thread safe.
> > > > > >
> > > > > > The service library handles serializing access to the service-callback if multiple cores
> > > > > > are mapped to that service. This keeps the atomic complexity in one place, and keeps
> > > > > > services as light-weight to implement as possible.
> > > > > >
> > > > > > (We could consider forcing all service-callbacks to be multi-thread safe by using atomics,
> > > > > > but we would not be able to optimize away the atomic cmpset if it is not required. This
> > > > > > feels heavy handed, and would cause useless atomic ops to execute.)
> > > > >
> > > > > OK thank you for the detailed explanation.
> > > > >
> > > > > > > Why would the same function be run by the service and by the scheduler?
> > > > > >
> > > > > > The same function can be run concurrently by the application and a service core.
> > > > > > The root cause is that an application can *think* it is the only one running
> > > > > > threads, but in reality one or more service-cores may be running in the background.
> > > > > >
> > > > > > Service lcores and application lcores each run without knowledge of the other's
> > > > > > behavior, and that is what allows the multi-thread unsafe service function to run concurrently.
> > > > >
> > > > > That's the part I still don't understand.
> > > > > Why would an application run a function on its own core if it is already
> > > > > run as a service? Can we just have a check that the service API exists
> > > > > and that the service is running?
> > > >
> > > > The point is that really it is an application / service core mis-match.
> > > > The application should never run a PMD that it knows also has a service core running it.
> > >
> > > Yes
> > >
> > > > However, porting applications to the service-core API has an over-lap time where an
> > > > application on 17.05 will be required to call e.g. rte_eventdev_schedule() itself, and
> > > > depending on startup EAL flags for service-cores, it may-or-may-not have to call schedule() manually.
> > >
> > > Yes, service cores may be unavailable, depending on user configuration.
> > > That's why it must be possible to request the service core API
> > > to know whether a service is run or not.
> >
> > Yep - an application can check if a service is running by calling rte_service_is_running(struct service_spec*);
> > It returns true if a service-core is running, mapped to the service, and the service is start()-ed.
> 
> If I understand correctly, the driver should check whether the _required_
> service is running, not the _application_. Right?

I think the PMD should check whether a service core is mapped, and print a warning if not.
In the case of eventdev, eventdev_start() is where service_is_running() is checked; if it returns false, we inform the user that no service-core is ready to run the service.
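
A minimal sketch of that warn-but-do-not-fail check, for illustration only. Everything here is a stand-in: the struct fields, service_is_running() and sw_evdev_start_sketch() are hypothetical names modelled on the description in this thread (a service-core is running, it is mapped to the service, and the service is start()-ed), not the service-cores patchset itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the service spec discussed in this thread. */
struct service_spec {
    const char *name;
    bool core_mapped;   /* a service core is mapped to this service */
    bool core_running;  /* that service core is running */
    bool started;       /* the service has been start()-ed */
};

/* Stand-in for the proposed check: true only when all three conditions hold. */
static bool service_is_running(const struct service_spec *s)
{
    assert(s != NULL);
    return s->core_mapped && s->core_running && s->started;
}

/* Sketch of a PMD start(): warn when no core is ready, but never fail. */
static int sw_evdev_start_sketch(const struct service_spec *s)
{
    assert(s != NULL);
    if (!service_is_running(s))
        fprintf(stderr,
                "warning: no service core ready to run service '%s'\n",
                s->name);
    return 0; /* start() succeeds either way */
}
```

Returning 0 in both cases leaves the application free to run the service some other way after start().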

From the application POV, it could use e.g. rte_service_iterate()* to run that service, so the PMD should not fail to start(); it should just warn that no core was available to it at start time. The application must still check whether it should call rte_eventdev_schedule() itself, based on rte_version.h as Thomas mentioned.
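
The application-side decision could look roughly like the sketch below. It is self-contained on purpose: SKETCH_VERSION() and app_must_call_schedule() are hypothetical helpers, and the 17.08 threshold is taken from this thread. A real build would presumably compare RTE_VERSION against RTE_VERSION_NUM() from rte_version.h and query the service API instead of passing a bool.

```c
#include <stdbool.h>

/* Hypothetical packing of a year.month release into one comparable number. */
#define SKETCH_VERSION(year, month) (((year) << 8) | (month))

/*
 * Decide whether the application lcore must call schedule() itself.
 *  - Before 17.08 there are no service cores, so the app always schedules.
 *  - From 17.08 on, the app schedules only if no service core runs the service.
 */
static bool app_must_call_schedule(unsigned int dpdk_version,
                                   bool service_running)
{
    if (dpdk_version < SKETCH_VERSION(17, 8))
        return true;
    return !service_running;
}
```

For example, app_must_call_schedule(SKETCH_VERSION(17, 8), true) is false, which avoids the two-threads-in-schedule() case described earlier in the thread.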


The ideal end goal, in my opinion, is something like this:
Service cores are used to run services by 95+% of apps, to abstract away SW/HW core-requirement differences.
Advanced applications can use rte_service_iterate() to run specific services on application lcores if they wish.
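
To illustrate why routing an application lcore through the service library is safer than calling the callback directly, here is a rough sketch of the serialization idea from earlier in the thread, written with C11 atomics. The struct, service_run_once() and count_calls() are hypothetical; this is an assumption about the general technique (a compare-and-set guard around a multi-thread unsafe callback), not the library's code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical service record: a guard plus a multi-thread unsafe callback. */
struct service_sketch {
    atomic_int busy;            /* 1 while some lcore is inside the callback */
    int (*callback)(void *arg);
    void *arg;
};

/* Example callback used for illustration: counts how often it was run. */
static int count_calls(void *arg)
{
    ++*(unsigned int *)arg;
    return 0;
}

/*
 * Run the callback once unless another lcore is already inside it.
 * Returns true if the callback ran, false if the call was skipped.
 */
static bool service_run_once(struct service_sketch *s)
{
    int expected = 0;

    /* Atomic compare-and-set: only one caller can flip busy from 0 to 1. */
    if (!atomic_compare_exchange_strong(&s->busy, &expected, 1))
        return false;

    s->callback(s->arg);
    atomic_store(&s->busy, 0); /* release the guard */
    return true;
}
```

A service core and an application lcore that both call service_run_once() on the same service would then never execute the callback at the same time; the loser of the compare-and-set simply skips that iteration.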


* See other "branch" of this thread about rte_service_iterate()
    http://dpdk.org/ml/archives/dev/2017-June/069540.html


> > > When porting an application to service core, you just have to run this
> > > check, which is known to be available for DPDK 17.08 (check rte_version.h).
> >
> > Ok, so as part of porting to service-cores, applications are expected to sanity check the services vs their own lcore config.
> > If there's no disagreement, I will add it to the release notes of the V+1 service-cores patchset.
> >
> > There is still a need for the rte_service_iterate() function as discussed in the other branch of this thread.
> > I'll wait for consensus on that and post the next revision then.
> >
> > Thanks for the questions / input!
> >
> >
> > > > This is pretty error prone, and mis-configuration would cause A) deadlock due to no CPU
> > > > cycles, B) segfault due to two cores.

