[dpdk-dev] [PATCH v4 05/12] net/failsafe: add plug-in support

Stephen Hemminger stephen at networkplumber.org
Mon Jun 5 17:25:22 CEST 2017


On Mon, 5 Jun 2017 01:09:19 +0200
Gaëtan Rivet <gaetan.rivet at 6wind.com> wrote:

> On Thu, Jun 01, 2017 at 11:00:10AM -0700, Stephen Hemminger wrote:
> > On Thu, 1 Jun 2017 16:12:41 +0200
> > Gaëtan Rivet <gaetan.rivet at 6wind.com> wrote:
> >   
> > > On Wed, May 31, 2017 at 08:15:26AM -0700, Stephen Hemminger wrote:  
> > > > On Mon, 29 May 2017 15:42:17 +0200
> > > > Gaetan Rivet <gaetan.rivet at 6wind.com> wrote:
> > > >     
> > > > > Periodically check for the existence of a device.
> > > > > If a device has not been initialized and exists on the system, then it
> > > > > is probed and configured.
> > > > > 
> > > > > The configuration process strives to synchronize the states between the
> > > > > plugged-in sub-device and the fail-safe device.    
> > > > 
> > > > There are existing event models (udev and netlink) that could be used to
> > > > do plug-in support without polling. Polling relies on the application
> > > > using rte_alarm, and many do not.
> > > 
> > > Indeed. This possibility arose during development.
> > > 
> > > The main issue with it however is that it introduces an asynchronous
> > > design, which the DPDK and PMDs underneath are not well-suited to
> > > interact with. It goes against the grain in a way.
> > > 
> > > The polling is simple. It can work with all models of device and is
> > > independent of event models specific to any architecture.
> > > 
> > > It also simplifies the contexts in which probing and removal are
> > > done. Currently there is only one, the interrupt thread.
> > > This solves a few possible race conditions without having to resort to
> > > critical sections.
> > > 
> > > The only dependency is on another DPDK subsystem, rte_alarm.
> > > I used alarms here because rte_timers need regular rte_timer_manage()
> > > calls and there is little way to guarantee the frequency of the calls.
> > > 
> > > rte_alarms do not force any externalities on applications, thus allowing a
> > > seamless use of the fail-safe.
> > >   
> > 
> > 
> > The issue with rte_alarm and also with LSC interrupt callbacks is that
> > they don't run on a normal DPDK EAL application thread. These callbacks
> > run on a DPDK internal pthread. I remember having to do some application
> > hacks like having the callback generate an internal event on a pipe.
> >   
> 
> On the other hand, not all applications would make use of those hacks,
> and adding those would impose architecture elements on users. While
> convenient, this goes somewhat against the tool-box ethos of DPDK.
> 
> In the end, I had to leverage the existing tools. Interrupts in DPDK are
> a known weak point, but they are at least working and not too heavy
> conceptually on applications (clean threading model, no need for signal
> masks, etc). A better implementation might crop up at some point, if those
> hurdles prove too much and are shared by many.
> 

The alarm solution is a good intermediate step. But eventually, in the spirit
of the DPDK, there should be an option to have an event-driven model. Maybe the
event library will help.
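
For reference, the pipe hand-off mentioned earlier in the thread can be
sketched roughly as below. This is only an illustration in plain POSIX C, not
code from the patch: the names event_pipe_init(), event_pipe_notify(),
event_pipe_drain() and the dev_event codes are hypothetical and not part of
any DPDK API. The assumption is that event_pipe_notify() would be called from
the alarm or LSC callback (which runs on the DPDK interrupt thread), and
event_pipe_drain() from an application thread in its own main loop.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical event codes for illustration; not DPDK definitions. */
enum dev_event { DEV_EVENT_PLUGGED = 1, DEV_EVENT_REMOVED = 2 };

struct event_pipe {
    int rd; /* read end, used by the application thread */
    int wr; /* write end, used from the callback context */
};

struct event_counts {
    int plugged;
    int removed;
};

/* Create the pipe and make the read end non-blocking. */
static int event_pipe_init(struct event_pipe *ep)
{
    int fds[2];
    int flags;

    if (pipe(fds) != 0)
        return -1;
    flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ep->rd = fds[0];
    ep->wr = fds[1];
    return 0;
}

/*
 * Meant to be called from the callback context. It only posts a one-byte
 * event code, so no device state is touched outside the application thread.
 */
static int event_pipe_notify(const struct event_pipe *ep, enum dev_event ev)
{
    uint8_t code = (uint8_t)ev;

    return write(ep->wr, &code, 1) == 1 ? 0 : -1;
}

/*
 * Meant to be called from the application thread. Checks the pipe without
 * blocking and consumes every pending event. Returns the number handled.
 */
static int event_pipe_drain(const struct event_pipe *ep,
                            struct event_counts *out)
{
    struct pollfd pfd = { .fd = ep->rd, .events = POLLIN };
    uint8_t code;
    int handled = 0;

    if (poll(&pfd, 1, 0) <= 0)
        return 0; /* nothing pending (or poll failed) */
    while (read(ep->rd, &code, 1) == 1) {
        if (code == DEV_EVENT_PLUGGED)
            out->plugged++;
        else if (code == DEV_EVENT_REMOVED)
            out->removed++;
        handled++;
    }
    return handled;
}
```

An application that already has an fd-based event loop could add the read end
to its poll set instead of calling event_pipe_drain() periodically. The cost
is exactly the kind of architecture element discussed above: every
application has to carry this plumbing itself.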

For me, the litmus test is whether the well-known heavyweight open source DPDK
applications, like VPP, can work with it.

