[dpdk-dev] [PATCH] Scheduler: add driver for scheduler crypto pmd

Neil Horman nhorman at tuxdriver.com
Thu Dec 8 15:57:28 CET 2016


On Wed, Dec 07, 2016 at 04:04:17PM +0000, Declan Doherty wrote:
> On 07/12/16 14:46, Richardson, Bruce wrote:
> > 
> > 
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > Sent: Wednesday, December 7, 2016 2:17 PM
> > > To: Doherty, Declan <declan.doherty at intel.com>
> > > Cc: Richardson, Bruce <bruce.richardson at intel.com>; Thomas Monjalon
> > > <thomas.monjalon at 6wind.com>; Zhang, Roy Fan <roy.fan.zhang at intel.com>;
> > > dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] Scheduler: add driver for scheduler crypto
> > > pmd
> > > 
> > > On Wed, Dec 07, 2016 at 12:42:15PM +0000, Declan Doherty wrote:
> > > > On 05/12/16 15:12, Neil Horman wrote:
> > > > > On Fri, Dec 02, 2016 at 04:22:16PM +0000, Declan Doherty wrote:
> > > > > > On 02/12/16 14:57, Bruce Richardson wrote:
> > > > > > > On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:
> > > > > > > > 2016-12-02 14:15, Fan Zhang:
> > > > > > > > > This patch provides the initial implementation of the
> > > > > > > > > scheduler poll mode driver using DPDK cryptodev framework.
> > > > > > > > > 
> > > > > > > > > Scheduler PMD is used to schedule and enqueue the crypto ops
> > > > > > > > > to the hardware and/or software crypto devices attached to
> > > > > > > > > it (slaves). The dequeue operation from the slave(s), and
> > > > > > > > > the possible dequeued crypto op reordering, are then carried
> > > > > > > > > out by the scheduler.
> > > > > > > > > 
> > > > > > > > > The scheduler PMD can be used to fill the throughput gap
> > > > > > > > > between the physical core and the existing cryptodevs to
> > > > > > > > > increase the overall performance. For example, if a physical
> > > > > > > > > core has a higher crypto op processing rate than a cryptodev,
> > > > > > > > > the scheduler PMD can be introduced to attach more than one
> > > > > > > > > cryptodev.
> > > > > > > > > 
> > > > > > > > > This initial implementation is limited to supporting the
> > > > > > > > > following scheduling modes:
> > > > > > > > > 
> > > > > > > > > - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst attached
> > > > > > > > >     software slave cryptodevs; to set this mode, one or more
> > > > > > > > >     software cryptodevs should have been attached to the scheduler).
> > > > > > > > > 
> > > > > > > > > - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst attached
> > > > > > > > >     hardware slave cryptodevs (QAT); to set this mode, one or more
> > > > > > > > >     QATs should have been attached to the scheduler).
> > > > > > > > 
> > > > > > > > Could it be implemented on top of the eventdev API?
> > > > > > > > 
> > > > > > > Not really. The eventdev API is for different types of
> > > > > > > scheduling between multiple sources that are all polling for
> > > > > > > packets, compared to this, which is more analogous - as I
> > > > > > > understand it - to the bonding PMD for ethdev.
> > > > > > > 
> > > > > > > To make something like this work with an eventdev API you would
> > > > > > > need to use one of the following models:
> > > > > > > * have worker cores for offloading packets to the different crypto
> > > > > > >   blocks pulling from the eventdev APIs. This would make it
> > > > > > >   difficult to do any "smart" scheduling of crypto operations
> > > > > > >   between the blocks, e.g. that one crypto instance may be better
> > > > > > >   at certain types of operations than another.
> > > > > > > * move the logic in this driver into an existing eventdev instance,
> > > > > > >   which uses the eventdev api rather than the crypto APIs and so
> > > > > > >   has an extra level of "structure abstraction" that has to be
> > > > > > >   worked through.
> > > > > > >   It's just not really a good fit.
> > > > > > > 
> > > > > > > So for this workload, I believe the pseudo-cryptodev instance is
> > > > > > > the best way to go.
> > > > > > > 
> > > > > > > /Bruce
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > > > As Bruce says this is much more analogous to the ethdev bonding
> > > > > > driver, the main idea is to allow different crypto op scheduling
> > > > > > mechanisms to be defined transparently to an application. This
> > > > > > could be load-balancing across multiple hw crypto devices, or
> > > > > > having a software crypto device to act as a backup device for a hw
> > > > > > accelerator if it becomes oversubscribed. I think the main
> > > > > > advantage of a crypto-scheduler approach is that the data path
> > > > > > of the application doesn't need to have any knowledge that
> > > > > > scheduling is happening at all, it is just using a different crypto
> > > > > > device id, which then manages the distribution of crypto work.
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > This is a good deal like the bonding pmd, and so from a certain
> > > > > standpoint it makes sense to do this, but whereas the bonding pmd is
> > > > > meant to create a single path to a logical network over several
> > > > > physical networks, this pmd really only focuses on maximizing
> > > > > throughput, and for that we already have tools.  As Thomas mentions,
> > > > > there is the eventdev library, but from my view the distributor
> > > > > library already fits this bill.  It already is a basic framework to
> > > > > process mbufs in parallel according to whatever policy you want to
> > > > > implement, which sounds like exactly what the goal of this pmd is.
> > > > > 
> > > > > Neil
> > > > > 
> > > > > 
> > > > 
> > > > Hey Neil,
> > > > 
> > > > this is actually intended to act and look a good deal like the
> > > > ethernet bonding device, but to handle the crypto scheduling use cases.
> > > > 
> > > > For example, take the case where multiple hw accelerators may be
> > > > available.
> > > > We want to provide user applications with a mechanism to transparently
> > > > balance work across all devices without having to manage the load
> > > > balancing details or the guaranteeing of ordering of the processed ops
> > > > on the dequeue_burst side. In this case the application would just use
> > > > the crypto dev_id of the scheduler and it would look after balancing
> > > > the workload across the available hw accelerators.
> > > > 
> > > > 
> > > > +-------------------+
> > > > |  Crypto Sch PMD   |
> > > > |                   |
> > > > | ORDERING / RR SCH |
> > > > +-------------------+
> > > >         ^ ^ ^
> > > >         | | |
> > > >       +-+ | +-------------------------------+
> > > >       |   +---------------+                 |
> > > >       |                   |                 |
> > > >       V                   V                 V
> > > > +---------------+ +---------------+ +---------------+
> > > > | Crypto HW PMD | | Crypto HW PMD | | Crypto HW PMD |
> > > > +---------------+ +---------------+ +---------------+
> > > > 
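> > > > Just to make that transparency point concrete, the data path would look
> > > > something like the rough sketch below (illustrative only, not the patch
> > > > code; sched_dev_id, queue pair 0 and BURST_SIZE are placeholders):
> > > > 
> > > > #include <rte_crypto.h>
> > > > #include <rte_cryptodev.h>
> > > > 
> > > > #define BURST_SIZE 32
> > > > 
> > > > static void
> > > > crypto_data_path(uint8_t sched_dev_id, struct rte_crypto_op **ops,
> > > >                  uint16_t nb_ops)
> > > > {
> > > >         struct rte_crypto_op *deq_ops[BURST_SIZE];
> > > >         uint16_t nb_enq, nb_deq;
> > > > 
> > > >         /* Enqueue to the scheduler exactly as to any other cryptodev;
> > > >          * the scheduler decides which slave processes each op. */
> > > >         nb_enq = rte_cryptodev_enqueue_burst(sched_dev_id, 0, ops, nb_ops);
> > > > 
> > > >         /* Dequeue likewise; any reordering of ops processed by
> > > >          * different slaves is handled inside the scheduler PMD. */
> > > >         nb_deq = rte_cryptodev_dequeue_burst(sched_dev_id, 0, deq_ops,
> > > >                                              BURST_SIZE);
> > > > 
> > > >         /* nb_enq/nb_deq would feed normal retry/accounting logic. */
> > > >         (void)nb_enq;
> > > >         (void)nb_deq;
> > > > }
> > > > 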
> > > > Another use case we hope to support is migration of processing from
> > > > one device to another, where a hw and a sw crypto pmd can be bound to
> > > > the same crypto scheduler and the crypto processing could be
> > > > transparently migrated from the hw to the sw pmd. This would allow hw
> > > > accelerators to be hot-plug attached/detached in a Guest VM.
> > > > 
> > > > +----------------+
> > > > | Crypto Sch PMD |
> > > > |                |
> > > > | MIGRATION SCH  |
> > > > +----------------+
> > > >       | |
> > > >       | +-----------------+
> > > >       |                   |
> > > >       V                   V
> > > > +---------------+ +---------------+
> > > > | Crypto HW PMD | | Crypto SW PMD |
> > > > |   (Active)    | |   (Inactive)  |
> > > > +---------------+ +---------------+
> > > > 
> > > > The main point is that this isn't envisaged as just a mechanism for
> > > > scheduling crypto workloads across multiple cores, but a framework
> > > > for allowing different scheduling mechanisms to be introduced, to
> > > > handle different crypto scheduling problems, and to do so in a way
> > > > which is completely transparent to the data path of an application.
> > > > Like the eth bonding driver we want to support creating the crypto
> > > > scheduler from EAL options, which allow specification of the
> > > > scheduling mode and the crypto pmds which are to be bound to that
> > > > crypto scheduler.
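> > > > 
> > > > As a rough illustration of how small the per-mode logic can be (this is
> > > > a sketch with made-up names such as sched_ctx and rr_enqueue_burst, not
> > > > the patch code), a round-robin mode's enqueue is essentially just slave
> > > > selection in front of the normal cryptodev burst call:
> > > > 
> > > > #include <rte_cryptodev.h>
> > > > 
> > > > struct sched_ctx {
> > > >         uint8_t slave_ids[8];    /* attached slave cryptodev ids */
> > > >         unsigned int nb_slaves;
> > > >         unsigned int last_slave; /* round-robin cursor */
> > > > };
> > > > 
> > > > static uint16_t
> > > > rr_enqueue_burst(struct sched_ctx *ctx, uint16_t qp_id,
> > > >                  struct rte_crypto_op **ops, uint16_t nb_ops)
> > > > {
> > > >         /* Pick the next slave in strict rotation; another mode would
> > > >          * simply plug a different selection policy in here. */
> > > >         unsigned int idx = ctx->last_slave;
> > > > 
> > > >         ctx->last_slave = (idx + 1) % ctx->nb_slaves;
> > > > 
> > > >         return rte_cryptodev_enqueue_burst(ctx->slave_ids[idx], qp_id,
> > > >                                            ops, nb_ops);
> > > > }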
> > > > 
> > > > 
> > > I get what it's for, that much is pretty clear.  But whereas the bonding
> > > driver benefits from creating a single device interface for the purposes
> > > of properly routing traffic through the network stack without exposing
> > > that complexity to the using application, this pmd provides only
> > > aggregation according to various policies.  This is exactly what the
> > > distributor library was built for, and it seems like a re-invention of the
> > > wheel to ignore that.  At the very least, you should implement this pmd on
> > > top of the distributor library.  If that is impractical, then I somewhat
> > > question why we have the distributor library at all.
> > > 
> > > Neil
> > > 
> > 
> > Hi Neil,
> > 
> > The distributor library and the eventdev framework are not the solution here, as, firstly, the crypto devices are not cores, in the same way that ethdevs are not cores, and the distributor library is for evenly distributing work among cores. Sure, some crypto implementations may be software only, but many aren't, and those that are software still appear as a device that must be used as if it were a HW device. In the same way that using the distributor to load balance traffic between various TX ports is not a suitable solution - because you need to use cores to do the work "bridging" between the distributor/eventdev and the ethdev device - similarly here, if we distribute traffic using the distributor, you need cores to pull those packets from the distributor and offload them to the crypto devices. To use the distributor library in place of this vpmd, we'd need crypto devices which are aware of how to talk to the distributor, and use its protocols for pushing/pulling packets, or else we are pulling in extra core cycles to do bridging work.
> > 
> > Secondly, the distributor and eventdev libraries are designed for doing flow-based (generally atomic) packet distribution. Load balancing between crypto devices is not generally based on flows, but rather on other factors like packet size, offload cost per device, etc. To the distributor/eventdev, all workers are equal, but for working with devices, for crypto offload or nic transmission, that is plainly not the case. In short, the distribution problems that are being solved by the distributor and eventdev libraries are fundamentally different from those being solved by this vpmd. They would be the wrong tool for the job.
> > 
> > I would agree with the previous statements that this driver is far closer in functionality to the bonded ethdev driver than anything else. It makes multiple devices appear as a single one while hiding the complexity of the multiple devices from the using application. In the same way as the bonded ethdev driver has different modes for active-backup, and active-active for increased throughput, this vpmd for crypto can have the exact same modes - multiple active bonded devices for higher performance operation, or two devices in active-backup to enable migration when using SR-IOV as described by Declan above.
> > 
> > Regards,
> > /Bruce
> > 
> 
> I think that having "scheduler" in the pmd name here may be somewhat of a
> loaded term and is muddying the waters of the problem we are trying to
> address. If we were to rename this to crypto_bond_pmd, it may make our
> intent for what we want this pmd to achieve clearer.
> 
> Neil, in most of the scheduling use cases we want to address with this pmd
> initially, we are looking to schedule within the context of a single lcore
> across multiple hw accelerators, or a mix of hw accelerators and sw pmds,
> and therefore using the distributor or the eventdev wouldn't add a lot of
> value.
> 
> Declan

Ok, these are fair points, and I'll concede to them.  That said, it still seems
like a waste to me to ignore the 80% functionality overlap to be had here.  That
is to say, the distributor library does a lot of work that both this pmd and the
bonding pmd could benefit from.  Perhaps it's worth looking at how to enhance the
distributor library such that worker tasks can be affined to a single cpu, and
the worker assignment can be used as indexed device assignment (the idea being
that a single worker task might represent multiple worker ids in the distributor
library).  That way such a crypto aggregator pmd or the bonding pmd's
implementation is little more than setting tags in mbufs according to appropriate
policy.
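
Something like the following very rough sketch is what I have in mind (not
working code; policy_tag() and crypto_op_for() are made-up helpers, the
distributor's single-packet worker API is used as it stands today, and
completion/dequeue handling is omitted):

#include <rte_distributor.h>
#include <rte_cryptodev.h>
#include <rte_mbuf.h>

/* Hypothetical policy hook: choose a tag per mbuf so that the distributor's
 * worker assignment doubles as slave-device assignment. */
extern uint32_t policy_tag(const struct rte_mbuf *m);
/* Hypothetical lookup of the crypto op prepared for this mbuf. */
extern struct rte_crypto_op *crypto_op_for(struct rte_mbuf *m);

/* Distribution core: tag each mbuf, then hand the burst to the distributor
 * (the distributor reads the tag from mbuf->hash.usr). */
static void
distribute_burst(struct rte_distributor *d, struct rte_mbuf **bufs,
                 unsigned int nb)
{
        unsigned int i;

        for (i = 0; i < nb; i++)
                bufs[i]->hash.usr = policy_tag(bufs[i]);

        rte_distributor_process(d, bufs, nb);
}

/* Worker loop affined to one core; worker_id indexes the slave cryptodev. */
static void
crypto_worker(struct rte_distributor *d, unsigned int worker_id,
              uint8_t slave_dev_id)
{
        struct rte_mbuf *m = NULL;
        struct rte_crypto_op *op;

        for (;;) {
                m = rte_distributor_get_pkt(d, worker_id, m);
                op = crypto_op_for(m);
                rte_cryptodev_enqueue_burst(slave_dev_id, 0, &op, 1);
        }
}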

Neil


