[dpdk-dev] Scheduler: add driver for scheduler crypto pmd

Message ID 1480688123-39494-1-git-send-email-roy.fan.zhang@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Pablo de Lara Guarch

Checks

Context Check Description
checkpatch/checkpatch success coding style OK

Commit Message

Fan Zhang Dec. 2, 2016, 2:15 p.m. UTC
This patch provides the initial implementation of the scheduler poll mode
driver using the DPDK cryptodev framework.

The scheduler PMD is used to schedule and enqueue crypto ops to the
hardware and/or software crypto devices attached to it (its slaves). The
dequeue operation from the slave(s), and the optional reordering of the
dequeued crypto ops, are then carried out by the scheduler.

The scheduler PMD can be used to fill the throughput gap between a
physical core and the existing cryptodevs to increase the overall
performance. For example, if a physical core has a higher crypto op
processing rate than a single cryptodev, the scheduler PMD can be
introduced with more than one cryptodev attached to it.

This initial implementation is limited to supporting the following
scheduling modes:

- CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst the attached software
    slave cryptodevs). To set this mode, one or more software cryptodevs must
    already have been attached to the scheduler.

- CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst the attached hardware
    slave cryptodevs (QAT)). To set this mode, one or more QAT devices must
    already have been attached to the scheduler.

Build instructions:
To build DPDK with the CRYPTO_SCHEDULER_PMD, the user is required to set
CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER=y in config/common_base

Notice:
The scheduler PMD shares the same EAL command-line options as other
cryptodevs. In addition, one extra option, "enable_reorder", exists. When
it is set to "yes", the dequeued crypto ops are reordered back into their
original enqueue order. The feature can be disabled by setting
"enable_reorder" to "no". For example, the following EAL command-line
fragment creates a scheduler PMD with crypto op reordering enabled:

... --vdev "crypto_scheduler_pmd,enable_reorder=yes" ...
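
For reference, a complete invocation might look like the sketch below; the
application name and the software slave vdev name (crypto_aesni_mb_pmd) are
only illustrative placeholders:

./dpdk_app -c 0x3 -n 4 \
    --vdev "crypto_aesni_mb_pmd" \
    --vdev "crypto_scheduler_pmd,enable_reorder=yes" \
    -- [application options]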

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 config/common_base                                 |   6 +
 drivers/crypto/Makefile                            |   1 +
 drivers/crypto/scheduler/Makefile                  |  64 +++
 drivers/crypto/scheduler/rte_cryptodev_scheduler.c | 387 +++++++++++++++++
 drivers/crypto/scheduler/rte_cryptodev_scheduler.h |  90 ++++
 .../scheduler/rte_pmd_crypto_scheduler_version.map |   8 +
 drivers/crypto/scheduler/scheduler_pmd.c           | 475 +++++++++++++++++++++
 drivers/crypto/scheduler/scheduler_pmd_ops.c       | 335 +++++++++++++++
 drivers/crypto/scheduler/scheduler_pmd_private.h   | 137 ++++++
 lib/librte_cryptodev/rte_cryptodev.h               |   2 +
 mk/rte.app.mk                                      |   3 +-
 11 files changed, 1507 insertions(+), 1 deletion(-)
 create mode 100644 drivers/crypto/scheduler/Makefile
 create mode 100644 drivers/crypto/scheduler/rte_cryptodev_scheduler.c
 create mode 100644 drivers/crypto/scheduler/rte_cryptodev_scheduler.h
 create mode 100644 drivers/crypto/scheduler/rte_pmd_crypto_scheduler_version.map
 create mode 100644 drivers/crypto/scheduler/scheduler_pmd.c
 create mode 100644 drivers/crypto/scheduler/scheduler_pmd_ops.c
 create mode 100644 drivers/crypto/scheduler/scheduler_pmd_private.h
  

Comments

Thomas Monjalon Dec. 2, 2016, 2:31 p.m. UTC | #1
2016-12-02 14:15, Fan Zhang:
> This patch provides the initial implementation of the scheduler poll mode
> driver using DPDK cryptodev framework.
> 
> Scheduler PMD is used to schedule and enqueue the crypto ops to the
> hardware and/or software crypto devices attached to it (slaves). The
> dequeue operation from the slave(s), and the possible dequeued crypto op
> reordering, are then carried out by the scheduler.
> 
> The scheduler PMD can be used to fill the throughput gap between the
> physical core and the existing cryptodevs to increase the overall
> performance. For example, if a physical core has higher crypto op
> processing rate than a cryptodev, the scheduler PMD can be introduced to
> attach more than one cryptodevs.
> 
> This initial implementation is limited to supporting the following
> scheduling modes:
> 
> - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst attached software
>     slave cryptodevs, to set this mode, the scheduler should have been
>     attached 1 or more software cryptodevs.
> 
> - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst attached hardware
>     slave cryptodevs (QAT), to set this mode, the scheduler should have
>     been attached 1 or more QATs.

Could it be implemented on top of the eventdev API?
  
Bruce Richardson Dec. 2, 2016, 2:57 p.m. UTC | #2
On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:
> 2016-12-02 14:15, Fan Zhang:
> > This patch provides the initial implementation of the scheduler poll mode
> > driver using DPDK cryptodev framework.
> > 
> > Scheduler PMD is used to schedule and enqueue the crypto ops to the
> > hardware and/or software crypto devices attached to it (slaves). The
> > dequeue operation from the slave(s), and the possible dequeued crypto op
> > reordering, are then carried out by the scheduler.
> > 
> > The scheduler PMD can be used to fill the throughput gap between the
> > physical core and the existing cryptodevs to increase the overall
> > performance. For example, if a physical core has higher crypto op
> > processing rate than a cryptodev, the scheduler PMD can be introduced to
> > attach more than one cryptodevs.
> > 
> > This initial implementation is limited to supporting the following
> > scheduling modes:
> > 
> > - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst attached software
> >     slave cryptodevs, to set this mode, the scheduler should have been
> >     attached 1 or more software cryptodevs.
> > 
> > - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst attached hardware
> >     slave cryptodevs (QAT), to set this mode, the scheduler should have
> >     been attached 1 or more QATs.
> 
> Could it be implemented on top of the eventdev API?
> 
Not really. The eventdev API is for different types of scheduling
between multiple sources that are all polling for packets, compared to
this, which is more analogous - as I understand it - to the bonding PMD
for ethdev.

To make something like this work with an eventdev API you would need to
use one of the following models:
* have worker cores for offloading packets to the different crypto
  blocks pulling from the eventdev APIs. This would make it difficult to
  do any "smart" scheduling of crypto operations between the blocks,
  e.g. that one crypto instance may be better at certain types of
  operations than another.
* move the logic in this driver into an existing eventdev instance,
  which uses the eventdev API rather than the crypto APIs and so has an
  extra level of "structure abstraction" that has to be worked through.
  It's just not really a good fit.

So for this workload, I believe the pseudo-cryptodev instance is the
best way to go.

/Bruce
  
Doherty, Declan Dec. 2, 2016, 4:22 p.m. UTC | #3
On 02/12/16 14:57, Bruce Richardson wrote:
> On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:
>> 2016-12-02 14:15, Fan Zhang:
>>> This patch provides the initial implementation of the scheduler poll mode
>>> driver using DPDK cryptodev framework.
>>>
>>> Scheduler PMD is used to schedule and enqueue the crypto ops to the
>>> hardware and/or software crypto devices attached to it (slaves). The
>>> dequeue operation from the slave(s), and the possible dequeued crypto op
>>> reordering, are then carried out by the scheduler.
>>>
>>> The scheduler PMD can be used to fill the throughput gap between the
>>> physical core and the existing cryptodevs to increase the overall
>>> performance. For example, if a physical core has higher crypto op
>>> processing rate than a cryptodev, the scheduler PMD can be introduced to
>>> attach more than one cryptodevs.
>>>
>>> This initial implementation is limited to supporting the following
>>> scheduling modes:
>>>
>>> - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst attached software
>>>     slave cryptodevs, to set this mode, the scheduler should have been
>>>     attached 1 or more software cryptodevs.
>>>
>>> - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst attached hardware
>>>     slave cryptodevs (QAT), to set this mode, the scheduler should have
>>>     been attached 1 or more QATs.
>>
>> Could it be implemented on top of the eventdev API?
>>
> Not really. The eventdev API is for different types of scheduling
> between multiple sources that are all polling for packets, compared to
> this, which is more analgous - as I understand it - to the bonding PMD
> for ethdev.
>
> To make something like this work with an eventdev API you would need to
> use one of the following models:
> * have worker cores for offloading packets to the different crypto
>   blocks pulling from the eventdev APIs. This would make it difficult to
>   do any "smart" scheduling of crypto operations between the blocks,
>   e.g. that one crypto instance may be better at certain types of
>   operations than another.
> * move the logic in this driver into an existing eventdev instance,
>   which uses the eventdev api rather than the crypto APIs and so has an
>   extra level of "structure abstraction" that has to be worked though.
>   It's just not really a good fit.
>
> So for this workload, I believe the pseudo-cryptodev instance is the
> best way to go.
>
> /Bruce
>


As Bruce says this is much more analogous to the ethdev bonding driver;
the main idea is to allow different crypto op scheduling mechanisms to
be defined transparently to an application. This could be load-balancing
across multiple hw crypto devices, or having a software crypto device
act as a backup device for a hw accelerator if it becomes
oversubscribed. I think the main advantage of a crypto-scheduler
approach is that the data path of the application doesn't need to
have any knowledge that scheduling is happening at all; it just uses
a different crypto device id, which then manages the distribution of
crypto work.
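
To make that concrete, a minimal data-path sketch (assuming the scheduler
has already been created, configured and started as cryptodev id sched_id,
with queue pair 0 set up) looks exactly like driving any other cryptodev;
nothing scheduler-specific appears on the hot path:

#include <rte_crypto.h>
#include <rte_cryptodev.h>

/* sched_id is simply the cryptodev id of the scheduler PMD; ops[] is a
 * burst of prepared rte_crypto_op structures. */
static void
process_burst(uint8_t sched_id, struct rte_crypto_op **ops, uint16_t nb_ops)
{
        struct rte_crypto_op *deq_ops[32];
        uint16_t nb_enq, nb_deq;

        /* The scheduler fans these out to its slave devices internally. */
        nb_enq = rte_cryptodev_enqueue_burst(sched_id, 0, ops, nb_ops);

        /* Completed ops come back from the same device id, optionally
         * restored to their original order. */
        nb_deq = rte_cryptodev_dequeue_burst(sched_id, 0, deq_ops, 32);

        /* ... handle nb_enq < nb_ops and consume deq_ops[0..nb_deq) ... */
        (void)nb_enq;
        (void)nb_deq;
}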
  
Neil Horman Dec. 5, 2016, 3:12 p.m. UTC | #4
On Fri, Dec 02, 2016 at 04:22:16PM +0000, Declan Doherty wrote:
> On 02/12/16 14:57, Bruce Richardson wrote:
> > On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:
> > > 2016-12-02 14:15, Fan Zhang:
> > > > This patch provides the initial implementation of the scheduler poll mode
> > > > driver using DPDK cryptodev framework.
> > > > 
> > > > Scheduler PMD is used to schedule and enqueue the crypto ops to the
> > > > hardware and/or software crypto devices attached to it (slaves). The
> > > > dequeue operation from the slave(s), and the possible dequeued crypto op
> > > > reordering, are then carried out by the scheduler.
> > > > 
> > > > The scheduler PMD can be used to fill the throughput gap between the
> > > > physical core and the existing cryptodevs to increase the overall
> > > > performance. For example, if a physical core has higher crypto op
> > > > processing rate than a cryptodev, the scheduler PMD can be introduced to
> > > > attach more than one cryptodevs.
> > > > 
> > > > This initial implementation is limited to supporting the following
> > > > scheduling modes:
> > > > 
> > > > - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst attached software
> > > >     slave cryptodevs, to set this mode, the scheduler should have been
> > > >     attached 1 or more software cryptodevs.
> > > > 
> > > > - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst attached hardware
> > > >     slave cryptodevs (QAT), to set this mode, the scheduler should have
> > > >     been attached 1 or more QATs.
> > > 
> > > Could it be implemented on top of the eventdev API?
> > > 
> > Not really. The eventdev API is for different types of scheduling
> > between multiple sources that are all polling for packets, compared to
> > this, which is more analgous - as I understand it - to the bonding PMD
> > for ethdev.
> > 
> > To make something like this work with an eventdev API you would need to
> > use one of the following models:
> > * have worker cores for offloading packets to the different crypto
> >   blocks pulling from the eventdev APIs. This would make it difficult to
> >   do any "smart" scheduling of crypto operations between the blocks,
> >   e.g. that one crypto instance may be better at certain types of
> >   operations than another.
> > * move the logic in this driver into an existing eventdev instance,
> >   which uses the eventdev api rather than the crypto APIs and so has an
> >   extra level of "structure abstraction" that has to be worked though.
> >   It's just not really a good fit.
> > 
> > So for this workload, I believe the pseudo-cryptodev instance is the
> > best way to go.
> > 
> > /Bruce
> > 
> 
> 
> As Bruce says this is much more analogous to the ethdev bonding driver, the
> main idea is to allow different crypto op scheduling mechanisms to be
> defined transparently to an application. This could be load-balancing across
> multiple hw crypto devices, or having a software crypto device to act as a
> backup device for a hw accelerator if it becomes oversubscribed. I think the
> main advantage of a crypto-scheduler approach means that the data path of
> the application doesn't need to have any knowledge that scheduling is
> happening at all, it is just using a different crypto device id, which is
> then manages the distribution of crypto work.
> 
> 
> 
This is a good deal like the bonding pmd, and so from a certain standpoint it
makes sense to do this, but whereas the bonding pmd is meant to create a single
path to a logical network over several physical networks, this pmd really only
focuses on maximizing throughput, and for that we already have tools.  As Thomas
mentions, there is the eventdev library, but from my view the distributor
library already fits this bill.  It already is a basic framework to process
mbufs in parallel according to whatever policy you want to implement, which
sounds like exactly what the goal of this pmd is.  

Neil
  
Doherty, Declan Dec. 7, 2016, 12:42 p.m. UTC | #5
On 05/12/16 15:12, Neil Horman wrote:
> On Fri, Dec 02, 2016 at 04:22:16PM +0000, Declan Doherty wrote:
>> On 02/12/16 14:57, Bruce Richardson wrote:
>>> On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:
>>>> 2016-12-02 14:15, Fan Zhang:
>>>>> This patch provides the initial implementation of the scheduler poll mode
>>>>> driver using DPDK cryptodev framework.
>>>>>
>>>>> Scheduler PMD is used to schedule and enqueue the crypto ops to the
>>>>> hardware and/or software crypto devices attached to it (slaves). The
>>>>> dequeue operation from the slave(s), and the possible dequeued crypto op
>>>>> reordering, are then carried out by the scheduler.
>>>>>
>>>>> The scheduler PMD can be used to fill the throughput gap between the
>>>>> physical core and the existing cryptodevs to increase the overall
>>>>> performance. For example, if a physical core has higher crypto op
>>>>> processing rate than a cryptodev, the scheduler PMD can be introduced to
>>>>> attach more than one cryptodevs.
>>>>>
>>>>> This initial implementation is limited to supporting the following
>>>>> scheduling modes:
>>>>>
>>>>> - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst attached software
>>>>>     slave cryptodevs, to set this mode, the scheduler should have been
>>>>>     attached 1 or more software cryptodevs.
>>>>>
>>>>> - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst attached hardware
>>>>>     slave cryptodevs (QAT), to set this mode, the scheduler should have
>>>>>     been attached 1 or more QATs.
>>>>
>>>> Could it be implemented on top of the eventdev API?
>>>>
>>> Not really. The eventdev API is for different types of scheduling
>>> between multiple sources that are all polling for packets, compared to
>>> this, which is more analgous - as I understand it - to the bonding PMD
>>> for ethdev.
>>>
>>> To make something like this work with an eventdev API you would need to
>>> use one of the following models:
>>> * have worker cores for offloading packets to the different crypto
>>>   blocks pulling from the eventdev APIs. This would make it difficult to
>>>   do any "smart" scheduling of crypto operations between the blocks,
>>>   e.g. that one crypto instance may be better at certain types of
>>>   operations than another.
>>> * move the logic in this driver into an existing eventdev instance,
>>>   which uses the eventdev api rather than the crypto APIs and so has an
>>>   extra level of "structure abstraction" that has to be worked though.
>>>   It's just not really a good fit.
>>>
>>> So for this workload, I believe the pseudo-cryptodev instance is the
>>> best way to go.
>>>
>>> /Bruce
>>>
>>
>>
>> As Bruce says this is much more analogous to the ethdev bonding driver, the
>> main idea is to allow different crypto op scheduling mechanisms to be
>> defined transparently to an application. This could be load-balancing across
>> multiple hw crypto devices, or having a software crypto device to act as a
>> backup device for a hw accelerator if it becomes oversubscribed. I think the
>> main advantage of a crypto-scheduler approach means that the data path of
>> the application doesn't need to have any knowledge that scheduling is
>> happening at all, it is just using a different crypto device id, which is
>> then manages the distribution of crypto work.
>>
>>
>>
> This is a good deal like the bonding pmd, and so from a certain standpoint it
> makes sense to do this, but whereas the bonding pmd is meant to create a single
> path to a logical network over several physical networks, this pmd really only
> focuses on maximizing througput, and for that we already have tools.  As Thomas
> mentions, there is the eventdev library, but from my view the distributor
> library already fits this bill.  It already is a basic framework to process
> mbufs in parallel according to whatever policy you want to implement, which
> sounds like exactly what the goal of this pmd is.
>
> Neil
>
>

Hey Neil,

this is actually intended to act and look a good deal like the ethernet
bonding device, but to handle the crypto scheduling use cases.

For example, take the case where multiple hw accelerators may be 
available. We want to provide user applications with a mechanism to 
transparently balance work across all devices without having to manage
the load balancing details or guarantee the ordering of the processed
ops on the dequeue_burst side. In this case the application
would just use the crypto dev_id of the scheduler and it would look 
after balancing the workload across the available hw accelerators.


+-------------------+
|  Crypto Sch PMD   |
|                   |
| ORDERING / RR SCH |
+-------------------+
         ^ ^ ^
         | | |
       +-+ | +-------------------------------+
       |   +---------------+                 |
       |                   |                 |
       V                   V                 V
+---------------+ +---------------+ +---------------+
| Crypto HW PMD | | Crypto HW PMD | | Crypto HW PMD |
+---------------+ +---------------+ +---------------+

Another use case we hope to support is migration of processing from one
device to another, where a hw and a sw crypto pmd can be bound to the same
crypto scheduler and the crypto processing can be transparently migrated
from the hw to the sw pmd. This would allow for hw accelerators to be
hot-plug attached/detached in a guest VM.

+----------------+
| Crypto Sch PMD |
|                |
| MIGRATION SCH  |
+----------------+
       | |
       | +-----------------+
       |                   |
       V                   V
+---------------+ +---------------+
| Crypto HW PMD | | Crypto SW PMD |
|   (Active)    | |   (Inactive)  |
+---------------+ +---------------+

The main point is that this isn't envisaged as just a mechanism for
scheduling crypto workloads across multiple cores, but as a framework for
allowing different scheduling mechanisms to be introduced to handle
different crypto scheduling problems, and to do so in a way which is
completely transparent to the data path of an application. Like the eth
bonding driver, we want to support creating the crypto scheduler from EAL
options, which allow specification of the scheduling mode and the crypto
pmds which are to be bound to that crypto scheduler.
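
As a rough sketch of that control path, assuming the scheduler has been
created as cryptodev sched_id and the two slave devices already exist,
something like the following could be done. The slave-attach and mode-set
helper names are illustrative placeholders for the API declared in
rte_cryptodev_scheduler.h, not confirmed signatures; only the mode enum
value is taken from this patch:

#include <rte_cryptodev.h>
#include "rte_cryptodev_scheduler.h"

/* Control-path sketch: the helper names below are placeholders only. */
static int
setup_sched(uint8_t sched_id, uint8_t hw_id, uint8_t sw_id)
{
        /* Bind a hw slave and a sw slave to the scheduler
         * (hypothetical helper name). */
        if (rte_cryptodev_scheduler_slave_attach(sched_id, hw_id) < 0 ||
            rte_cryptodev_scheduler_slave_attach(sched_id, sw_id) < 0)
                return -1;

        /* Select the scheduling mode (enum value from this patch;
         * the setter name is again a placeholder). */
        return rte_cryptodev_scheduler_mode_set(sched_id,
                        CRYPTO_SCHED_HW_ROUND_ROBIN_MODE);
}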
  
Neil Horman Dec. 7, 2016, 2:16 p.m. UTC | #6
On Wed, Dec 07, 2016 at 12:42:15PM +0000, Declan Doherty wrote:
> On 05/12/16 15:12, Neil Horman wrote:
> > On Fri, Dec 02, 2016 at 04:22:16PM +0000, Declan Doherty wrote:
> > > On 02/12/16 14:57, Bruce Richardson wrote:
> > > > On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:
> > > > > 2016-12-02 14:15, Fan Zhang:
> > > > > > This patch provides the initial implementation of the scheduler poll mode
> > > > > > driver using DPDK cryptodev framework.
> > > > > > 
> > > > > > Scheduler PMD is used to schedule and enqueue the crypto ops to the
> > > > > > hardware and/or software crypto devices attached to it (slaves). The
> > > > > > dequeue operation from the slave(s), and the possible dequeued crypto op
> > > > > > reordering, are then carried out by the scheduler.
> > > > > > 
> > > > > > The scheduler PMD can be used to fill the throughput gap between the
> > > > > > physical core and the existing cryptodevs to increase the overall
> > > > > > performance. For example, if a physical core has higher crypto op
> > > > > > processing rate than a cryptodev, the scheduler PMD can be introduced to
> > > > > > attach more than one cryptodevs.
> > > > > > 
> > > > > > This initial implementation is limited to supporting the following
> > > > > > scheduling modes:
> > > > > > 
> > > > > > - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst attached software
> > > > > >     slave cryptodevs, to set this mode, the scheduler should have been
> > > > > >     attached 1 or more software cryptodevs.
> > > > > > 
> > > > > > - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst attached hardware
> > > > > >     slave cryptodevs (QAT), to set this mode, the scheduler should have
> > > > > >     been attached 1 or more QATs.
> > > > > 
> > > > > Could it be implemented on top of the eventdev API?
> > > > > 
> > > > Not really. The eventdev API is for different types of scheduling
> > > > between multiple sources that are all polling for packets, compared to
> > > > this, which is more analgous - as I understand it - to the bonding PMD
> > > > for ethdev.
> > > > 
> > > > To make something like this work with an eventdev API you would need to
> > > > use one of the following models:
> > > > * have worker cores for offloading packets to the different crypto
> > > >   blocks pulling from the eventdev APIs. This would make it difficult to
> > > >   do any "smart" scheduling of crypto operations between the blocks,
> > > >   e.g. that one crypto instance may be better at certain types of
> > > >   operations than another.
> > > > * move the logic in this driver into an existing eventdev instance,
> > > >   which uses the eventdev api rather than the crypto APIs and so has an
> > > >   extra level of "structure abstraction" that has to be worked though.
> > > >   It's just not really a good fit.
> > > > 
> > > > So for this workload, I believe the pseudo-cryptodev instance is the
> > > > best way to go.
> > > > 
> > > > /Bruce
> > > > 
> > > 
> > > 
> > > As Bruce says this is much more analogous to the ethdev bonding driver, the
> > > main idea is to allow different crypto op scheduling mechanisms to be
> > > defined transparently to an application. This could be load-balancing across
> > > multiple hw crypto devices, or having a software crypto device to act as a
> > > backup device for a hw accelerator if it becomes oversubscribed. I think the
> > > main advantage of a crypto-scheduler approach means that the data path of
> > > the application doesn't need to have any knowledge that scheduling is
> > > happening at all, it is just using a different crypto device id, which is
> > > then manages the distribution of crypto work.
> > > 
> > > 
> > > 
> > This is a good deal like the bonding pmd, and so from a certain standpoint it
> > makes sense to do this, but whereas the bonding pmd is meant to create a single
> > path to a logical network over several physical networks, this pmd really only
> > focuses on maximizing througput, and for that we already have tools.  As Thomas
> > mentions, there is the eventdev library, but from my view the distributor
> > library already fits this bill.  It already is a basic framework to process
> > mbufs in parallel according to whatever policy you want to implement, which
> > sounds like exactly what the goal of this pmd is.
> > 
> > Neil
> > 
> > 
> 
> Hey Neil,
> 
> this is actually intended to act and look a good deal like the ethernet
> bonding device but to handling the crypto scheduling use cases.
> 
> For example, take the case where multiple hw accelerators may be available.
> We want to provide user applications with a mechanism to transparently
> balance work across all devices without having to manage the load balancing
> details or the guaranteeing of ordering of the processed ops on the
> dequeue_burst side. In this case the application would just use the crypto
> dev_id of the scheduler and it would look after balancing the workload
> across the available hw accelerators.
> 
> 
> +-------------------+
> |  Crypto Sch PMD   |
> |                   |
> | ORDERING / RR SCH |
> +-------------------+
>         ^ ^ ^
>         | | |
>       +-+ | +-------------------------------+
>       |   +---------------+                 |
>       |                   |                 |
>       V                   V                 V
> +---------------+ +---------------+ +---------------+
> | Crypto HW PMD | | Crypto HW PMD | | Crypto HW PMD |
> +---------------+ +---------------+ +---------------+
> 
> Another use case we hope to support is migration of processing from one
> device to another where a hw and sw crypto pmd can be bound to the same
> crypto scheduler and the crypto processing could be  transparently migrated
> from the hw to sw pmd. This would allow for hw accelerators to be
> hot-plugged attached/detached in a Guess VM
> 
> +----------------+
> | Crypto Sch PMD |
> |                |
> | MIGRATION SCH  |
> +----------------+
>       | |
>       | +-----------------+
>       |                   |
>       V                   V
> +---------------+ +---------------+
> | Crypto HW PMD | | Crypto SW PMD |
> |   (Active)    | |   (Inactive)  |
> +---------------+ +---------------+
> 
> The main point is that isn't envisaged as just a mechanism for scheduling
> crypto work loads across multiple cores, but a framework for allowing
> different scheduling mechanisms to be introduced, to handle different crypto
> scheduling problems, and done so in a way which  is completely transparent
> to the data path of an application. Like the eth bonding driver we want to
> support creating the crypto scheduler from EAL options, which allow
> specification of the scheduling mode and the crypto pmds which are to be
> bound to that crypto scheduler.
> 
> 
I get what it's for, that much is pretty clear.  But whereas the bonding driver
benefits from creating a single device interface for the purposes of properly
routing traffic through the network stack without exposing that complexity to
the using application, this pmd provides only aggregation according to various
policies.  This is exactly what the distributor library was built for, and it
seems like a re-invention of the wheel to ignore that.  At the very least, you
should implement this pmd on top of the distributor library.  If that is
impractical, then I somewhat question why we have the distributor library at
all.

Neil
  
Bruce Richardson Dec. 7, 2016, 2:46 p.m. UTC | #7
> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Wednesday, December 7, 2016 2:17 PM
> To: Doherty, Declan <declan.doherty@intel.com>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Thomas Monjalon
> <thomas.monjalon@6wind.com>; Zhang, Roy Fan <roy.fan.zhang@intel.com>;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] Scheduler: add driver for scheduler crypto
> pmd
> 
> On Wed, Dec 07, 2016 at 12:42:15PM +0000, Declan Doherty wrote:
> > On 05/12/16 15:12, Neil Horman wrote:
> > > On Fri, Dec 02, 2016 at 04:22:16PM +0000, Declan Doherty wrote:
> > > > On 02/12/16 14:57, Bruce Richardson wrote:
> > > > > On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:
> > > > > > 2016-12-02 14:15, Fan Zhang:
> > > > > > > This patch provides the initial implementation of the
> > > > > > > scheduler poll mode driver using DPDK cryptodev framework.
> > > > > > >
> > > > > > > Scheduler PMD is used to schedule and enqueue the crypto ops
> > > > > > > to the hardware and/or software crypto devices attached to
> > > > > > > it (slaves). The dequeue operation from the slave(s), and
> > > > > > > the possible dequeued crypto op reordering, are then carried
> out by the scheduler.
> > > > > > >
> > > > > > > The scheduler PMD can be used to fill the throughput gap
> > > > > > > between the physical core and the existing cryptodevs to
> > > > > > > increase the overall performance. For example, if a physical
> > > > > > > core has higher crypto op processing rate than a cryptodev,
> > > > > > > the scheduler PMD can be introduced to attach more than one
> cryptodevs.
> > > > > > >
> > > > > > > This initial implementation is limited to supporting the
> > > > > > > following scheduling modes:
> > > > > > >
> > > > > > > - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst
> attached software
> > > > > > >     slave cryptodevs, to set this mode, the scheduler should
> have been
> > > > > > >     attached 1 or more software cryptodevs.
> > > > > > >
> > > > > > > - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst
> attached hardware
> > > > > > >     slave cryptodevs (QAT), to set this mode, the scheduler
> should have
> > > > > > >     been attached 1 or more QATs.
> > > > > >
> > > > > > Could it be implemented on top of the eventdev API?
> > > > > >
> > > > > Not really. The eventdev API is for different types of
> > > > > scheduling between multiple sources that are all polling for
> > > > > packets, compared to this, which is more analgous - as I
> > > > > understand it - to the bonding PMD for ethdev.
> > > > >
> > > > > To make something like this work with an eventdev API you would
> > > > > need to use one of the following models:
> > > > > * have worker cores for offloading packets to the different crypto
> > > > >   blocks pulling from the eventdev APIs. This would make it
> difficult to
> > > > >   do any "smart" scheduling of crypto operations between the
> blocks,
> > > > >   e.g. that one crypto instance may be better at certain types of
> > > > >   operations than another.
> > > > > * move the logic in this driver into an existing eventdev
> instance,
> > > > >   which uses the eventdev api rather than the crypto APIs and so
> has an
> > > > >   extra level of "structure abstraction" that has to be worked
> though.
> > > > >   It's just not really a good fit.
> > > > >
> > > > > So for this workload, I believe the pseudo-cryptodev instance is
> > > > > the best way to go.
> > > > >
> > > > > /Bruce
> > > > >
> > > >
> > > >
> > > > As Bruce says this is much more analogous to the ethdev bonding
> > > > driver, the main idea is to allow different crypto op scheduling
> > > > mechanisms to be defined transparently to an application. This
> > > > could be load-balancing across multiple hw crypto devices, or
> > > > having a software crypto device to act as a backup device for a hw
> > > > accelerator if it becomes oversubscribed. I think the main
> > > > advantage of a crypto-scheduler approach means that the data path
> > > > of the application doesn't need to have any knowledge that
> > > > scheduling is happening at all, it is just using a different crypto
> device id, which is then manages the distribution of crypto work.
> > > >
> > > >
> > > >
> > > This is a good deal like the bonding pmd, and so from a certain
> > > standpoint it makes sense to do this, but whereas the bonding pmd is
> > > meant to create a single path to a logical network over several
> > > physical networks, this pmd really only focuses on maximizing
> > > througput, and for that we already have tools.  As Thomas mentions,
> > > there is the eventdev library, but from my view the distributor
> > > library already fits this bill.  It already is a basic framework to
> > > process mbufs in parallel according to whatever policy you want to
> implement, which sounds like exactly what the goal of this pmd is.
> > >
> > > Neil
> > >
> > >
> >
> > Hey Neil,
> >
> > this is actually intended to act and look a good deal like the
> > ethernet bonding device but to handling the crypto scheduling use cases.
> >
> > For example, take the case where multiple hw accelerators may be
> available.
> > We want to provide user applications with a mechanism to transparently
> > balance work across all devices without having to manage the load
> > balancing details or the guaranteeing of ordering of the processed ops
> > on the dequeue_burst side. In this case the application would just use
> > the crypto dev_id of the scheduler and it would look after balancing
> > the workload across the available hw accelerators.
> >
> >
> > +-------------------+
> > |  Crypto Sch PMD   |
> > |                   |
> > | ORDERING / RR SCH |
> > +-------------------+
> >         ^ ^ ^
> >         | | |
> >       +-+ | +-------------------------------+
> >       |   +---------------+                 |
> >       |                   |                 |
> >       V                   V                 V
> > +---------------+ +---------------+ +---------------+
> > | Crypto HW PMD | | Crypto HW PMD | | Crypto HW PMD |
> > +---------------+ +---------------+ +---------------+
> >
> > Another use case we hope to support is migration of processing from
> > one device to another where a hw and sw crypto pmd can be bound to the
> > same crypto scheduler and the crypto processing could be
> > transparently migrated from the hw to sw pmd. This would allow for hw
> > accelerators to be hot-plugged attached/detached in a Guess VM
> >
> > +----------------+
> > | Crypto Sch PMD |
> > |                |
> > | MIGRATION SCH  |
> > +----------------+
> >       | |
> >       | +-----------------+
> >       |                   |
> >       V                   V
> > +---------------+ +---------------+
> > | Crypto HW PMD | | Crypto SW PMD |
> > |   (Active)    | |   (Inactive)  |
> > +---------------+ +---------------+
> >
> > The main point is that isn't envisaged as just a mechanism for
> > scheduling crypto work loads across multiple cores, but a framework
> > for allowing different scheduling mechanisms to be introduced, to
> > handle different crypto scheduling problems, and done so in a way
> > which  is completely transparent to the data path of an application.
> > Like the eth bonding driver we want to support creating the crypto
> > scheduler from EAL options, which allow specification of the
> > scheduling mode and the crypto pmds which are to be bound to that crypto
> scheduler.
> >
> >
> I get what its for, that much is pretty clear.  But whereas the bonding
> driver benefits from creating a single device interface for the purposes
> of properly routing traffic through the network stack without exposing
> that complexity to the using application, this pmd provides only
> aggregation accoring to various policies.  This is exactly what the
> distributor library was built for, and it seems like a re-invention of the
> wheel to ignore that.  At the very least, you should implement this pmd on
> top of the distributor library.  If that is impracitcal, then I somewhat
> question why we have the distributor library at all.
> 
> Neil
> 

Hi Neil,

The distributor library and the eventdev framework are not the solution here, as, firstly, the crypto devices are not cores, in the same way that ethdevs are not cores, and the distributor library is for evenly distributing work among cores. Sure, some crypto implementations may be software only, but many aren't, and those that are software still appear as a device to software that must be used like they were a HW device. In the same way that to use distributor to load balance traffic between various TX ports is not a suitable solution - because you need to use cores to do the work "bridging" between the distributor/eventdev and the ethdev device, similarly here, if we distribute traffic using the distributor, you need cores to pull those packets from the distributor and offload them to the crypto devices. To use the distributor library in place of this vpmd, we'd need crypto devices which are aware of how to talk to the distributor, and use its protocols for pushing/pulling packets, or else we are pulling in extra core cycles to do bridging work.

Secondly, the distributor and eventdev libraries are designed for doing flow based (generally atomic) packet distribution. Load balancing between crypto devices is not generally based on flows, but rather on other factors like packet size, offload cost per device, etc. To distributor/eventdev, all workers are equal, but for working with devices, for crypto offload or nic transmission, that is plainly not the case. In short the distribution problems that are being solved by distributor and eventdev libraries are fundamentally different than those being solved by this vpmd. They would be the wrong tool for the job.

I would agree with the previous statements that this driver is far closer in functionality to the bonded ethdev driver than anything else. It makes multiple devices appear as a single one while hiding the complexity of the multiple devices from the application using it. In the same way as the bonded ethdev driver has different modes for active-backup, and for active-active for increased throughput, this vpmd for crypto can have the exact same modes - multiple active bonded devices for higher performance operation, or two devices in active backup to enable migration when using SR-IOV as described by Declan above.

Regards,
/Bruce
  
Doherty, Declan Dec. 7, 2016, 4:04 p.m. UTC | #8
On 07/12/16 14:46, Richardson, Bruce wrote:
>

>

>> -----Original Message-----

>> From: Neil Horman [mailto:nhorman@tuxdriver.com]

>> Sent: Wednesday, December 7, 2016 2:17 PM

>> To: Doherty, Declan <declan.doherty@intel.com>

>> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Thomas Monjalon

>> <thomas.monjalon@6wind.com>; Zhang, Roy Fan <roy.fan.zhang@intel.com>;

>> dev@dpdk.org

>> Subject: Re: [dpdk-dev] [PATCH] Scheduler: add driver for scheduler crypto

>> pmd

>>

>> On Wed, Dec 07, 2016 at 12:42:15PM +0000, Declan Doherty wrote:

>>> On 05/12/16 15:12, Neil Horman wrote:

>>>> On Fri, Dec 02, 2016 at 04:22:16PM +0000, Declan Doherty wrote:

>>>>> On 02/12/16 14:57, Bruce Richardson wrote:

>>>>>> On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:

>>>>>>> 2016-12-02 14:15, Fan Zhang:

>>>>>>>> This patch provides the initial implementation of the

>>>>>>>> scheduler poll mode driver using DPDK cryptodev framework.

>>>>>>>>

>>>>>>>> Scheduler PMD is used to schedule and enqueue the crypto ops

>>>>>>>> to the hardware and/or software crypto devices attached to

>>>>>>>> it (slaves). The dequeue operation from the slave(s), and

>>>>>>>> the possible dequeued crypto op reordering, are then carried

>> out by the scheduler.

>>>>>>>>

>>>>>>>> The scheduler PMD can be used to fill the throughput gap

>>>>>>>> between the physical core and the existing cryptodevs to

>>>>>>>> increase the overall performance. For example, if a physical

>>>>>>>> core has higher crypto op processing rate than a cryptodev,

>>>>>>>> the scheduler PMD can be introduced to attach more than one

>> cryptodevs.

>>>>>>>>

>>>>>>>> This initial implementation is limited to supporting the

>>>>>>>> following scheduling modes:

>>>>>>>>

>>>>>>>> - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst

>> attached software

>>>>>>>>     slave cryptodevs, to set this mode, the scheduler should

>> have been

>>>>>>>>     attached 1 or more software cryptodevs.

>>>>>>>>

>>>>>>>> - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst

>> attached hardware

>>>>>>>>     slave cryptodevs (QAT), to set this mode, the scheduler

>> should have

>>>>>>>>     been attached 1 or more QATs.

>>>>>>>

>>>>>>> Could it be implemented on top of the eventdev API?

>>>>>>>

>>>>>> Not really. The eventdev API is for different types of

>>>>>> scheduling between multiple sources that are all polling for

>>>>>> packets, compared to this, which is more analgous - as I

>>>>>> understand it - to the bonding PMD for ethdev.

>>>>>>

>>>>>> To make something like this work with an eventdev API you would

>>>>>> need to use one of the following models:

>>>>>> * have worker cores for offloading packets to the different crypto

>>>>>>   blocks pulling from the eventdev APIs. This would make it

>> difficult to

>>>>>>   do any "smart" scheduling of crypto operations between the

>> blocks,

>>>>>>   e.g. that one crypto instance may be better at certain types of

>>>>>>   operations than another.

>>>>>> * move the logic in this driver into an existing eventdev

>> instance,

>>>>>>   which uses the eventdev api rather than the crypto APIs and so

>> has an

>>>>>>   extra level of "structure abstraction" that has to be worked

>> though.

>>>>>>   It's just not really a good fit.

>>>>>>

>>>>>> So for this workload, I believe the pseudo-cryptodev instance is

>>>>>> the best way to go.

>>>>>>

>>>>>> /Bruce

>>>>>>

>>>>>

>>>>>

>>>>> As Bruce says this is much more analogous to the ethdev bonding

>>>>> driver, the main idea is to allow different crypto op scheduling

>>>>> mechanisms to be defined transparently to an application. This

>>>>> could be load-balancing across multiple hw crypto devices, or

>>>>> having a software crypto device to act as a backup device for a hw

>>>>> accelerator if it becomes oversubscribed. I think the main

>>>>> advantage of a crypto-scheduler approach means that the data path

>>>>> of the application doesn't need to have any knowledge that

>>>>> scheduling is happening at all, it is just using a different crypto

>> device id, which is then manages the distribution of crypto work.

>>>>>

>>>>>

>>>>>

>>>> This is a good deal like the bonding pmd, and so from a certain

>>>> standpoint it makes sense to do this, but whereas the bonding pmd is

>>>> meant to create a single path to a logical network over several

>>>> physical networks, this pmd really only focuses on maximizing

>>>> througput, and for that we already have tools.  As Thomas mentions,

>>>> there is the eventdev library, but from my view the distributor

>>>> library already fits this bill.  It already is a basic framework to

>>>> process mbufs in parallel according to whatever policy you want to

>> implement, which sounds like exactly what the goal of this pmd is.

>>>>

>>>> Neil

>>>>

>>>>

>>>

>>> Hey Neil,

>>>

>>> this is actually intended to act and look a good deal like the

>>> ethernet bonding device but to handling the crypto scheduling use cases.

>>>

>>> For example, take the case where multiple hw accelerators may be

>> available.

>>> We want to provide user applications with a mechanism to transparently

>>> balance work across all devices without having to manage the load

>>> balancing details or the guaranteeing of ordering of the processed ops

>>> on the dequeue_burst side. In this case the application would just use

>>> the crypto dev_id of the scheduler and it would look after balancing

>>> the workload across the available hw accelerators.

>>>

>>>

>>> +-------------------+

>>> |  Crypto Sch PMD   |

>>> |                   |

>>> | ORDERING / RR SCH |

>>> +-------------------+

>>>         ^ ^ ^

>>>         | | |

>>>       +-+ | +-------------------------------+

>>>       |   +---------------+                 |

>>>       |                   |                 |

>>>       V                   V                 V

>>> +---------------+ +---------------+ +---------------+

>>> | Crypto HW PMD | | Crypto HW PMD | | Crypto HW PMD |

>>> +---------------+ +---------------+ +---------------+

>>>

>>> Another use case we hope to support is migration of processing from

>>> one device to another where a hw and sw crypto pmd can be bound to the

>>> same crypto scheduler and the crypto processing could be

>>> transparently migrated from the hw to sw pmd. This would allow for hw

>>> accelerators to be hot-plugged attached/detached in a Guess VM

>>>

>>> +----------------+

>>> | Crypto Sch PMD |

>>> |                |

>>> | MIGRATION SCH  |

>>> +----------------+

>>>       | |

>>>       | +-----------------+

>>>       |                   |

>>>       V                   V

>>> +---------------+ +---------------+

>>> | Crypto HW PMD | | Crypto SW PMD |

>>> |   (Active)    | |   (Inactive)  |

>>> +---------------+ +---------------+

>>>

>>> The main point is that isn't envisaged as just a mechanism for

>>> scheduling crypto work loads across multiple cores, but a framework

>>> for allowing different scheduling mechanisms to be introduced, to

>>> handle different crypto scheduling problems, and done so in a way

>>> which  is completely transparent to the data path of an application.

>>> Like the eth bonding driver we want to support creating the crypto

>>> scheduler from EAL options, which allow specification of the

>>> scheduling mode and the crypto pmds which are to be bound to that crypto

>> scheduler.

>>>

>>>

>> I get what its for, that much is pretty clear.  But whereas the bonding

>> driver benefits from creating a single device interface for the purposes

>> of properly routing traffic through the network stack without exposing

>> that complexity to the using application, this pmd provides only

>> aggregation accoring to various policies.  This is exactly what the

>> distributor library was built for, and it seems like a re-invention of the

>> wheel to ignore that.  At the very least, you should implement this pmd on

>> top of the distributor library.  If that is impracitcal, then I somewhat

>> question why we have the distributor library at all.

>>

>> Neil

>>

>

> Hi Neil,

>

> The distributor library, and the eventdev framework are not the solution here, as, firstly, the crypto devices are not cores, in the same way that ethdev's are not cores, and the distributor library is for evenly distributing work among cores. Sure, some crypto implementations may be software only, but many aren't, and those that are software still appear as a device to software that must be used like they were a HW device. In the same way that to use distributor to load balance traffic between various TX ports is not a suitable solution - because you need to use cores to do the work "bridging" between the distributor/eventdev and the ethdev device, similarly here, if we distribute traffic using the distributor, you need cores to pull those packets from the distributor and offload them to the crypto devices. To use the distributor library in place of this vpmd, we'd need crypto devices which are aware of how to talk to the distributor, and use it's protocols for pushing/pulling packets, or else we are pulling in extra core cycles to do bridging work.

>

> Secondly, the distributor and eventdev libraries are designed for doing flow based (generally atomic) packet distribution. Load balancing between crypto devices is not generally based on flows, but rather on other factors like packet size, offload cost per device, etc. To distributor/eventdev, all workers are equal, but for working with devices, for crypto offload or nic transmission, that is plainly not the case. In short the distribution problems that are being solved by distributor and eventdev libraries are fundamentally different than those being solved by this vpmd. They would be the wrong tool for the job.

>

> I would agree with the previous statements that this driver is far closer in functionality to the bonded ethdev driver than anything else. It makes multiple devices appear as a single one while hiding the complexity of the multiple devices to the using application. In the same way as the bonded ethdev driver has different modes for active-backup, and for active-active for increased throughput, this vpmd for crypto can have the exact same modes - multiple active bonded devices for higher performance operation, or two devices in active backup to enable migration when using SR-IOV as described by Declan above.

>

> Regards,

> /Bruce

>


I think that having "scheduler" in the pmd name here may be somewhat of a
loaded term and is muddying the waters of the problem we are trying to
address; if we were to rename this to crypto_bond_pmd it may make the
intent of what we want this pmd to achieve clearer.

Neil, in most of the initial scheduling use cases we want to address
with this pmd, we are looking to schedule, within the context of a
single lcore, on multiple hw accelerators or a mix of hw accelerators
and sw pmds, and therefore using the distributor or the eventdev
wouldn't add a lot of value.

Declan
  
Neil Horman Dec. 8, 2016, 2:57 p.m. UTC | #9
On Wed, Dec 07, 2016 at 04:04:17PM +0000, Declan Doherty wrote:
> On 07/12/16 14:46, Richardson, Bruce wrote:
> > 
> > 
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Wednesday, December 7, 2016 2:17 PM
> > > To: Doherty, Declan <declan.doherty@intel.com>
> > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Thomas Monjalon
> > > <thomas.monjalon@6wind.com>; Zhang, Roy Fan <roy.fan.zhang@intel.com>;
> > > dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] Scheduler: add driver for scheduler crypto
> > > pmd
> > > 
> > > On Wed, Dec 07, 2016 at 12:42:15PM +0000, Declan Doherty wrote:
> > > > On 05/12/16 15:12, Neil Horman wrote:
> > > > > On Fri, Dec 02, 2016 at 04:22:16PM +0000, Declan Doherty wrote:
> > > > > > On 02/12/16 14:57, Bruce Richardson wrote:
> > > > > > > On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:
> > > > > > > > 2016-12-02 14:15, Fan Zhang:
> > > > > > > > > This patch provides the initial implementation of the
> > > > > > > > > scheduler poll mode driver using DPDK cryptodev framework.
> > > > > > > > > 
> > > > > > > > > Scheduler PMD is used to schedule and enqueue the crypto ops
> > > > > > > > > to the hardware and/or software crypto devices attached to
> > > > > > > > > it (slaves). The dequeue operation from the slave(s), and
> > > > > > > > > the possible dequeued crypto op reordering, are then carried
> > > out by the scheduler.
> > > > > > > > > 
> > > > > > > > > The scheduler PMD can be used to fill the throughput gap
> > > > > > > > > between the physical core and the existing cryptodevs to
> > > > > > > > > increase the overall performance. For example, if a physical
> > > > > > > > > core has higher crypto op processing rate than a cryptodev,
> > > > > > > > > the scheduler PMD can be introduced to attach more than one
> > > cryptodevs.
> > > > > > > > > 
> > > > > > > > > This initial implementation is limited to supporting the
> > > > > > > > > following scheduling modes:
> > > > > > > > > 
> > > > > > > > > - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst
> > > attached software
> > > > > > > > >     slave cryptodevs, to set this mode, the scheduler should
> > > have been
> > > > > > > > >     attached 1 or more software cryptodevs.
> > > > > > > > > 
> > > > > > > > > - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst
> > > attached hardware
> > > > > > > > >     slave cryptodevs (QAT), to set this mode, the scheduler
> > > should have
> > > > > > > > >     been attached 1 or more QATs.
> > > > > > > > 
> > > > > > > > Could it be implemented on top of the eventdev API?
> > > > > > > > 
> > > > > > > Not really. The eventdev API is for different types of
> > > > > > > scheduling between multiple sources that are all polling for
> > > > > > > packets, compared to this, which is more analgous - as I
> > > > > > > understand it - to the bonding PMD for ethdev.
> > > > > > > 
> > > > > > > To make something like this work with an eventdev API you would
> > > > > > > need to use one of the following models:
> > > > > > > * have worker cores for offloading packets to the different crypto
> > > > > > >   blocks pulling from the eventdev APIs. This would make it
> > > difficult to
> > > > > > >   do any "smart" scheduling of crypto operations between the
> > > blocks,
> > > > > > >   e.g. that one crypto instance may be better at certain types of
> > > > > > >   operations than another.
> > > > > > > * move the logic in this driver into an existing eventdev
> > > instance,
> > > > > > >   which uses the eventdev api rather than the crypto APIs and so
> > > has an
> > > > > > >   extra level of "structure abstraction" that has to be worked
> > > though.
> > > > > > >   It's just not really a good fit.
> > > > > > > 
> > > > > > > So for this workload, I believe the pseudo-cryptodev instance is
> > > > > > > the best way to go.
> > > > > > > 
> > > > > > > /Bruce
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > > > As Bruce says this is much more analogous to the ethdev bonding
> > > > > > driver, the main idea is to allow different crypto op scheduling
> > > > > > mechanisms to be defined transparently to an application. This
> > > > > > could be load-balancing across multiple hw crypto devices, or
> > > > > > having a software crypto device to act as a backup device for a hw
> > > > > > accelerator if it becomes oversubscribed. I think the main
> > > > > > advantage of a crypto-scheduler approach means that the data path
> > > > > > of the application doesn't need to have any knowledge that
> > > > > > scheduling is happening at all, it is just using a different crypto
> > > device id, which is then manages the distribution of crypto work.
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > This is a good deal like the bonding pmd, and so from a certain
> > > > > standpoint it makes sense to do this, but whereas the bonding pmd is
> > > > > meant to create a single path to a logical network over several
> > > > > physical networks, this pmd really only focuses on maximizing
> > > > > througput, and for that we already have tools.  As Thomas mentions,
> > > > > there is the eventdev library, but from my view the distributor
> > > > > library already fits this bill.  It already is a basic framework to
> > > > > process mbufs in parallel according to whatever policy you want to
> > > implement, which sounds like exactly what the goal of this pmd is.
> > > > > 
> > > > > Neil
> > > > > 
> > > > > 
> > > > 
> > > > Hey Neil,
> > > > 
> > > > this is actually intended to act and look a good deal like the
> > > > ethernet bonding device but to handling the crypto scheduling use cases.
> > > > 
> > > > For example, take the case where multiple hw accelerators may be
> > > available.
> > > > We want to provide user applications with a mechanism to transparently
> > > > balance work across all devices without having to manage the load
> > > > balancing details or the guaranteeing of ordering of the processed ops
> > > > on the dequeue_burst side. In this case the application would just use
> > > > the crypto dev_id of the scheduler and it would look after balancing
> > > > the workload across the available hw accelerators.
> > > > 
> > > > 
> > > > +-------------------+
> > > > |  Crypto Sch PMD   |
> > > > |                   |
> > > > | ORDERING / RR SCH |
> > > > +-------------------+
> > > >         ^ ^ ^
> > > >         | | |
> > > >       +-+ | +-------------------------------+
> > > >       |   +---------------+                 |
> > > >       |                   |                 |
> > > >       V                   V                 V
> > > > +---------------+ +---------------+ +---------------+
> > > > | Crypto HW PMD | | Crypto HW PMD | | Crypto HW PMD |
> > > > +---------------+ +---------------+ +---------------+
> > > > 
> > > > Another use case we hope to support is migration of processing from
> > > > one device to another where a hw and sw crypto pmd can be bound to the
> > > > same crypto scheduler and the crypto processing could be
> > > > transparently migrated from the hw to sw pmd. This would allow for hw
> > > > accelerators to be hot-plug attached/detached in a Guest VM.
> > > > 
> > > > +----------------+
> > > > | Crypto Sch PMD |
> > > > |                |
> > > > | MIGRATION SCH  |
> > > > +----------------+
> > > >       | |
> > > >       | +-----------------+
> > > >       |                   |
> > > >       V                   V
> > > > +---------------+ +---------------+
> > > > | Crypto HW PMD | | Crypto SW PMD |
> > > > |   (Active)    | |   (Inactive)  |
> > > > +---------------+ +---------------+
> > > > 
> > > > The main point is that this isn't envisaged as just a mechanism for
> > > > scheduling crypto workloads across multiple cores, but a framework
> > > > for allowing different scheduling mechanisms to be introduced, to
> > > > handle different crypto scheduling problems, and done so in a way
> > > > which  is completely transparent to the data path of an application.
> > > > Like the eth bonding driver we want to support creating the crypto
> > > > scheduler from EAL options, which allow specification of the
> > > > scheduling mode and the crypto pmds which are to be bound to that crypto
> > > scheduler.
> > > > 
> > > > 
> > > I get what it's for, that much is pretty clear.  But whereas the bonding
> > > driver benefits from creating a single device interface for the purposes
> > > of properly routing traffic through the network stack without exposing
> > > that complexity to the using application, this pmd provides only
> > > aggregation according to various policies.  This is exactly what the
> > > distributor library was built for, and it seems like a re-invention of the
> > > wheel to ignore that.  At the very least, you should implement this pmd on
> > > top of the distributor library.  If that is impractical, then I somewhat
> > > question why we have the distributor library at all.
> > > 
> > > Neil
> > > 
> > 
> > Hi Neil,
> > 
> > The distributor library and the eventdev framework are not the solution here, as, firstly, the crypto devices are not cores, in the same way that ethdevs are not cores, and the distributor library is for evenly distributing work among cores. Sure, some crypto implementations may be software only, but many aren't, and those that are software still appear as devices that must be used as if they were HW devices. In the same way that using the distributor to load balance traffic between various TX ports is not a suitable solution - because you need cores to do the work of "bridging" between the distributor/eventdev and the ethdev device - similarly here, if we distribute traffic using the distributor, you need cores to pull those packets from the distributor and offload them to the crypto devices. To use the distributor library in place of this vpmd, we'd need crypto devices which are aware of how to talk to the distributor, and use its protocols for pushing/pulling packets, or else we are pulling in extra core cycles to do bridging work.
> > 
> > Secondly, the distributor and eventdev libraries are designed for doing flow-based (generally atomic) packet distribution. Load balancing between crypto devices is not generally based on flows, but rather on other factors like packet size, offload cost per device, etc. To distributor/eventdev, all workers are equal, but for working with devices, for crypto offload or nic transmission, that is plainly not the case. In short, the distribution problems being solved by the distributor and eventdev libraries are fundamentally different from those being solved by this vpmd. They would be the wrong tool for the job.
> > 
> > I would agree with the previous statements that this driver is far closer in functionality to the bonded ethdev driver than anything else. It makes multiple devices appear as a single one while hiding the complexity of the multiple devices from the using application. In the same way as the bonded ethdev driver has different modes for active-backup and for active-active for increased throughput, this vpmd for crypto can have the exact same modes - multiple active bonded devices for higher performance operation, or two devices in active backup to enable migration when using SR-IOV as described by Declan above.
> > 
> > Regards,
> > /Bruce
> > 
> 
> I think that having "scheduler" in the pmd name here may be somewhat of a
> loaded term and is muddying the waters of the problem we are trying to
> address. If we were to rename this to crypto_bond_pmd, it may make our
> intent for what we want this pmd to achieve clearer.
> 
> Neil, in most of the scheduling use cases we want to address with this pmd
> initially, we are looking to schedule within the context of a single lcore
> on multiple hw accelerators, or a mix of hw accelerators and sw pmds, and
> therefore using the distributor or the eventdev wouldn't add a lot of
> value.
> 
> Declan

Ok, these are fair points, and I'll concede to them.  That said, it still seems
like a waste to me to ignore the 80% functionality overlap to be had here.  That
is to say, the distributor library does a lot of work that both this pmd and the
bonding pmd could benefit from.  Perhaps it's worth looking at how to enhance the
distributor library such that worker tasks can be affined to a single cpu, and
the worker assignment can be used as indexed device assignment (the idea being
that a single worker task might represent multiple worker ids in the distributor
library).  That way such a crypto aggregator pmd's or the bonding pmd's
implementation is little more than setting tags in mbufs according to the
appropriate policy.

Neil
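
To make the data-path transparency discussed in this thread concrete, below is a minimal, hypothetical application-side sketch against the API proposed in the patch that follows (function names spelled as in the patch). The device IDs are placeholders, and device/queue-pair configuration, session setup and error handling are omitted.

#include <rte_crypto.h>
#include <rte_cryptodev.h>
#include <rte_cryptodev_scheduler.h>

/* Hypothetical IDs: sched_id is the scheduler vdev, qat0/qat1 are two
 * already-configured hardware slave cryptodevs.
 */
static int
setup_scheduler(uint8_t sched_id, uint8_t qat0, uint8_t qat1)
{
	int ret;

	/* bind both hardware devices to the scheduler */
	ret = rte_cryptodev_scheduler_attach_dev(sched_id, qat0);
	if (ret < 0)
		return ret;
	ret = rte_cryptodev_scheduler_attach_dev(sched_id, qat1);
	if (ret < 0)
		return ret;

	/* round-robin the enqueued ops over the attached HW slaves */
	return rte_crpytodev_scheduler_set_mode(sched_id,
			CRYPTO_SCHED_HW_ROUND_ROBIN_MODE);
}

/* The data path only ever sees sched_id, exactly as with any single cryptodev. */
static uint16_t
process_burst(uint8_t sched_id, uint16_t qp_id,
		struct rte_crypto_op **ops, uint16_t nb_ops)
{
	uint16_t nb_enq = rte_cryptodev_enqueue_burst(sched_id, qp_id,
			ops, nb_ops);

	return rte_cryptodev_dequeue_burst(sched_id, qp_id, ops, nb_enq);
}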
  

Patch

diff --git a/config/common_base b/config/common_base
index 4bff83a..79d120d 100644
--- a/config/common_base
+++ b/config/common_base
@@ -400,6 +400,12 @@  CONFIG_RTE_LIBRTE_PMD_KASUMI=n
 CONFIG_RTE_LIBRTE_PMD_KASUMI_DEBUG=n
 
 #
+# Compile PMD for Crypto Scheduler device
+#
+CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER=n
+CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER_DEBUG=n
+
+#
 # Compile PMD for ZUC device
 #
 CONFIG_RTE_LIBRTE_PMD_ZUC=n
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index 745c614..cdd3c94 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -38,6 +38,7 @@  DIRS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_SNOW3G) += snow3g
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_KASUMI) += kasumi
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_ZUC) += zuc
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += scheduler
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO) += null
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/crypto/scheduler/Makefile b/drivers/crypto/scheduler/Makefile
new file mode 100644
index 0000000..d8e1ff5
--- /dev/null
+++ b/drivers/crypto/scheduler/Makefile
@@ -0,0 +1,64 @@ 
+#   BSD LICENSE
+#
+#   Copyright(c) 2015 Intel Corporation. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pmd_crypto_scheduler.a
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library version
+LIBABIVER := 1
+
+# versioning export map
+EXPORT_MAP := rte_pmd_crypto_scheduler_version.map
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_cryptodev_scheduler.h
+
+# library source files
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += scheduler_pmd.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += scheduler_pmd_ops.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += rte_cryptodev_scheduler.c
+
+# library dependencies
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_mempool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_ring
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_reorder
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_cryptodev
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/crypto/scheduler/rte_cryptodev_scheduler.c b/drivers/crypto/scheduler/rte_cryptodev_scheduler.c
new file mode 100644
index 0000000..e04596c
--- /dev/null
+++ b/drivers/crypto/scheduler/rte_cryptodev_scheduler.c
@@ -0,0 +1,387 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <rte_jhash.h>
+#include <rte_reorder.h>
+#include <rte_cryptodev.h>
+#include <rte_cryptodev_pmd.h>
+#include <rte_cryptodev_scheduler.h>
+
+#include "scheduler_pmd_private.h"
+
+static int
+request_qp(uint8_t dev_id) {
+	struct rte_cryptodev_info dev_info;
+	struct slave_info key = {dev_id, 0};
+	uint16_t i;
+
+	if (!dev_qp_map) {
+		struct rte_hash_parameters hash_param = {0};
+
+		hash_param.entries = 1024;
+		hash_param.key_len = sizeof(key);
+		hash_param.hash_func = rte_jhash;
+		hash_param.hash_func_init_val = 0;
+		hash_param.socket_id = SOCKET_ID_ANY;
+
+		dev_qp_map = rte_hash_create(&hash_param);
+		if (!dev_qp_map) {
+			CS_LOG_ERR("not enough memory to create hash table");
+			return -ENOMEM;
+		}
+	}
+
+	rte_cryptodev_info_get(dev_id, &dev_info);
+
+	for (i = 0; i < dev_info.max_nb_queue_pairs; i++) {
+		key.qp_id = i;
+
+		if (rte_hash_lookup_data(dev_qp_map, (void *)&key,
+			NULL) == 0)
+			continue;
+
+		if (rte_hash_add_key(dev_qp_map, &key) < 0) {
+			CS_LOG_ERR("not enough memory to insert hash "
+				"entry");
+			return -ENOMEM;
+		}
+
+		break;
+	}
+
+	if (i == dev_info.max_nb_queue_pairs) {
+		CS_LOG_ERR("all queue pairs of cdev %u have already been "
+			"occupied", dev_id);
+		return -1;
+	}
+
+	return i;
+}
+
+static int
+update_reorder_buff(uint8_t dev_id, struct scheduler_private *internal)
+{
+	char reorder_buff_name[32];
+	uint32_t reorder_buff_size = (internal->nb_slaves[SCHED_HW_CDEV] +
+			internal->nb_slaves[SCHED_SW_CDEV]) *
+			PER_SLAVE_BUFF_SIZE;
+
+	if (!internal->use_reorder)
+		return 0;
+
+	if (reorder_buff_size == 0) {
+		if (internal->reorder_buff)
+			rte_reorder_free(internal->reorder_buff);
+		internal->reorder_buff = NULL;
+		return 0;
+	}
+
+	if (internal->reorder_buff)
+		rte_reorder_free(internal->reorder_buff);
+
+	if (snprintf(reorder_buff_name, RTE_CRYPTODEV_NAME_MAX_LEN,
+		"%s_rb_%u", RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),
+		dev_id) < 0) {
+		CS_LOG_ERR("failed to create unique reorder buffer name");
+		return -EFAULT;
+	}
+
+	internal->reorder_buff = rte_reorder_create(
+		reorder_buff_name, rte_socket_id(),
+		reorder_buff_size);
+
+	if (internal->reorder_buff == NULL) {
+		CS_LOG_ERR("failed to allocate reorder buffer");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/** update the scheduler pmd's capabilities with the attaching device's
+ *  capabilities.
+ *  After each device is attached, the scheduler's capabilities should be
+ *  the common capability set of all attached slaves.
+ **/
+static int
+update_sched_capabilities(struct scheduler_private *internal,
+	const struct rte_cryptodev_capabilities *attach_caps)
+{
+	struct rte_cryptodev_capabilities *cap;
+	const struct rte_cryptodev_capabilities *a_cap;
+	uint32_t nb_caps = 0;
+	uint32_t nb_attached_caps = 0, nb_common_caps;
+	uint32_t cap_size = sizeof(struct rte_cryptodev_capabilities);
+	uint32_t i;
+
+	/* find out how many caps the scheduler already has */
+	while (internal->capabilities[nb_attached_caps].op !=
+		RTE_CRYPTO_OP_TYPE_UNDEFINED)
+		nb_attached_caps++;
+
+	/* find out how many capabilities the cdev-to-be-attached has */
+	while (attach_caps[nb_caps].op != RTE_CRYPTO_OP_TYPE_UNDEFINED)
+		nb_caps++;
+
+	nb_common_caps = nb_attached_caps;
+
+	/* init, memcpy whole */
+	if (nb_attached_caps == 0) {
+		if (nb_caps > MAX_CAP_NUM) {
+			CS_LOG_ERR("too many capability items");
+			return -ENOMEM;
+		}
+
+		memset(internal->capabilities, 0, cap_size * MAX_CAP_NUM);
+
+		rte_memcpy(internal->capabilities, attach_caps,
+			cap_size * nb_caps);
+		return 0;
+	}
+
+
+	/* find common capabilities between slave-to-be-attached and self */
+	i = 0;
+
+	while (internal->capabilities[i].op != RTE_CRYPTO_OP_TYPE_UNDEFINED) {
+		cap = &internal->capabilities[i];
+		uint32_t j = 0;
+
+		while (attach_caps[j].op != RTE_CRYPTO_OP_TYPE_UNDEFINED) {
+			a_cap = &attach_caps[j];
+
+			if (a_cap->op != cap->op || a_cap->sym.xform_type !=
+				cap->sym.xform_type) {
+				j++;
+				continue;
+			}
+
+			if (a_cap->sym.xform_type == RTE_CRYPTO_SYM_XFORM_AUTH)
+				if (a_cap->sym.auth.algo !=
+					cap->sym.auth.algo) {
+					j++;
+					continue;
+				}
+
+			if (a_cap->sym.xform_type ==
+					RTE_CRYPTO_SYM_XFORM_CIPHER)
+				if (a_cap->sym.cipher.algo !=
+					cap->sym.cipher.algo) {
+					j++;
+					continue;
+				}
+
+			break;
+		}
+
+		if (j >= nb_attached_caps)
+			nb_common_caps--;
+
+		i++;
+	}
+
+	/* no common capabilities, quit */
+	if (nb_common_caps == 0) {
+		CS_LOG_ERR("incompatible capabilities");
+		return -1;
+	}
+
+	/* remove the scheduler capabilities that do not exist in the cdev */
+	i = 0;
+	while (internal->capabilities[i].op != RTE_CRYPTO_OP_TYPE_UNDEFINED) {
+		cap = &internal->capabilities[i];
+		uint32_t j = 0;
+
+		while (attach_caps[j].op != RTE_CRYPTO_OP_TYPE_UNDEFINED) {
+			a_cap = &attach_caps[j];
+
+			if (a_cap->op != cap->op || a_cap->sym.xform_type !=
+				cap->sym.xform_type) {
+				j++;
+				continue;
+			}
+
+			if (a_cap->sym.xform_type ==
+					RTE_CRYPTO_SYM_XFORM_AUTH) {
+				if (a_cap->sym.auth.algo !=
+					cap->sym.auth.algo) {
+					j++;
+					continue;
+				}
+
+				/* update digest size of the scheduler,
+				 * as the AESNI-MB PMD only uses truncated
+				 * digest sizes.
+				 */
+				cap->sym.auth.digest_size.min =
+					a_cap->sym.auth.digest_size.min <
+					cap->sym.auth.digest_size.min ?
+					a_cap->sym.auth.digest_size.min :
+					cap->sym.auth.digest_size.min;
+				cap->sym.auth.digest_size.max =
+					a_cap->sym.auth.digest_size.max <
+					cap->sym.auth.digest_size.max ?
+					a_cap->sym.auth.digest_size.max :
+					cap->sym.auth.digest_size.max;
+
+				break;
+			}
+
+			if (a_cap->sym.xform_type ==
+				RTE_CRYPTO_SYM_XFORM_CIPHER)
+				if (a_cap->sym.cipher.algo !=
+					cap->sym.cipher.algo) {
+					j++;
+					continue;
+				}
+
+			break;
+		}
+
+		if (j == nb_attached_caps) {
+			uint32_t k;
+
+			for (k = i + 1; k < nb_attached_caps; k++)
+				rte_memcpy(&internal->capabilities[k - 1],
+					&internal->capabilities[k], cap_size);
+
+			memset(&internal->capabilities[
+				nb_attached_caps], 0, cap_size);
+
+			nb_attached_caps--;
+		}
+
+		i++;
+	}
+
+	return 0;
+}
+
+/** Attach a device to the scheduler. */
+int
+rte_cryptodev_scheduler_attach_dev(uint8_t dev_id, uint8_t slave_dev_id)
+{
+	struct rte_cryptodev *dev = rte_cryptodev_pmd_get_dev(dev_id);
+	struct rte_cryptodev *slave_dev =
+		rte_cryptodev_pmd_get_dev(slave_dev_id);
+	struct scheduler_private *internal;
+	struct slave_info *slave;
+	struct rte_cryptodev_info dev_info;
+	uint8_t *idx;
+	int status;
+
+	if (dev->dev_type != RTE_CRYPTODEV_SCHEDULER_PMD) {
+		CS_LOG_ERR("Operation not supported");
+		return -ENOTSUP;
+	}
+
+	internal = (struct scheduler_private *)dev->data->dev_private;
+
+	rte_cryptodev_info_get(slave_dev_id, &dev_info);
+
+	if (dev_info.feature_flags & RTE_CRYPTODEV_FF_HW_ACCELERATED) {
+		idx = &internal->nb_slaves[SCHED_HW_CDEV];
+		slave = &internal->slaves[SCHED_HW_CDEV][*idx];
+	} else {
+		idx = &internal->nb_slaves[SCHED_SW_CDEV];
+		slave = &internal->slaves[SCHED_SW_CDEV][*idx];
+	}
+
+	if (*idx + 1 >= MAX_SLAVES_NUM) {
+		CS_LOG_ERR("too many devices attached");
+		return -ENOMEM;
+	}
+
+	if (update_sched_capabilities(internal, dev_info.capabilities) < 0) {
+		CS_LOG_ERR("capabilities update failed");
+		return -ENOTSUP;
+	}
+
+	slave->dev_id = slave_dev_id;
+	status = request_qp(slave_dev_id);
+	if (status < 0)
+		return -EFAULT;
+	slave->qp_id = (uint16_t)status;
+
+	internal->max_nb_sessions = dev_info.sym.max_nb_sessions <
+		internal->max_nb_sessions ?
+		dev_info.sym.max_nb_sessions : internal->max_nb_sessions;
+
+	dev->feature_flags |= slave_dev->feature_flags;
+
+	*idx += 1;
+
+	return update_reorder_buff(dev_id, internal);
+}
+
+
+int
+rte_crpytodev_scheduler_set_mode(uint8_t dev_id,
+	enum crypto_scheduling_mode mode)
+{
+	struct rte_cryptodev *dev = rte_cryptodev_pmd_get_dev(dev_id);
+	struct scheduler_private *internal = dev->data->dev_private;
+
+	if (mode < CRYPTO_SCHED_SW_ROUND_ROBIN_MODE ||
+		mode >= CRYPTO_SCHED_N_MODES)
+		return -1;
+
+	if (mode == CRYPTO_SCHED_SW_ROUND_ROBIN_MODE) {
+		if (internal->nb_slaves[SCHED_SW_CDEV] == 0)
+			return -1;
+		internal->use_dev_type = SCHED_SW_CDEV;
+	}
+
+	if (mode == CRYPTO_SCHED_HW_ROUND_ROBIN_MODE) {
+		if (internal->nb_slaves[SCHED_HW_CDEV] == 0)
+			return -1;
+		internal->use_dev_type = SCHED_HW_CDEV;
+	}
+
+	scheduler_update_rx_tx_ops(dev, mode, internal->use_reorder);
+
+	internal->mode = mode;
+
+	return 0;
+}
+
+void
+rte_crpytodev_scheduler_get_mode(uint8_t dev_id,
+		enum crypto_scheduling_mode *mode)
+{
+	struct rte_cryptodev *dev = rte_cryptodev_pmd_get_dev(dev_id);
+	struct scheduler_private *internal = dev->data->dev_private;
+
+	if (!mode)
+		return;
+
+	*mode = internal->mode;
+}
diff --git a/drivers/crypto/scheduler/rte_cryptodev_scheduler.h b/drivers/crypto/scheduler/rte_cryptodev_scheduler.h
new file mode 100644
index 0000000..5775037
--- /dev/null
+++ b/drivers/crypto/scheduler/rte_cryptodev_scheduler.h
@@ -0,0 +1,90 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CRYPTO_SCHEDULER_H
+#define _RTE_CRYPTO_SCHEDULER_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Crypto scheduler PMD operation modes
+ */
+enum crypto_scheduling_mode {
+	/**< Round robin mode amongst all software slave cdevs */
+	CRYPTO_SCHED_SW_ROUND_ROBIN_MODE = 1,
+	/**< Round robin mode amongst all hardware slave cdevs */
+	CRYPTO_SCHED_HW_ROUND_ROBIN_MODE,
+	CRYPTO_SCHED_N_MODES /* number of modes */
+};
+
+/**
+ * Attach a pre-configured crypto device to the scheduler
+ *
+ * @param	dev_id		The target scheduler device ID
+ * @param	slave_dev_id	The crypto device ID to be attached
+ *
+ * @return
+ *	0 if attaching is successful, negative integer otherwise.
+ */
+int
+rte_cryptodev_scheduler_attach_dev(uint8_t dev_id, uint8_t slave_dev_id);
+
+/**
+ * Set the scheduling mode
+ *
+ * @param	dev_id		The target scheduler device ID
+ * @param	mode		The scheduling mode
+ *
+ * @return
+ *	0 if setting the mode is successful, negative integer otherwise.
+ */
+int
+rte_crpytodev_scheduler_set_mode(uint8_t dev_id,
+		enum crypto_scheduling_mode mode);
+
+/**
+ * Get the current scheduling mode
+ *
+ * @param	dev_id		The target scheduler device ID
+ * @param	mode		Pointer to write the scheduling mode to
+ */
+void
+rte_crpytodev_scheduler_get_mode(uint8_t dev_id,
+		enum crypto_scheduling_mode *mode);
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _RTE_CRYPTO_SCHEDULER_H */
diff --git a/drivers/crypto/scheduler/rte_pmd_crypto_scheduler_version.map b/drivers/crypto/scheduler/rte_pmd_crypto_scheduler_version.map
new file mode 100644
index 0000000..dab1bfe
--- /dev/null
+++ b/drivers/crypto/scheduler/rte_pmd_crypto_scheduler_version.map
@@ -0,0 +1,8 @@ 
+DPDK_17.02 {
+	global:
+
+	rte_cryptodev_scheduler_attach_dev;
+	rte_crpytodev_scheduler_set_mode;
+	rte_crpytodev_scheduler_get_mode;
+
+} DPDK_17.02;
\ No newline at end of file
diff --git a/drivers/crypto/scheduler/scheduler_pmd.c b/drivers/crypto/scheduler/scheduler_pmd.c
new file mode 100644
index 0000000..37a8b64
--- /dev/null
+++ b/drivers/crypto/scheduler/scheduler_pmd.c
@@ -0,0 +1,475 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <rte_common.h>
+#include <rte_hexdump.h>
+#include <rte_cryptodev.h>
+#include <rte_cryptodev_pmd.h>
+#include <rte_vdev.h>
+#include <rte_malloc.h>
+#include <rte_cpuflags.h>
+#include <rte_reorder.h>
+#include <rte_cryptodev_scheduler.h>
+
+#include "scheduler_pmd_private.h"
+
+#define SCHEDULER_MAX_NB_QP_ARG		"max_nb_queue_pairs"
+#define SCHEDULER_MAX_NB_SESS_ARG	"max_nb_sessions"
+#define SCHEDULER_SOCKET_ID		"socket_id"
+#define SCHEDULER_ENABLE_REORDER_ARG	"enable_reorder"
+
+const char *scheduler_vdev_valid_params[] = {
+	SCHEDULER_MAX_NB_QP_ARG,
+	SCHEDULER_MAX_NB_SESS_ARG,
+	SCHEDULER_SOCKET_ID,
+	SCHEDULER_ENABLE_REORDER_ARG,
+};
+
+/** Round robin mode burst enqueue */
+static uint16_t
+scheduler_enqueue_burst_rr(void *queue_pair,
+	struct rte_crypto_op **ops, uint16_t nb_ops)
+{
+	uint16_t i, processed_ops;
+	struct scheduler_qp *qp = (struct scheduler_qp *)queue_pair;
+	struct scheduler_private *internal = qp->dev_priv;
+	struct scheduler_session *sess0, *sess1, *sess2, *sess3;
+	uint8_t dev_type_idx = internal->use_dev_type;
+	uint8_t dev_idx = internal->last_enq_idx[dev_type_idx];
+
+	for (i = 0; i < nb_ops && i < 4; i++)
+		rte_prefetch0(ops[i]->sym->session);
+
+	for (i = 0; i < nb_ops - 8; i += 4) {
+		sess0 = (struct scheduler_session *)
+			ops[i]->sym->session->_private;
+		sess1 = (struct scheduler_session *)
+			ops[i + 1]->sym->session->_private;
+		sess2 = (struct scheduler_session *)
+			ops[i + 2]->sym->session->_private;
+		sess3 = (struct scheduler_session *)
+			ops[i + 3]->sym->session->_private;
+
+		ops[i]->sym->session =
+			sess0->slave_sesses[dev_type_idx][dev_idx];
+		ops[i + 1]->sym->session =
+			sess1->slave_sesses[dev_type_idx][dev_idx];
+		ops[i + 2]->sym->session =
+			sess2->slave_sesses[dev_type_idx][dev_idx];
+		ops[i + 3]->sym->session =
+			sess3->slave_sesses[dev_type_idx][dev_idx];
+
+		rte_prefetch0(ops[i + 4]->sym->session);
+		rte_prefetch0(ops[i + 5]->sym->session);
+		rte_prefetch0(ops[i + 6]->sym->session);
+		rte_prefetch0(ops[i + 7]->sym->session);
+	}
+
+	for (; i < nb_ops; i++) {
+		sess0 = (struct scheduler_session *)
+			ops[i]->sym->session->_private;
+		ops[i]->sym->session =
+			sess0->slave_sesses[dev_type_idx][dev_idx];
+	}
+
+	processed_ops = rte_cryptodev_enqueue_burst(
+		internal->slaves[dev_type_idx][dev_idx].dev_id,
+		internal->slaves[dev_type_idx][dev_idx].qp_id,
+		ops, nb_ops);
+
+	internal->last_enq_idx[dev_type_idx] += 1;
+
+	if (unlikely(internal->last_enq_idx[dev_type_idx] >=
+		internal->nb_slaves[dev_type_idx]))
+		internal->last_enq_idx[dev_type_idx] = 0;
+
+	qp->stats.enqueued_count += processed_ops;
+
+	return processed_ops;
+}
+
+/** Round robin mode burst dequeue without post-reorder */
+static uint16_t
+scheduler_dequeue_burst_rr_no_reorder(void *queue_pair,
+	struct rte_crypto_op **ops, uint16_t nb_ops)
+{
+	uint16_t nb_deq_ops;
+	struct scheduler_qp *qp = (struct scheduler_qp *)queue_pair;
+	struct scheduler_private *internal = qp->dev_priv;
+	uint8_t dev_type_idx = internal->use_dev_type;
+	uint8_t dev_idx = internal->last_deq_idx[dev_type_idx];
+
+	nb_deq_ops = rte_cryptodev_dequeue_burst(
+		internal->slaves[dev_type_idx][dev_idx].dev_id,
+		internal->slaves[dev_type_idx][dev_idx].qp_id,
+		ops, nb_ops);
+
+	internal->last_deq_idx[dev_type_idx] += 1;
+	if (unlikely(internal->last_deq_idx[dev_type_idx] >=
+		internal->nb_slaves[dev_type_idx]))
+		internal->last_deq_idx[dev_type_idx] = 0;
+
+	qp->stats.dequeued_count += nb_deq_ops;
+
+	return nb_deq_ops;
+}
+
+/** Round robin mode burst dequeue with post-reorder */
+static uint16_t
+scheduler_dequeue_burst_rr_reorder(void *queue_pair,
+	struct rte_crypto_op **ops, uint16_t nb_ops)
+{
+	uint16_t i, nb_deq_ops;
+	const uint16_t nb_op_ops = nb_ops;
+	struct scheduler_qp *qp = (struct scheduler_qp *)queue_pair;
+	struct scheduler_private *internal = qp->dev_priv;
+	struct rte_mbuf *reorder_mbufs[nb_op_ops];
+	struct rte_mbuf *mbuf0, *mbuf1, *mbuf2, *mbuf3;
+	struct rte_crypto_op *op_ops[nb_op_ops];
+	struct rte_reorder_buffer *reorder_buff =
+		(struct rte_reorder_buffer *)internal->reorder_buff;
+	uint8_t dev_type_idx = internal->use_dev_type;
+	uint8_t dev_idx = internal->last_deq_idx[dev_type_idx];
+
+	nb_deq_ops = rte_cryptodev_dequeue_burst(
+		internal->slaves[dev_type_idx][dev_idx].dev_id,
+		internal->slaves[dev_type_idx][dev_idx].qp_id,
+		op_ops, nb_ops);
+
+	internal->last_deq_idx[dev_type_idx] += 1;
+	if (unlikely(internal->last_deq_idx[dev_type_idx] >=
+		internal->nb_slaves[dev_type_idx]))
+		internal->last_deq_idx[dev_type_idx] = 0;
+
+	for (i = 0; i < nb_deq_ops && i < 4; i++)
+		rte_prefetch0(op_ops[i]->sym->m_src);
+
+	for (i = 0; i < nb_deq_ops - 8; i += 4) {
+		mbuf0 = op_ops[i]->sym->m_src;
+		mbuf1 = op_ops[i + 1]->sym->m_src;
+		mbuf2 = op_ops[i + 2]->sym->m_src;
+		mbuf3 = op_ops[i + 3]->sym->m_src;
+
+		rte_memcpy(mbuf0->buf_addr, &op_ops[i], sizeof(op_ops[i]));
+		rte_memcpy(mbuf1->buf_addr, &op_ops[i + 1],
+			sizeof(op_ops[i + 1]));
+		rte_memcpy(mbuf2->buf_addr, &op_ops[i + 2],
+			sizeof(op_ops[i + 2]));
+		rte_memcpy(mbuf3->buf_addr, &op_ops[i + 3],
+			sizeof(op_ops[i + 3]));
+
+		mbuf0->seqn = internal->seqn++;
+		mbuf1->seqn = internal->seqn++;
+		mbuf2->seqn = internal->seqn++;
+		mbuf3->seqn = internal->seqn++;
+
+		rte_reorder_insert(reorder_buff, mbuf0);
+		rte_reorder_insert(reorder_buff, mbuf1);
+		rte_reorder_insert(reorder_buff, mbuf2);
+		rte_reorder_insert(reorder_buff, mbuf3);
+
+		rte_prefetch0(op_ops[i + 4]->sym->m_src);
+		rte_prefetch0(op_ops[i + 5]->sym->m_src);
+		rte_prefetch0(op_ops[i + 6]->sym->m_src);
+		rte_prefetch0(op_ops[i + 7]->sym->m_src);
+	}
+
+	for (; i < nb_deq_ops; i++) {
+		mbuf0 = op_ops[i]->sym->m_src;
+
+		rte_memcpy(mbuf0->buf_addr, &op_ops[i], sizeof(op_ops[i]));
+
+		mbuf0->seqn = internal->seqn++;
+
+		rte_reorder_insert(reorder_buff, mbuf0);
+	}
+
+	nb_deq_ops = rte_reorder_drain(reorder_buff, reorder_mbufs,
+		nb_ops);
+
+	for (i = 0; i < nb_deq_ops && i < 4; i++)
+		rte_prefetch0(reorder_mbufs[i]);
+
+	for (i = 0; i < nb_deq_ops - 8; i += 4) {
+		ops[i] = *(struct rte_crypto_op **)
+			reorder_mbufs[i]->buf_addr;
+		ops[i + 1] = *(struct rte_crypto_op **)
+			reorder_mbufs[i + 1]->buf_addr;
+		ops[i + 2] = *(struct rte_crypto_op **)
+			reorder_mbufs[i + 2]->buf_addr;
+		ops[i + 3] = *(struct rte_crypto_op **)
+			reorder_mbufs[i + 3]->buf_addr;
+
+		*(struct rte_crypto_op **)reorder_mbufs[i]->buf_addr = NULL;
+		*(struct rte_crypto_op **)reorder_mbufs[i + 1]->buf_addr = NULL;
+		*(struct rte_crypto_op **)reorder_mbufs[i + 2]->buf_addr = NULL;
+		*(struct rte_crypto_op **)reorder_mbufs[i + 3]->buf_addr = NULL;
+
+		rte_prefetch0(reorder_mbufs[i + 4]);
+		rte_prefetch0(reorder_mbufs[i + 5]);
+		rte_prefetch0(reorder_mbufs[i + 6]);
+		rte_prefetch0(reorder_mbufs[i + 7]);
+	}
+
+	for (; i < nb_deq_ops; i++) {
+		ops[i] = *(struct rte_crypto_op **)
+			reorder_mbufs[i]->buf_addr;
+		*(struct rte_crypto_op **)reorder_mbufs[i]->buf_addr = NULL;
+	}
+
+	qp->stats.dequeued_count += nb_deq_ops;
+
+	return nb_deq_ops;
+}
+
+int
+scheduler_update_rx_tx_ops(struct rte_cryptodev *dev,
+	enum crypto_scheduling_mode mode, uint32_t use_reorder)
+{
+	switch (mode) {
+	case CRYPTO_SCHED_SW_ROUND_ROBIN_MODE:
+	case CRYPTO_SCHED_HW_ROUND_ROBIN_MODE:
+		dev->enqueue_burst = scheduler_enqueue_burst_rr;
+		if (use_reorder)
+			dev->dequeue_burst =
+				scheduler_dequeue_burst_rr_reorder;
+		else
+			dev->dequeue_burst =
+				scheduler_dequeue_burst_rr_no_reorder;
+		break;
+	default:
+		return -1;
+	}
+
+	return 0;
+}
+
+static uint32_t unique_name_id;
+
+static int
+cryptodev_scheduler_create(const char *name,
+	struct rte_crypto_vdev_init_params *init_params,
+	const uint8_t enable_reorder)
+{
+	char crypto_dev_name[RTE_CRYPTODEV_NAME_MAX_LEN];
+	struct scheduler_private *internal;
+	struct rte_cryptodev *dev;
+
+	if (snprintf(crypto_dev_name, RTE_CRYPTODEV_NAME_MAX_LEN, "%s_%u",
+		RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD), unique_name_id++) < 0) {
+		CS_LOG_ERR("driver %s: failed to create unique cryptodev "
+			"name", name);
+		return -EFAULT;
+	}
+
+	dev = rte_cryptodev_pmd_virtual_dev_init(crypto_dev_name,
+			sizeof(struct scheduler_private),
+			init_params->socket_id);
+	if (dev == NULL) {
+		CS_LOG_ERR("driver %s: failed to create cryptodev vdev",
+			name);
+		return -EFAULT;
+	}
+
+	dev->dev_type = RTE_CRYPTODEV_SCHEDULER_PMD;
+	dev->dev_ops = rte_crypto_scheduler_pmd_ops;
+
+	dev->feature_flags = RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO;
+
+	internal = dev->data->dev_private;
+	internal->max_nb_queue_pairs = init_params->max_nb_queue_pairs;
+	internal->max_nb_sessions = UINT32_MAX;
+	internal->use_reorder = enable_reorder;
+
+	/* register rx/tx burst functions for data path
+	 * by default the software round robin mode is adopted
+	 */
+	return scheduler_update_rx_tx_ops(dev, CRYPTO_SCHED_SW_ROUND_ROBIN_MODE,
+		internal->use_reorder);
+}
+
+static int
+cryptodev_scheduler_remove(const char *name)
+{
+	struct rte_cryptodev *dev;
+	struct scheduler_private *internal;
+
+	if (name == NULL)
+		return -EINVAL;
+
+	dev = rte_cryptodev_pmd_get_named_dev(name);
+	if (dev == NULL)
+		return -EINVAL;
+
+	internal = dev->data->dev_private;
+
+	if (internal->reorder_buff)
+		rte_reorder_free(internal->reorder_buff);
+
+	RTE_LOG(INFO, PMD, "Closing Crypto Scheduler device %s on numa "
+		"socket %u\n", name, rte_socket_id());
+
+	return 0;
+}
+
+/** Parse integer from integer argument */
+static int
+parse_integer_arg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	int *i = (int *) extra_args;
+
+	*i = atoi(value);
+	if (*i < 0) {
+		CDEV_LOG_ERR("Argument has to be non-negative.");
+		return -1;
+	}
+
+	return 0;
+}
+
+/* Parse reorder enable/disable argument */
+static int
+scheduler_parse_enable_reorder_kvarg(const char *key __rte_unused,
+		const char *value, void *extra_args)
+{
+	if (value == NULL || extra_args == NULL)
+		return -1;
+
+	if (strcmp(value, "yes") == 0)
+		*(uint8_t *)extra_args = 1;
+	else if (strcmp(value, "no") == 0)
+		*(uint8_t *)extra_args = 0;
+	else
+		return -1;
+
+	return 0;
+}
+
+static uint8_t
+number_of_sockets(void)
+{
+	int sockets = 0;
+	int i;
+	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+
+	for (i = 0; ((i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL)); i++) {
+		if (sockets < ms[i].socket_id)
+			sockets = ms[i].socket_id;
+	}
+
+	/* Number of sockets = maximum socket_id + 1 */
+	return ++sockets;
+}
+
+static int
+scheduler_parse_init_params(struct rte_crypto_vdev_init_params *params,
+	uint8_t *enable_reorder, const char *input_args)
+{
+	struct rte_kvargs *kvlist = NULL;
+	int ret = 0;
+
+	if (params == NULL)
+		return -EINVAL;
+
+	if (!input_args)
+		return 0;
+
+	kvlist = rte_kvargs_parse(input_args,
+			scheduler_vdev_valid_params);
+	if (kvlist == NULL)
+		return -1;
+
+	ret = rte_kvargs_process(kvlist, SCHEDULER_MAX_NB_QP_ARG,
+		&parse_integer_arg, &params->max_nb_queue_pairs);
+	if (ret < 0)
+		goto free_kvlist;
+
+	ret = rte_kvargs_process(kvlist, SCHEDULER_MAX_NB_SESS_ARG,
+		&parse_integer_arg, &params->max_nb_sessions);
+	if (ret < 0)
+		goto free_kvlist;
+
+	ret = rte_kvargs_process(kvlist, SCHEDULER_SOCKET_ID,
+		&parse_integer_arg, &params->socket_id);
+	if (ret < 0)
+		goto free_kvlist;
+
+	if (params->socket_id >= number_of_sockets()) {
+		CDEV_LOG_ERR("Invalid socket id specified to create "
+			"the virtual crypto device on");
+		goto free_kvlist;
+	}
+
+	ret = rte_kvargs_process(kvlist, SCHEDULER_ENABLE_REORDER_ARG,
+		&scheduler_parse_enable_reorder_kvarg, enable_reorder);
+	if (ret < 0)
+		goto free_kvlist;
+
+free_kvlist:
+	rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+cryptodev_scheduler_probe(const char *name, const char *input_args)
+{
+	struct rte_crypto_vdev_init_params init_params = {
+		RTE_CRYPTODEV_VDEV_DEFAULT_MAX_NB_QUEUE_PAIRS,
+		RTE_CRYPTODEV_VDEV_DEFAULT_MAX_NB_SESSIONS,
+		rte_socket_id()
+	};
+	uint8_t enable_reorder = 0;
+
+	scheduler_parse_init_params(&init_params, &enable_reorder, input_args);
+
+	RTE_LOG(INFO, PMD, "Initialising %s on NUMA node %d\n", name,
+			init_params.socket_id);
+	RTE_LOG(INFO, PMD, "  Max number of queue pairs = %d\n",
+			init_params.max_nb_queue_pairs);
+	RTE_LOG(INFO, PMD, "  Max number of sessions = %d\n",
+			init_params.max_nb_sessions);
+
+	return cryptodev_scheduler_create(name, &init_params, enable_reorder);
+}
+
+static struct rte_vdev_driver cryptodev_scheduler_pmd_drv = {
+	.probe = cryptodev_scheduler_probe,
+	.remove = cryptodev_scheduler_remove
+};
+
+RTE_PMD_REGISTER_VDEV(CRYPTODEV_NAME_SCHEDULER_PMD,
+	cryptodev_scheduler_pmd_drv);
+RTE_PMD_REGISTER_PARAM_STRING(CRYPTODEV_NAME_SCHEDULER_PMD,
+	"max_nb_queue_pairs=<int> "
+	"max_nb_sessions=<int> "
+	"socket_id=<int> "
+	"enable_reorder=yes/no");
diff --git a/drivers/crypto/scheduler/scheduler_pmd_ops.c b/drivers/crypto/scheduler/scheduler_pmd_ops.c
new file mode 100644
index 0000000..a98a127
--- /dev/null
+++ b/drivers/crypto/scheduler/scheduler_pmd_ops.c
@@ -0,0 +1,335 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <string.h>
+
+#include <rte_config.h>
+#include <rte_common.h>
+#include <rte_malloc.h>
+#include <rte_cryptodev.h>
+#include <rte_cryptodev_pmd.h>
+#include <rte_reorder.h>
+
+#include "../scheduler/scheduler_pmd_private.h"
+
+/** Configure device */
+static int
+scheduler_pmd_config(__rte_unused struct rte_cryptodev *dev)
+{
+	return 0;
+}
+
+/** Start device */
+static int
+scheduler_pmd_start(struct rte_cryptodev *dev)
+{
+	struct scheduler_private *internal = dev->data->dev_private;
+	uint32_t i, j;
+
+	/* TODO: this may cause one dev to be started multiple times.
+	 * So far all devs' start functions only return 0, so it doesn't
+	 * matter yet. However, whenever a new dev driver is added that
+	 * doesn't allow its start function to be called more than once,
+	 * this needs to be updated.
+	 */
+	for (i = 0; i < 2; i++)
+		for (j = 0; j < internal->nb_slaves[i]; j++) {
+			int status = rte_cryptodev_start(
+				internal->slaves[i][j].dev_id);
+			if (status < 0) {
+				CS_LOG_ERR("cannot start device %u",
+					internal->slaves[i][j].dev_id);
+				return status;
+			}
+		}
+
+	return 0;
+}
+
+/** Stop device */
+static void
+scheduler_pmd_stop(struct rte_cryptodev *dev)
+{
+	struct scheduler_private *internal = dev->data->dev_private;
+	uint32_t i, j;
+
+	for (i = 0; i < 2; i++)
+		for (j = 0; j < internal->nb_slaves[i]; j++)
+			rte_cryptodev_stop(internal->slaves[i][j].dev_id);
+}
+
+/** Close device */
+static int
+scheduler_pmd_close(struct rte_cryptodev *dev)
+{
+	struct scheduler_private *internal = dev->data->dev_private;
+	uint32_t i, j;
+
+	for (i = 0; i < 2; i++)
+		for (j = 0; j < internal->nb_slaves[i]; j++) {
+			int status = rte_cryptodev_close(
+				internal->slaves[i][j].dev_id);
+			if (status < 0) {
+				CS_LOG_ERR("cannot close device %u",
+					internal->slaves[i][j].dev_id);
+				return status;
+			}
+		}
+
+	return 0;
+}
+
+/** Get device statistics */
+static void
+scheduler_pmd_stats_get(struct rte_cryptodev *dev,
+	struct rte_cryptodev_stats *stats)
+{
+	int qp_id;
+
+	for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
+		struct scheduler_qp *qp = dev->data->queue_pairs[qp_id];
+
+		stats->enqueued_count += qp->stats.enqueued_count;
+		stats->dequeued_count += qp->stats.dequeued_count;
+
+		stats->enqueue_err_count += qp->stats.enqueue_err_count;
+		stats->dequeue_err_count += qp->stats.dequeue_err_count;
+	}
+}
+
+/** Reset device statistics */
+static void
+scheduler_pmd_stats_reset(struct rte_cryptodev *dev)
+{
+	int qp_id;
+
+	for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
+		struct scheduler_qp *qp = dev->data->queue_pairs[qp_id];
+
+		memset(&qp->stats, 0, sizeof(qp->stats));
+	}
+}
+
+/** Get device info */
+static void
+scheduler_pmd_info_get(struct rte_cryptodev *dev,
+		struct rte_cryptodev_info *dev_info)
+{
+	struct scheduler_private *internal = dev->data->dev_private;
+
+	if (dev_info != NULL) {
+		dev_info->dev_type = dev->dev_type;
+		dev_info->feature_flags = dev->feature_flags;
+		dev_info->capabilities = internal->capabilities;
+		dev_info->max_nb_queue_pairs = internal->max_nb_queue_pairs;
+		dev_info->sym.max_nb_sessions = internal->max_nb_sessions;
+	}
+}
+
+/** Release queue pair */
+static int
+scheduler_pmd_qp_release(struct rte_cryptodev *dev, uint16_t qp_id)
+{
+	if (dev->data->queue_pairs[qp_id] != NULL) {
+		rte_free(dev->data->queue_pairs[qp_id]);
+		dev->data->queue_pairs[qp_id] = NULL;
+	}
+	return 0;
+}
+
+/** Setup a queue pair */
+static int
+scheduler_pmd_qp_setup(struct rte_cryptodev *dev, uint16_t qp_id,
+	__rte_unused const struct rte_cryptodev_qp_conf *qp_conf, int socket_id)
+{
+	struct scheduler_qp *qp = NULL;
+
+	/* Free memory prior to re-allocation if needed. */
+	if (dev->data->queue_pairs[qp_id] != NULL)
+		scheduler_pmd_qp_release(dev, qp_id);
+
+	/* Allocate the queue pair data structure. */
+	qp = rte_zmalloc_socket("CRYPTO-SCHEDULER PMD Queue Pair",
+		sizeof(*qp), RTE_CACHE_LINE_SIZE, socket_id);
+	if (qp == NULL)
+		return -ENOMEM;
+
+	qp->id = qp_id;
+	dev->data->queue_pairs[qp_id] = qp;
+	memset(&qp->stats, 0, sizeof(qp->stats));
+
+	if (snprintf(qp->name, sizeof(qp->name),
+		"scheduler_pmd_%u_qp_%u", dev->data->dev_id,
+		qp->id) > (int)sizeof(qp->name)) {
+		CS_LOG_ERR("unable to create unique name for queue pair");
+		rte_free(qp);
+		return -EFAULT;
+	}
+
+	qp->dev_priv = dev->data->dev_private;
+
+	return 0;
+}
+
+/** Start queue pair */
+static int
+scheduler_pmd_qp_start(__rte_unused struct rte_cryptodev *dev,
+		__rte_unused uint16_t queue_pair_id)
+{
+	return -ENOTSUP;
+}
+
+/** Stop queue pair */
+static int
+scheduler_pmd_qp_stop(__rte_unused struct rte_cryptodev *dev,
+		__rte_unused uint16_t queue_pair_id)
+{
+	return -ENOTSUP;
+}
+
+/** Return the number of allocated queue pairs */
+static uint32_t
+scheduler_pmd_qp_count(struct rte_cryptodev *dev)
+{
+	return dev->data->nb_queue_pairs;
+}
+
+static unsigned
+scheduler_pmd_session_get_size(struct rte_cryptodev *dev __rte_unused)
+{
+	return sizeof(struct scheduler_session);
+}
+
+static int
+config_slave_sessions(struct scheduler_private *internal,
+	struct rte_crypto_sym_xform *xform,
+	struct scheduler_session *sess,
+	uint32_t create)
+{
+
+	uint32_t i, j;
+
+	for (i = 0; i < 2; i++) {
+		for (j = 0; j < internal->nb_slaves[i]; j++) {
+			uint8_t dev_id = internal->slaves[i][j].dev_id;
+			struct rte_cryptodev *dev = &rte_cryptodev_globals->
+				devs[dev_id];
+
+			/* clear */
+			if (!create) {
+				if (!sess->slave_sesses[i][j])
+					continue;
+
+				dev->dev_ops->session_clear(dev,
+					(void *)sess->slave_sesses[i][j]);
+				sess->slave_sesses[i][j] = NULL;
+
+				continue;
+			}
+
+			/* configure */
+			if (sess->slave_sesses[i][j] == NULL)
+				sess->slave_sesses[i][j] =
+					rte_cryptodev_sym_session_create(
+						dev_id, xform);
+			else
+				sess->slave_sesses[i][j] =
+					dev->dev_ops->session_configure(dev,
+						xform,
+						sess->slave_sesses[i][j]);
+
+			if (!sess->slave_sesses[i][j]) {
+				CS_LOG_ERR("unable to configure sym session");
+				config_slave_sessions(internal, NULL, sess, 0);
+				return -1;
+			}
+		}
+
+		for (j = internal->nb_slaves[i]; j < MAX_SLAVES_NUM; j++)
+			sess->slave_sesses[i][j] = NULL;
+	}
+
+	return 0;
+}
+
+/** Clear the memory of session so it doesn't leave key material behind */
+static void
+scheduler_pmd_session_clear(struct rte_cryptodev *dev,
+	void *sess)
+{
+	struct scheduler_private *internal = dev->data->dev_private;
+
+	config_slave_sessions(internal, NULL, sess, 0);
+
+	memset(sess, 0, sizeof(struct scheduler_session));
+}
+
+
+
+static void *
+scheduler_pmd_session_configure(struct rte_cryptodev *dev,
+	struct rte_crypto_sym_xform *xform, void *sess)
+{
+	struct scheduler_private *internal = dev->data->dev_private;
+
+	if (config_slave_sessions(internal, xform, sess, 1) < 0) {
+		CS_LOG_ERR("unable to configure sym session");
+		scheduler_pmd_session_clear(dev, sess);
+		return NULL;
+	}
+
+	return sess;
+}
+
+
+struct rte_cryptodev_ops scheduler_pmd_ops = {
+		.dev_configure		= scheduler_pmd_config,
+		.dev_start		= scheduler_pmd_start,
+		.dev_stop		= scheduler_pmd_stop,
+		.dev_close		= scheduler_pmd_close,
+
+		.stats_get		= scheduler_pmd_stats_get,
+		.stats_reset		= scheduler_pmd_stats_reset,
+
+		.dev_infos_get		= scheduler_pmd_info_get,
+
+		.queue_pair_setup	= scheduler_pmd_qp_setup,
+		.queue_pair_release	= scheduler_pmd_qp_release,
+		.queue_pair_start	= scheduler_pmd_qp_start,
+		.queue_pair_stop	= scheduler_pmd_qp_stop,
+		.queue_pair_count	= scheduler_pmd_qp_count,
+
+		.session_get_size	= scheduler_pmd_session_get_size,
+		.session_configure	= scheduler_pmd_session_configure,
+		.session_clear		= scheduler_pmd_session_clear,
+};
+
+struct rte_cryptodev_ops *rte_crypto_scheduler_pmd_ops = &scheduler_pmd_ops;
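
Session handling in the ops above follows the same transparent model: one session created against the scheduler's dev_id is fanned out by session_configure into a slave session per attached device, and the enqueue path later substitutes the matching slave session into each op. A hypothetical application-side view, using the cryptodev session API as it exists in this patch series (key data and algorithm choice are placeholders):

#include <string.h>
#include <rte_crypto_sym.h>
#include <rte_cryptodev.h>

/* Hypothetical: create a single AES-CBC cipher session on the scheduler;
 * the PMD internally configures a session on every attached slave.
 */
static struct rte_cryptodev_sym_session *
create_sched_session(uint8_t sched_id, uint8_t *key, uint16_t key_len)
{
	struct rte_crypto_sym_xform cipher_xform;

	memset(&cipher_xform, 0, sizeof(cipher_xform));
	cipher_xform.type = RTE_CRYPTO_SYM_XFORM_CIPHER;
	cipher_xform.cipher.op = RTE_CRYPTO_CIPHER_OP_ENCRYPT;
	cipher_xform.cipher.algo = RTE_CRYPTO_CIPHER_AES_CBC;
	cipher_xform.cipher.key.data = key;
	cipher_xform.cipher.key.length = key_len;

	/* same call an application would make on any other cryptodev */
	return rte_cryptodev_sym_session_create(sched_id, &cipher_xform);
}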
diff --git a/drivers/crypto/scheduler/scheduler_pmd_private.h b/drivers/crypto/scheduler/scheduler_pmd_private.h
new file mode 100644
index 0000000..db605b8
--- /dev/null
+++ b/drivers/crypto/scheduler/scheduler_pmd_private.h
@@ -0,0 +1,137 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SCHEDULER_PMD_PRIVATE_H
+#define _SCHEDULER_PMD_PRIVATE_H
+
+#include <rte_hash.h>
+#include <rte_cryptodev_scheduler.h>
+
+/**< Maximum number of bonded devices per device */
+#ifndef MAX_SLAVES_NUM
+#define MAX_SLAVES_NUM				(8)
+#endif
+
+/**< Maximum number of bonded capabilities */
+#ifndef MAX_CAP_NUM
+#define MAX_CAP_NUM				(32)
+#endif
+
+/**< Maximum Crypto OP burst number */
+#ifndef MAX_OP_BURST_NUM
+#define	MAX_OP_BURST_NUM			(32)
+#endif
+
+#define PER_SLAVE_BUFF_SIZE			(256)
+
+#define CS_LOG_ERR(fmt, args...)					\
+	RTE_LOG(ERR, CRYPTODEV, "[%s] %s() line %u: " fmt "\n",		\
+		RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),			\
+		__func__, __LINE__, ## args)
+
+#ifdef RTE_LIBRTE_CRYPTO_SCHEDULER_DEBUG
+#define CS_LOG_INFO(fmt, args...)					\
+	RTE_LOG(INFO, CRYPTODEV, "[%s] %s() line %u: " fmt "\n",	\
+		RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),			\
+		__func__, __LINE__, ## args)
+
+#define CS_LOG_DBG(fmt, args...)					\
+	RTE_LOG(DEBUG, CRYPTODEV, "[%s] %s() line %u: " fmt "\n",	\
+		RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),			\
+		__func__, __LINE__, ## args)
+#else
+#define CS_LOG_INFO(fmt, args...)
+#define CS_LOG_DBG(fmt, args...)
+#endif
+
+/* global hash table storing occupied cdev/qp info */
+struct rte_hash *dev_qp_map;
+
+struct slave_info {
+	uint8_t dev_id;
+	uint16_t qp_id;
+};
+
+#define SCHED_SW_CDEV	0
+#define SCHED_HW_CDEV	1
+
+/* function pointer for different modes' enqueue/dequeue ops */
+typedef uint16_t (*sched_enq_deq_t)(void *queue_pair,
+	struct rte_crypto_op **ops, uint16_t nb_ops);
+
+struct scheduler_private {
+	struct slave_info slaves[2][MAX_SLAVES_NUM];
+	uint8_t nb_slaves[2];
+	uint8_t last_enq_idx[2];
+	uint8_t last_deq_idx[2];
+
+	void *reorder_buff;
+
+	sched_enq_deq_t enqueue;
+	sched_enq_deq_t dequeue;
+
+	enum crypto_scheduling_mode mode;
+
+	uint32_t seqn;
+	uint8_t use_dev_type;
+
+	uint8_t use_reorder;
+
+	struct rte_cryptodev_capabilities
+		capabilities[MAX_CAP_NUM];
+	uint32_t max_nb_queue_pairs;
+	uint32_t max_nb_sessions;
+} __rte_cache_aligned;
+
+struct scheduler_qp {
+	uint16_t id;
+	/**< Queue Pair Identifier */
+	char name[RTE_CRYPTODEV_NAME_LEN];
+	/**< Unique Queue Pair Name */
+	struct rte_cryptodev_stats stats;
+	/**< Queue pair statistics */
+	struct scheduler_private *dev_priv;
+} __rte_cache_aligned;
+
+struct scheduler_session {
+	struct rte_cryptodev_sym_session *slave_sesses[2][MAX_SLAVES_NUM];
+};
+
+/** device specific operations function pointer structure */
+extern struct rte_cryptodev_ops *rte_crypto_scheduler_pmd_ops;
+
+int
+scheduler_update_rx_tx_ops(struct rte_cryptodev *dev,
+	enum crypto_scheduling_mode mode, uint32_t use_reorder);
+
+#endif /* _SCHEDULER_PMD_PRIVATE_H */
diff --git a/lib/librte_cryptodev/rte_cryptodev.h b/lib/librte_cryptodev/rte_cryptodev.h
index 8f63e8f..3aa70af 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -66,6 +66,7 @@  extern "C" {
 /**< KASUMI PMD device name */
 #define CRYPTODEV_NAME_ZUC_PMD		crypto_zuc
 /**< KASUMI PMD device name */
+#define CRYPTODEV_NAME_SCHEDULER_PMD	crypto_scheduler
 
 /** Crypto device type */
 enum rte_cryptodev_type {
@@ -77,6 +78,7 @@  enum rte_cryptodev_type {
 	RTE_CRYPTODEV_KASUMI_PMD,	/**< KASUMI PMD */
 	RTE_CRYPTODEV_ZUC_PMD,		/**< ZUC PMD */
 	RTE_CRYPTODEV_OPENSSL_PMD,    /**<  OpenSSL PMD */
+	RTE_CRYPTODEV_SCHEDULER_PMD,	/**< Crypto Scheduler PMD */
 };
 
 extern const char **rte_cyptodev_names;
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index f75f0e2..ee34688 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -70,7 +70,6 @@  _LDLIBS-$(CONFIG_RTE_LIBRTE_PORT)           += -lrte_port
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)          += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)    += -lrte_distributor
-_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)        += -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)          += -lrte_sched
@@ -98,6 +97,7 @@  _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
 _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CFGFILE)        += -lrte_cfgfile
+_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BOND)       += -lrte_pmd_bond
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)    += -lrte_pmd_xenvirt -lxenstore
@@ -145,6 +145,7 @@  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KASUMI)      += -lrte_pmd_kasumi
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_KASUMI)      += -L$(LIBSSO_KASUMI_PATH)/build -lsso_kasumi
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_ZUC)         += -lrte_pmd_zuc
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_ZUC)         += -L$(LIBSSO_ZUC_PATH)/build -lsso_zuc
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER)  += -lrte_pmd_crypto_scheduler
 endif # CONFIG_RTE_LIBRTE_CRYPTODEV
 
 endif # !CONFIG_RTE_BUILD_SHARED_LIBS