[dpdk-dev] [RFC] Accelerator API to chain packet processing functions

Jerin Jacob jerinjacobk at gmail.com
Sat Feb 8 08:22:56 CET 2020


On Sat, Feb 8, 2020 at 2:04 AM Stephen Hemminger
<stephen at networkplumber.org> wrote:
>
> On Fri, 7 Feb 2020 19:48:17 +0530
> Jerin Jacob <jerinjacobk at gmail.com> wrote:
>
> > On Fri, Feb 7, 2020 at 6:08 PM Coyle, David <david.coyle at intel.com> wrote:
> > >
> > > Hi Jerin, see below
> >
> > Hi David,
> >
> > > >
> > > > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David <david.coyle at intel.com>
> > > > wrote:
> > > >
> >
> > > >
> > > > There is a risk in drafting an API meant for HW when no such HW exists,
> > > > because there could be inefficiency in the metadata and fast path API for
> > > > both models.
> > > > For example, in the case of a CPU-based scheme, it will be pure overhead to
> > > > emulate the "queue" (the enqueue and dequeue) for the sake of abstraction
> > > > where CPU works better in the synchronous model and I have doubt that the
> > > > session-based scheme will work for HW or not as both different HW need
> > > > to work hand in hand (IOMMU aspects for two PCI devices)
> > >
> > > [DC] I understand what you are saying about the overhead of emulating the "sw queue" but this same model is already used in many of the existing device PMDs.
> > > In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for compression, the enqueue/dequeue in the PMD is emulated through an rte_ring which is very efficient.
> > > The accelerator API will use the existing device PMDs so keeping the same model seems like a sensible approach.
> >
> > In this release, we added CPU crypto support in cryptodev to support
> > the synchronous model to fix the overhead.
> >
> > >
> > > From an application's point of view, this abstraction of the underlying device type is important for usability and maintainability -  the application doesn't need to know
> > > the device type as such and therefore doesn't need to make different API calls.
> > >
> > > The enqueue/dequeue type API was also used with QAT in mind. While QAT HW doesn't support these xform chains at the moment, it could potentially do so in the future.
> > > As a side note, as part of the work of adding the accelerator API, the QAT PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, where the Crypto
> > > is done on QAT HW and the CRC will be done in SW, most likely through a call to the optimized rte_net_crc library. This will give a consistent API for the DOCSIS-MAC data-plane
> > > pipeline prototype we have developed, which uses both AESNI-MB and QAT for benchmarks.
> > >
> > > We will take your feedback on the enqueue/dequeue approach for SW devices into consideration though during development.
> > >
> > > Finally, I'm unsure what you mean by this line:
> > >
> > >         "I have doubt that the session-based scheme will work for HW or not as both different HW need to work hand in hand (IOMMU aspects for two PCI devices)"
> > >
> > > What do you mean by different HW working "hand in hand" and "two PCI device"?
> > > The intention is that 1 HW device (or its PMD) would have to support the accel xform chain
> >
> > I was thinking, it will be N PCIe devices that create the chain. Each
> > distinct PCI device does the fixed-function and chains them together.
> >
> > I do understand the usage of QAT HW and CRC in SW.
> > So if I understand it correctly, in rte_security, we are combining
> > rte_ethdev and rte_cryptodev. With this spec, we are trying to
> > combine rte_cryptodev and rte_compressdev. So it looks good to me. My only
> > remaining concern is the name of this API; "accelerator" is too generic a
> > name. IMO, like rte_security, we may need to give a more meaningful name
> > for the use case where cryptodev and compressdev can work together.
>
> Having an API that could be used by parallel hardware does make sense,
> but the DPDK already has multiple packet processing infrastructure pieces.
>
> I would rather the DPDK converge on one widely used, robust and tested packet
> processing method, rather than the current "choose your poison or roll your own"
> that we have now. The proposed graph seems to be the best so far.

I agree. I had thought of saying that graph can do this, as it has a higher
abstraction and runtime chaining support, but then I thought it would
be self-marketing.
David, could you check https://www.mail-archive.com/dev@dpdk.org/msg156318.html
If this one focuses only on cryptodev + compressdev, what if we have
ethdev + compressdev + security device in the future?
graph has a higher abstraction, so it can accommodate ANY chaining
requirements, i.e. AESNI-MB + QAT will go as a separate node.
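
To make the chaining point concrete, here is a rough sketch of processing
functions expressed as nodes that are linked at runtime. This is NOT the graph
API from the proposal linked above; every name in it (struct pkt, struct node,
chain_link, chain_run, xor_cipher_node, checksum_node) is hypothetical, and the
XOR "cipher" and additive checksum are only stand-ins for real crypto and CRC
work.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet: a small payload, its length, and a checksum slot. */
struct pkt {
    uint8_t data[64];
    size_t  len;
    uint8_t csum;
};

/* A node processes a burst in place and returns how many packets it
 * hands on to the next node. */
typedef size_t (*node_process_t)(struct pkt **pkts, size_t n);

struct node {
    const char     *name;
    node_process_t  process;
    struct node    *next;   /* set at runtime, not fixed by the API */
};

/* Stand-in for a crypto node: XOR every payload byte with a fixed key. */
static size_t xor_cipher_node(struct pkt **pkts, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < pkts[i]->len; j++)
            pkts[i]->data[j] ^= 0x5A;
    return n;
}

/* Stand-in for a CRC node: 8-bit additive checksum over the payload. */
static size_t checksum_node(struct pkt **pkts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t sum = 0;
        for (size_t j = 0; j < pkts[i]->len; j++)
            sum = (uint8_t)(sum + pkts[i]->data[j]);
        pkts[i]->csum = sum;
    }
    return n;
}

/* Link two nodes at runtime. */
static void chain_link(struct node *from, struct node *to)
{
    from->next = to;
}

/* Walk the chain synchronously; stop early if a node hands on nothing. */
static size_t chain_run(struct node *head, struct pkt **pkts, size_t n)
{
    for (struct node *nd = head; nd != NULL && n > 0; nd = nd->next)
        n = nd->process(pkts, n);
    return n;
}
```

With this shape, a different chain (say an ethdev-like node feeding a
compression node) only needs another node definition and another chain_link()
call. Nothing in the walker knows which device types are involved.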
