[dpdk-dev] [PATCH] gpudev: introduce memory API

Wang, Haiyue haiyue.wang at intel.com
Mon Jun 7 09:20:06 CEST 2021


> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> Sent: Sunday, June 6, 2021 09:14
> To: Jerin Jacob <jerinjacobk at gmail.com>; Wang, Haiyue <haiyue.wang at intel.com>
> Cc: thomas at monjalon.net; Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>; Yigit, Ferruh
> <ferruh.yigit at intel.com>; dpdk-dev <dev at dpdk.org>; Elena Agostini <eagostini at nvidia.com>; David
> Marchand <david.marchand at redhat.com>; nd <nd at arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; nd <nd at arm.com>
> Subject: RE: [dpdk-dev] [PATCH] gpudev: introduce memory API
> 
> <snip>
> 
> > > >
> > > > 04/06/2021 17:20, Jerin Jacob:
> > > > > On Fri, Jun 4, 2021 at 7:39 PM Thomas Monjalon
> > <thomas at monjalon.net> wrote:
> > > > > > 04/06/2021 15:59, Andrew Rybchenko:
> > > > > > > On 6/4/21 4:18 PM, Thomas Monjalon wrote:
> > > > > > > > 04/06/2021 15:05, Andrew Rybchenko:
> > > > > > > >> On 6/4/21 3:46 PM, Thomas Monjalon wrote:
> > > > > > > >>> 04/06/2021 13:09, Jerin Jacob:
> > > > > > > >>>> On Fri, Jun 4, 2021 at 3:58 PM Thomas Monjalon
> > <thomas at monjalon.net> wrote:
> > > > > > > >>>>> 03/06/2021 11:33, Ferruh Yigit:
> > > > > > > >>>>>> On 6/3/2021 8:47 AM, Jerin Jacob wrote:
> > > > > > > >>>>>>> On Thu, Jun 3, 2021 at 2:05 AM Thomas Monjalon
> > <thomas at monjalon.net> wrote:
> > > > > > > >>>>>>>> +  [gpudev]             (@ref rte_gpudev.h),
> > > > > > > >>>>>>>
> > > > > > > > >>>>>>> Since this device does not have a queue etc., shouldn't we
> > > > > > > > >>>>>>> make it a library like mempool with vendor-defined ops?
> > > > > > > >>>>>>
> > > > > > > >>>>>> +1
> > > > > > > >>>>>>
> > > > > > > >>>>>> Current RFC announces additional memory allocation
> > > > > > > > >>>>>> capabilities, which may suit better as an extension to an
> > > > > > > > >>>>>> existing memory-related library instead of a new device
> > abstraction library.
> > > > > > > >>>>>
> > > > > > > >>>>> It is not replacing mempool.
> > > > > > > >>>>> It is more at the same level as EAL memory management:
> > > > > > > >>>>> allocate simple buffer, but with the exception it is
> > > > > > > >>>>> done on a specific device, so it requires a device ID.
> > > > > > > >>>>>
> > > > > > > >>>>> The other reason it needs to be a full library is that
> > > > > > > >>>>> it will start a workload on the GPU and get completion
> > > > > > > >>>>> notification so we can integrate the GPU workload in a packet
> > processing pipeline.
> > > > > > > >>>>
> > > > > > > > >>>> I might have confused you. My intention is not to make it fit
> > under the mempool API.
> > > > > > > >>>>
> > > > > > > > >>>> I agree that we need a separate library for this. My
> > > > > > > > >>>> only request is to not call it libgpudev but libgpu,
> > > > > > > > >>>> and to have APIs with rte_gpu_ instead of rte_gpu_dev,
> > > > > > > > >>>> as it is not like the existing "device libraries" in
> > > > > > > > >>>> DPDK; it is like the other "libraries" in DPDK.
> > > > > > > >>>
> > > > > > > >>> I think we should define a queue of processing actions, so
> > > > > > > >>> it looks like other device libraries.
> > > > > > > >>> And anyway I think a library managing a device class, and
> > > > > > > >>> having some device drivers deserves the name of device library.
> > > > > > > >>>
> > > > > > > >>> I would like to read more opinions.
> > > > > > > >>
> > > > > > > > >> Since the library is a unified interface to GPU device
> > > > > > > > >> drivers I think it should be named as in the patch - gpudev.
> > > > > > > > >>
> > > > > > > > >> Mempool looks like an exception here - initially it was a
> > > > > > > > >> pure SW library, but now there are HW backends and
> > > > > > > > >> corresponding device drivers.
> > > > > > > > >>
> > > > > > > > >> What I don't understand is where the GPU specifics are here.
> > > > > > > >
> > > > > > > > That's an interesting question.
> > > > > > > > Let's ask first what is a GPU for DPDK?
> > > > > > > > I think it is like a sub-CPU with high parallel execution
> > > > > > > > capabilities, and it is controlled by the CPU.
> > > > > > >
> > > > > > > > I have no good ideas how to name it in accordance with the above
> > > > > > > > description to avoid "G", which stands for "Graphics" if I
> > > > > > > > understand correctly. However, maybe it is not required.
> > > > > > > No strong opinion on the topic, but unbinding from "Graphics"
> > > > > > > would be nice.
> > > > > >
> > > > > > > That's a question I have been asking myself for months now.
> > > > > > > I am not able to find a better name, and I am starting to think that
> > > > > > "GPU" is famous enough in high-load computing to convey the idea
> > > > > > of what we can expect.
> > > > >
> > > > >
> > > > > The closest I can think of is big-little architecture in ARM SoC.
> > > > > https://www.arm.com/why-arm/technologies/big-little
> From the application pov, big-little arch is nothing but SMT. Not sure how it is similar to another
> device on PCIe.
> 
> > > > >
> > > > > > We do have a similar architecture, where the "coprocessor" is part
> > > > > > of the main CPU.
> > > > > > Its operations are:
> > > > > - Download firmware
> > > > > - Memory mapping for Main CPU memory by the co-processor
> > > > > - Enq/Deq Jobs from/to Main CPU/Coprocessor CPU.
> > > >
> > > > Yes it looks like the exact same scope.
> > > > I like the word "co-processor" in this context.
> > > >
> > > > > If your scope is something similar and No Graphics involved here
> > > > > then we can remove G.
> > > >
> > > > Indeed no graphics in DPDK :)
> > > > By removing the G, you mean keeping only PU? like "pudev"?
> > > > We could also define the G as "General".
> > > >
> > > > > Coincidentally, yesterday I had an interaction with Elena about the
> > > > > same for baseband-related work in ORAN, where a GPU is used for
> > > > > baseband processing instead of graphics. (So I can understand the big
> > > > > picture of this library)
> This patch does not provide the big picture view of what the processing looks like using GPU. It would
> be good to explain that.
> For ex:
> 1) Will the notion of GPU be hidden from the application? i.e. is the application allowed to launch
> kernels?
> 	1a) Will DPDK provide abstract APIs to launch kernels?
>      This would require us to have the notion of GPU in DPDK and the application would depend on the
> availability of GPU in the system.
> 2) Is launching kernels hidden? i.e. the application still calls DPDK abstract APIs (such as
> encryption/decryption APIs) without knowing that the encryption/decryption is happening on GPU.
>      This does not require us to have a notion of GPU in DPDK at the API level
> 
> If we keep CXL in mind, I would imagine that in the future the devices on PCIe could have their own
> local memory. Maybe some of the APIs could use generic names. For ex: instead of calling it
> "rte_gpu_malloc" maybe we could call it "rte_dev_malloc". This way any future device which hosts
> its own memory that needs to be managed by the application can use these APIs.
> 

"rte_dev_malloc" sounds like a good name. It looks like we would then need to
enhance 'struct rte_device' with some new ops, as in:

eal: move DMA mapping from bus-specific to generic driver

https://patchwork.dpdk.org/project/dpdk/patch/20210331224547.2217759-1-thomas@monjalon.net/
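
To make the idea concrete, here is a rough sketch of what generic per-device
memory ops could look like. Nothing below is existing DPDK API: the
'example_dev' struct is only a stand-in for 'struct rte_device', and every
name ('example_dev_mem_ops', 'example_dev_malloc', 'example_dev_free', the
mock driver) is hypothetical. The mock driver uses plain host malloc() so the
sketch can run without any GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct example_dev;

/* Hypothetical memory ops a driver could register; NOT part of DPDK. */
struct example_dev_mem_ops {
	void *(*dev_malloc)(struct example_dev *dev, size_t size);
	void (*dev_free)(struct example_dev *dev, void *ptr);
};

/* Stand-in for 'struct rte_device', reduced to what the sketch needs. */
struct example_dev {
	const char *name;
	const struct example_dev_mem_ops *mem_ops; /* NULL: no local memory */
	size_t live_allocs; /* bookkeeping used by the mock driver only */
};

/* Generic entry point, in the spirit of the suggested "rte_dev_malloc". */
static void *
example_dev_malloc(struct example_dev *dev, size_t size)
{
	if (dev == NULL || dev->mem_ops == NULL ||
	    dev->mem_ops->dev_malloc == NULL)
		return NULL; /* device does not expose its own memory */
	return dev->mem_ops->dev_malloc(dev, size);
}

static void
example_dev_free(struct example_dev *dev, void *ptr)
{
	if (dev == NULL || ptr == NULL || dev->mem_ops == NULL ||
	    dev->mem_ops->dev_free == NULL)
		return;
	dev->mem_ops->dev_free(dev, ptr);
}

/* Mock driver: host memory stands in for device-local memory. */
static void *
mock_malloc(struct example_dev *dev, size_t size)
{
	void *p = malloc(size);

	if (p != NULL)
		dev->live_allocs++;
	return p;
}

static void
mock_free(struct example_dev *dev, void *ptr)
{
	free(ptr);
	dev->live_allocs--;
}

static const struct example_dev_mem_ops mock_mem_ops = {
	.dev_malloc = mock_malloc,
	.dev_free = mock_free,
};

/* Usage: the caller only needs a device handle, not a GPU-specific API. */
static int
example_usage(void)
{
	struct example_dev gpu = { "mock-gpu", &mock_mem_ops, 0 };
	void *buf = example_dev_malloc(&gpu, 4096);

	if (buf == NULL)
		return -1;
	assert(gpu.live_allocs == 1);
	example_dev_free(&gpu, buf);
	return (int)gpu.live_allocs;
}
```

A device without memory ops simply gets NULL back from the generic call, so
the application can fall back to the regular EAL allocation.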

> 
> > > >
> > > > Yes baseband processing is one possible usage of GPU with DPDK.
> > > > We could also imagine some security analysis, or any machine learning...
> > > >
> > > > > I can think of "coprocessor-dev" as one of the name.
> > > >
> > > > "coprocessor" looks too long as prefix of the functions.
> >
> > Yes. The library name can be lengthy, but for the API prefix a short form
> > of about 3 letters will be required.
> >
> >
> > > >
> > > > > We do have similar machine learning co-processors(for compute) if
> > > > > we can keep a generic name and it is for the above functions we
> > > > > may use this subsystem as well in the future.
> > > >
> > >
> > > Accelerator, 'acce_dev' ? ;-)
> >
> > It may get confused with HW accelerators.
> >
> >
> > Some of the options I can think of, sorted by my preference.
> >
> > library name, API prefix
> > 1) libhpc-dev, rte_hpc_ (hpc-> Heterogeneous processor compute)
> > 2) libhc-dev, rte_hc_
> > (https://en.wikipedia.org/wiki/Heterogeneous_computing see: Example
> > hardware)
> > 3) libpu-dev, rte_pu_ (pu -> processing unit)
> > 4) libhp-dev, rte_hp_ (hp->heterogeneous processor)
> > 5) libcoprocessor-dev, rte_cps_ ?
> > 6) libcompute-dev, rte_cpt_ ?
> > 7) libgpu-dev, rte_gpu_
> These seem to assume that the application can launch its own workload on the device? Does DPDK need to
> provide abstract APIs for launching work on a device?
> 
> 
> >
> >
> >
> >
> > >
> > > > Yes that's the idea to share a common synchronization mechanism with
> > > > different HW.
> > > >
> > > > That's cool to have such a big interest in the community for this patch.
> > > >
> > >

