[v2] dmadev: introduce DMA device library

Message ID: 1625995556-41473-1-git-send-email-fengchengwen@huawei.com (mailing list archive)
State: Superseded, archived
Delegated to: Thomas Monjalon
Series: [v2] dmadev: introduce DMA device library

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/github-robot success github build: passed
ci/iol-abi-testing warning Testing issues
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-testing fail Testing issues
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS

Commit Message

fengchengwen July 11, 2021, 9:25 a.m. UTC
This patch introduces 'dmadevice', a generic type of DMA
device.

The dmadev library APIs expose generic operations that enable
configuration of, and I/O with, DMA devices.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 MAINTAINERS                  |    4 +
 config/rte_config.h          |    3 +
 lib/dmadev/meson.build       |    6 +
 lib/dmadev/rte_dmadev.c      |  560 +++++++++++++++++++++++
 lib/dmadev/rte_dmadev.h      | 1030 ++++++++++++++++++++++++++++++++++++++++++
 lib/dmadev/rte_dmadev_core.h |  159 +++++++
 lib/dmadev/rte_dmadev_pmd.h  |   72 +++
 lib/dmadev/version.map       |   40 ++
 lib/meson.build              |    1 +
 9 files changed, 1875 insertions(+)
 create mode 100644 lib/dmadev/meson.build
 create mode 100644 lib/dmadev/rte_dmadev.c
 create mode 100644 lib/dmadev/rte_dmadev.h
 create mode 100644 lib/dmadev/rte_dmadev_core.h
 create mode 100644 lib/dmadev/rte_dmadev_pmd.h
 create mode 100644 lib/dmadev/version.map
  

Comments

fengchengwen July 11, 2021, 9:42 a.m. UTC | #1
Note:
1) This patch keeps the dmadev <> vchan layering; I think the vchan can be
   cleanly separated, conceptually, from the hw-channel.
2) I could not understand struct dpi_dma_queue_ctx_s, so in this patch I
   define rte_dma_slave_port_parameters with reference to the Kunpeng DMA
   implementation.
3) This patch doesn't include the doxygen-related files, because I failed to
   generate the docs in my environment; could these be upstreamed as a new
   patch, or must this be solved first?

Feedback welcome, thanks

On 2021/7/11 17:25, Chengwen Feng wrote:
> This patch introduces 'dmadevice', a generic type of DMA
> device.
> 
> The dmadev library APIs expose generic operations that enable
> configuration of, and I/O with, DMA devices.
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
  
Jerin Jacob July 11, 2021, 1:34 p.m. UTC | #2
On Sun, Jul 11, 2021 at 3:12 PM fengchengwen <fengchengwen@huawei.com> wrote:
>
> Note:
> 1) This patch keeps the dmadev <> vchan layering; I think the vchan can be
>    cleanly separated, conceptually, from the hw-channel.

I would like to keep it as channel instead of virtual channel, as the latter
is implementation-specific.
No strong opinion on this, though. @Richardson, Bruce @Morten Brørup, thoughts?

> 2) I could not understand struct dpi_dma_queue_ctx_s, so in this patch I
>    define rte_dma_slave_port_parameters with reference to the Kunpeng DMA
>    implementation.
> 3) This patch doesn't include the doxygen-related files, because I failed to
>    generate the docs in my environment; could these be upstreamed as a new
>    patch, or must this be solved first?

No, IMO. The final version to be merged should be split into patches like:

1) Header file with doxygen comments
2) Multiple patches for the implementation as needed
3) Programmer guide doc

Other items we typically have per new device class:

1) Skeleton driver (can use memcpy in this case; see the sketch below)
2) An app/test-dmadev kind of application (can be used to measure
performance and functionality)
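
For illustration, a minimal sketch of what the memcpy-based copy op of such a
skeleton driver might look like, matching the dmadev_copy_t typedef proposed
in this patch (the skel_* names and the per-device ring_idx bookkeeping are
hypothetical):

#include <string.h>
#include "rte_dmadev_pmd.h"

/* Hypothetical private data for the memcpy skeleton. */
struct skel_dma_private {
	uint16_t ring_idx; /* wraps at UINT16_MAX, as the API requires */
};

static int
skel_dma_copy(struct rte_dmadev *dev, uint16_t vchan,
	      rte_iova_t src, rte_iova_t dst,
	      uint32_t length, uint64_t flags)
{
	struct skel_dma_private *priv = dev->data->dev_private;

	RTE_SET_USED(vchan);
	RTE_SET_USED(flags);
	/* memcpy-based "DMA"; assumes VA == IOVA (SVA-like) addressing. */
	memcpy((void *)(uintptr_t)dst, (const void *)(uintptr_t)src, length);
	/* Return the ring index of this job (old value, then increment). */
	return priv->ring_idx++;
}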

>
> Feedback welcome, thanks
>
> On 2021/7/11 17:25, Chengwen Feng wrote:
> > This patch introduces 'dmadevice', a generic type of DMA
> > device.
> >
> > The dmadev library APIs expose generic operations that enable
> > configuration of, and I/O with, DMA devices.
> >
> > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > ---
>
  
Jerin Jacob July 11, 2021, 2:25 p.m. UTC | #3
On Sun, Jul 11, 2021 at 2:59 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
>
> This patch introduces 'dmadevice', a generic type of DMA
> device.
> 
> The dmadev library APIs expose generic operations that enable
> configuration of, and I/O with, DMA devices.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
> new file mode 100644
> index 0000000..8779512
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev.h
> @@ -0,0 +1,1030 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2021 HiSilicon Limited.
> + * Copyright(c) 2021 Intel Corporation.
> + * Copyright(c) 2021 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_DMADEV_H_
> +#define _RTE_DMADEV_H_
> +
> +/**
> + * @file rte_dmadev.h
> + *
> + * RTE DMA (Direct Memory Access) device APIs.
> + *
> + * The DMA framework is built on the following model:
> + *
> + *     ---------------   ---------------       ---------------
> + *     | virtual DMA |   | virtual DMA |       | virtual DMA |
> + *     | channel     |   | channel     |       | channel     |
> + *     ---------------   ---------------       ---------------
> + *            |                |                      |
> + *            ------------------                      |
> + *                     |                              |
> + *               ------------                    ------------
> + *               |  dmadev  |                    |  dmadev  |
> + *               ------------                    ------------
> + *                     |                              |
> + *            ------------------               ------------------
> + *            | HW-DMA-channel |               | HW-DMA-channel |
> + *            ------------------               ------------------
> + *                     |                              |
> + *                     --------------------------------
> + *                                     |
> + *                           ---------------------
> + *                           | HW-DMA-Controller |
> + *                           ---------------------
> + *
> + * The DMA controller could have multilpe HW-DMA-channels (aka. HW-DMA-queues),

Typo - multiple

> + * each HW-DMA-channel should be represented by a dmadev.
> + *
> + * The dmadev could create multiple virtual DMA channel, each virtual DMA
> + * channel represents a different transfer context. The DMA operation request
> + * must be submitted to the virtual DMA channel.
> + * E.G. Application could create virtual DMA channel 0 for mem-to-mem transfer
> + *      scenario, and create virtual DMA channel 1 for mem-to-dev transfer
> + *      scenario.
> + *
> + * The dmadev are dynamically allocated by rte_dmadev_pmd_allocate() during the
> + * PCI/SoC device probing phase performed at EAL initialization time. And could
> + * be released by rte_dmadev_pmd_release() during the PCI/SoC device removing
> + * phase.
> + *
> + * We use 'uint16_t dev_id' as the device identifier of a dmadev, and

Please remove "We use" and reword accordingly.

> + * 'uint16_t vchan' as the virtual DMA channel identifier in one dmadev.
> + *
> + * The functions exported by the dmadev API to setup a device designated by its
> + * device identifier must be invoked in the following order:
> + *     - rte_dmadev_configure()
> + *     - rte_dmadev_vchan_setup()
> + *     - rte_dmadev_start()
> + *
> + * Then, the application can invoke dataplane APIs to process jobs.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_dmadev_configure()), it must call rte_dmadev_stop() first to stop the
> + * device and then do the reconfiguration before calling rte_dmadev_start()
> + * again. The dataplane APIs should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a dmadev by invoking the
> + * rte_dmadev_close() function.
> + *
> + * The dataplane APIs include two parts:
> + *   a) The first part is the submission of operation requests:
> + *        - rte_dmadev_copy()
> + *        - rte_dmadev_copy_sg() - scatter-gather form of copy
> + *        - rte_dmadev_fill()
> + *        - rte_dmadev_fill_sg() - scatter-gather form of fill
> + *        - rte_dmadev_perform() - issue doorbell to hardware
> + *      These APIs could work with different virtual DMA channels which have
> + *      different contexts.
> + *      The first four APIs are used to submit the operation request to the
> + *      virtual DMA channel, if the submission is successful, a uint16_t
> + *      ring_idx is returned, otherwise a negative number is returned.
> + *   b) The second part is to obtain the result of requests:
> + *        - rte_dmadev_completed()
> + *            - return the number of operation requests completed successfully.
> + *        - rte_dmadev_completed_fails()
> + *            - return the number of operation requests failed to complete.
> + *
> + * About the ring_idx which rte_dmadev_copy/copy_sg/fill/fill_sg() returned,
> + * the rules are as follows:
> + *   a) ring_idx for each virtual DMA channel are independent.
> + *   b) For a virtual DMA channel, the ring_idx is monotonically incremented,
> + *      when it reach UINT16_MAX, it wraps back to zero.
> + *   c) The initial ring_idx of a virtual DMA channel is zero, after the device
> + *      is stopped or reset, the ring_idx needs to be reset to zero.
> + *   Example:
> + *      step-1: start one dmadev
> + *      step-2: enqueue a copy operation, the ring_idx return is 0
> + *      step-3: enqueue a copy operation again, the ring_idx return is 1
> + *      ...
> + *      step-101: stop the dmadev
> + *      step-102: start the dmadev
> + *      step-103: enqueue a copy operation, the cookie return is 0
> + *      ...
> + *      step-x+0: enqueue a fill operation, the ring_idx return is 65535
> + *      step-x+1: enqueue a copy operation, the ring_idx return is 0
> + *      ...
> + *
> + * By default, all the non-dataplane functions of the dmadev API exported by a
> + * PMD are lock-free functions which assume to not be invoked in parallel on
> + * different logical cores to work on the same target object.
> + *
> + * The dataplane functions of the dmadev API exported by a PMD can be MT-safe
> + * only when supported by the driver, generally, the driver will reports two
> + * capabilities:
> + *   a) Whether to support MT-safe for the submit/completion API of the same
> + *      virtual DMA channel.
> + *      E.G. one thread do submit operation, another thread do completion
> + *           operation.
> + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_VCHAN.
> + *      If driver don't support it, it's up to the application to guarantee
> + *      MT-safe.
> + *   b) Whether to support MT-safe for different virtual DMA channels.
> + *      E.G. one thread do operation on virtual DMA channel 0, another thread
> + *           do operation on virtual DMA channel 1.
> + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN.
> + *      If driver don't support it, it's up to the application to guarantee
> + *      MT-safe.

The above cases make it difficult to write applications. Also, application
locking using spinlocks etc. may not be the best optimization (a driver may
be able to do it in a better way).
IMO, as discussed with @Morten Brørup, I think it is better if, while
configuring the channel, the application can specify whether it "needs"
MT-safety, so that applications stay portable. Unlike network or crypto
devices, since we are creating the virtual channel, such a scheme is
useful; see the sketch below.
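
For example, with the capability-based scheme, a portable application that
cannot rely on RTE_DMA_DEV_CAPA_MT_VCHAN ends up serializing the fastpath
itself; a minimal sketch (the lock and wrapper names are hypothetical):

#include <rte_spinlock.h>

/* App-side serialization when the PMD does not advertise
 * RTE_DMA_DEV_CAPA_MT_VCHAN. */
static rte_spinlock_t vchan_lock = RTE_SPINLOCK_INITIALIZER;

static int
app_enqueue_copy(uint16_t dev_id, uint16_t vchan,
		 rte_iova_t src, rte_iova_t dst, uint32_t len)
{
	int ret;

	rte_spinlock_lock(&vchan_lock);
	ret = rte_dmadev_copy(dev_id, vchan, src, dst, len, 0);
	rte_spinlock_unlock(&vchan_lock);
	return ret;
}

A per-channel "need MT-safe" knob at configure time would instead let the
driver pick the cheapest safe implementation.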


> + *
> + */
> +
> +#include <rte_common.h>
> +#include <rte_compat.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#define RTE_DMADEV_NAME_MAX_LEN        RTE_DEV_NAME_MAX_LEN
> +
> +extern int rte_dmadev_logtype;
> +

Missing Doxygen comment

> +#define RTE_DMADEV_LOG(level, ...) \
> +       rte_log(RTE_LOG_ ## level, rte_dmadev_logtype, "" __VA_ARGS__)
> +

Make it internal.

> +/* Macros to check for valid port */
> +#define RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, retval) do { \
> +       if (!rte_dmadev_is_valid_dev(dev_id)) { \
> +               RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \
> +               return retval; \
> +       } \
> +} while (0)
> +

Make it internal.

> +#define RTE_DMADEV_VALID_DEV_ID_OR_RET(dev_id) do { \
> +       if (!rte_dmadev_is_valid_dev(dev_id)) { \
> +               RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \
> +               return; \
> +       } \
> +} while (0)
> +
> +/**
> + * @internal

We can make it a public API.

> + * Validate if the DMA device index is a valid attached DMA device.
> + *
> + * @param dev_id
> + *   DMA device index.
> + *
> + * @return
> + *   - If the device index is valid (true) or not (false).
> + */
> +__rte_internal
> +bool
> +rte_dmadev_is_valid_dev(uint16_t dev_id);
> +
> +/**
> + * rte_dma_sg - can hold scatter DMA operation request
> + */
> +struct rte_dma_sg {
> +       rte_iova_t src;
> +       rte_iova_t dst;

IMO, it should be only rte_iova_t addr; see the comment at the _sg API below.
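
That is, each scatter-gather element would carry a single address, with
separate src and dst lists passed to the _sg APIs (see the suggested
rte_dmadev_copy_sg() signature further down); a sketch of the suggested
layout:

struct rte_dma_sg {
	rte_iova_t addr;
	uint32_t length;
};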

> +       uint32_t length;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the total number of DMA devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable DMA devices.
> + */
> +__rte_experimental
> +uint16_t
> +rte_dmadev_count(void);
> +
> +/**
> + * The capabilities of a DMA device
> + */
> +#define RTE_DMA_DEV_CAPA_MEM_TO_MEM    (1ull << 0)
> +/**< DMA device support mem-to-mem transfer.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_MEM_TO_DEV    (1ull << 1)
> +/**< DMA device support slave mode & mem-to-dev transfer.

Do we need to say slave mode? Just mem-to-dev is fine, right?

> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_DEV_TO_MEM    (1ull << 2)
> +/**< DMA device support slave mode & dev-to-mem transfer.

See above.

> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_DEV_TO_DEV    (1ull << 3)
> +/**< DMA device support slave mode & dev-to-dev transfer.

See above.

> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_OPS_COPY      (1ull << 4)
> +/**< DMA device support copy ops.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_OPS_FILL      (1ull << 5)
> +/**< DMA device support fill ops.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_OPS_SG                (1ull << 6)
> +/**< DMA device support scatter-list ops.
> + * If device support ops_copy and ops_sg, it means supporting copy_sg ops.
> + * If device support ops_fill and ops_sg, it means supporting fill_sg ops.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_FENCE         (1ull << 7)
> +/**< DMA device support fence.
> + * If device support fence, then application could set a fence flags when
> + * enqueue operation by rte_dma_copy/copy_sg/fill/fill_sg.
> + * If a operation has a fence flags, it means the operation must be processed
> + * only after all previous operations are completed.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_SVA           (1ull << 8)
> +/**< DMA device support SVA which could use VA as DMA address.
> + * If device support SVA then application could pass any VA address like memory
> + * from rte_malloc(), rte_memzone(), malloc, stack memory.
> + * If device don't support SVA, then application should pass IOVA address which
> + * from rte_malloc(), rte_memzone().
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_MT_VCHAN      (1ull << 9)
> +/**< DMA device support MT-safe of a virtual DMA channel.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN        (1ull << 10)
> +/**< DMA device support MT-safe of different virtual DMA channels.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +
> +/**
> + * A structure used to retrieve the contextual information of
> + * an DMA device
> + */
> +struct rte_dmadev_info {
> +       struct rte_device *device; /**< Generic Device information */
> +       uint64_t dev_capa; /**< Device capabilities (RTE_DMA_DEV_CAPA_) */
> +       /** Maximum number of virtual DMA channels supported */
> +       uint16_t max_vchans;

The Doxygen comment should come after the symbol, not above it.

> +       /** Maximum allowed number of virtual DMA channel descriptors */
> +       uint16_t max_desc;
> +       /** Minimum allowed number of virtual DMA channel descriptors */
> +       uint16_t min_desc;
> +       uint16_t nb_vchans; /**< Number of virtual DMA channel configured */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Retrieve the contextual information of a DMA device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_dmadev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - =0: Success, driver updates the contextual information of the DMA device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +__rte_experimental
> +int
> +rte_dmadev_info_get(uint16_t dev_id, struct rte_dmadev_info *dev_info);
> +
> +/**
> + * A structure used to configure a DMA device.
> + */
> +struct rte_dmadev_conf {
> +       /** Maximum number of virtual DMA channel to use.
> +        * This value cannot be greater than the field 'max_vchans' of struct
> +        * rte_dmadev_info which get from rte_dmadev_info_get().
> +        */
> +       uint16_t max_vchans;
> +       /** Enable bit for MT-safe of a virtual DMA channel.
> +        * This bit can be enabled only when the device supports
> +        * RTE_DMA_DEV_CAPA_MT_VCHAN.
> +        * @see RTE_DMA_DEV_CAPA_MT_VCHAN
> +        */
> +       uint8_t enable_mt_vchan : 1;
> +       /** Enable bit for MT-safe of different virtual DMA channels.
> +        * This bit can be enabled only when the device supports
> +        * RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN.
> +        * @see RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN

I think we can support this even if the flag is not supported, so as not to
limit the application's fastpath options.

> +        */
> +       uint8_t enable_mt_multi_vchan : 1;
> +       uint64_t reserved[2]; /**< Reserved for future fields */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Configure a DMA device.
> + *
> + * This function must be invoked first before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param dev_conf
> + *   The DMA device configuration structure encapsulated into rte_dmadev_conf
> + *   object.
> + *
> + * @return
> + *   - =0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_configure(uint16_t dev_id, const struct rte_dmadev_conf *dev_conf);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Start a DMA device.
> + *
> + * The device start step is the last one and consists of setting the DMA
> + * to start accepting jobs.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *   - =0: Success, device started.
> + *   - <0: Error code returned by the driver start function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_start(uint16_t dev_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Stop a DMA device.
> + *
> + * The device can be restarted with a call to rte_dmadev_start()
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *   - =0: Success, device stopped.
> + *   - <0: Error code returned by the driver stop function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_stop(uint16_t dev_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Close a DMA device.
> + *
> + * The device cannot be restarted after this call.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *  - =0: Successfully close device
> + *  - <0: Failure to close device
> + */
> +__rte_experimental
> +int
> +rte_dmadev_close(uint16_t dev_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Reset a DMA device.
> + *
> + * This is different from cycle of rte_dmadev_start->rte_dmadev_stop in the
> + * sense similar to hard or soft reset.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *   - =0: Successfully reset device.
> + *   - <0: Failure to reset device.
> + *   - (-ENOTSUP): If the device doesn't support this function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_reset(uint16_t dev_id);

Is this required now?

> +
> +/**
> + * DMA transfer direction defines.
> + */
> +#define RTE_DMA_MEM_TO_MEM     (1ull << 0)

RTE_DMA_DIRECTION_...

> +/**< DMA transfer direction - from memory to memory.
> + *
> + * @see struct rte_dmadev_vchan_conf::direction
> + */
> +#define RTE_DMA_MEM_TO_DEV     (1ull << 1)
> +/**< DMA transfer direction - slave mode & from memory to device.
> + * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
> + * this case, the ARM SoCs works in slave mode, it could initiate a DMA move
> + * request from ARM memory to x86 host memory.
> + *
> + * @see struct rte_dmadev_vchan_conf::direction
> + */
> +#define RTE_DMA_DEV_TO_MEM     (1ull << 2)
> +/**< DMA transfer direction - slave mode & from device to memory.
> + * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
> + * this case, the ARM SoCs works in slave mode, it could initiate a DMA move
> + * request from x86 host memory to ARM memory.
> + *
> + * @see struct rte_dmadev_vchan_conf::direction
> + */
> +#define RTE_DMA_DEV_TO_DEV     (1ull << 3)
> +/**< DMA transfer direction - slave mode & from device to device.
> + * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
> + * this case, the ARM SoCs works in slave mode, it could initiate a DMA move
> + * request from x86 host memory to another x86 host memory.
> + *
> + * @see struct rte_dmadev_vchan_conf::direction
> + */
> +#define RTE_DMA_TRANSFER_DIR_ALL       (RTE_DMA_MEM_TO_MEM | \
> +                                        RTE_DMA_MEM_TO_DEV | \
> +                                        RTE_DMA_DEV_TO_MEM | \
> +                                        RTE_DMA_DEV_TO_DEV)
> +
> +/**
> + * enum rte_dma_slave_port_type - slave mode type defines
> + */
> +enum rte_dma_slave_port_type {

I think rte_dmadev_dev_type would be better.

> +       /** The slave port is PCIE. */
> +       RTE_DMA_SLAVE_PORT_PCIE = 1,
> +};
> +
> +/**
> + * A structure used to descript slave port parameters.*
> + */
> +struct rte_dma_slave_port_parameters {

Maybe rte_dmadev_dev_conf?

> +       enum rte_dma_slave_port_type port_type;
> +       union {
> +               /** For PCIE port */
> +               struct {
> +                       /** The physical function number which to use */
> +                       uint64_t pf_number : 6;
> +                       /** Virtual function enable bit */
> +                       uint64_t vf_enable : 1;
> +                       /** The virtual function number which to use */
> +                       uint64_t vf_number : 8;
> +                       uint64_t pasid : 20;
> +                       /** The attributes filed in TLP packet */
> +                       uint64_t tlp_attr : 3;
> +               };
> +       };
> +};
> +
> +/**
> + * A structure used to configure a virtual DMA channel.
> + */
> +struct rte_dmadev_vchan_conf {
> +       uint8_t direction; /**< Set of supported transfer directions */
We could add @see RTE_DMA_DIRECTION_*.
Also, say how the application can know the valid flags, i.e. point to the
info struct.


> +       /** Number of descriptor for the virtual DMA channel */
> +       uint16_t nb_desc;
> +       /** 1) Used to describes the dev parameter in the mem-to-dev/dev-to-mem
> +        * transfer scenario.
> +        * 2) Used to describes the src dev parameter in the dev-to-dev
> +        * transfer scenario.
> +        */
> +       struct rte_dma_slave_port_parameters port;
> +       /** Used to describes the dst dev parameters in the dev-to-dev
> +        * transfer scenario.
> +        */
> +       struct rte_dma_slave_port_parameters peer_port;
> +       uint64_t reserved[2]; /**< Reserved for future fields */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Allocate and set up a virtual DMA channel.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param conf
> + *   The virtual DMA channel configuration structure encapsulated into
> + *   rte_dmadev_vchan_conf object.
> + *
> + * @return
> + *   - >=0: Allocate success, it is the virtual DMA channel id. This value must
> + *          be less than the field 'max_vchans' of struct rte_dmadev_conf
> +           which configured by rte_dmadev_configure().
> + *   - <0: Error code returned by the driver virtual channel setup function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_vchan_setup(uint16_t dev_id,
> +                      const struct rte_dmadev_vchan_conf *conf);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Release a virtual DMA channel.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel which return by vchan setup.
> + *
> + * @return
> + *   - =0: Successfully release the virtual DMA channel.
> + *   - <0: Error code returned by the driver virtual channel release function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan);

We are not making release a public API in other device classes. See the
ethdev spec.


> +
> +/**
> + * rte_dmadev_stats - running statistics.
> + */
> +struct rte_dmadev_stats {
> +       /** Count of operations which were successfully enqueued */
> +       uint64_t enqueued_count;
> +       /** Count of operations which were submitted to hardware */
> +       uint64_t submitted_count;
> +       /** Count of operations which failed to complete */
> +       uint64_t completed_fail_count;
> +       /** Count of operations which successfully complete */
> +       uint64_t completed_count;
> +       uint64_t reserved[4]; /**< Reserved for future fields */
> +};

Please add a capability for each counter in the info structure, as a device
may not support all the counters.

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Retrieve basic statistics of a or all virtual DMA channel(s).
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel, -1 means all channels.
> + * @param[out] stats
> + *   The basic statistics structure encapsulated into rte_dmadev_stats
> + *   object.
> + *
> + * @return
> + *   - =0: Successfully retrieve stats.
> + *   - <0: Failure to retrieve stats.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_stats_get(uint16_t dev_id, int vchan,
> +                    struct rte_dmadev_stats *stats);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Reset basic statistics of a or all virtual DMA channel(s).
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel, -1 means all channels.
> + *
> + * @return
> + *   - =0: Successfully reset stats.
> + *   - <0: Failure to reset stats.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_stats_reset(uint16_t dev_id, int vchan);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Dump DMA device info.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param f
> + *   The file to write the output to.
> + *
> + * @return
> + *   0 on success. Non-zero otherwise.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_dump(uint16_t dev_id, FILE *f);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Trigger the dmadev self test.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *   - 0: Selftest successful.
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_selftest(uint16_t dev_id);
> +
> +#include "rte_dmadev_core.h"
> +
> +/**
> + *  DMA flags to augment operation preparation.
> + *  Used as the 'flags' parameter of rte_dmadev_copy/copy_sg/fill/fill_sg.
> + */
> +#define RTE_DMA_FLAG_FENCE     (1ull << 0)

RTE_DMA_OP_FLAG_FENCE

Please also add RTE_DMA_OP_FLAG_SUBMIT, as we discussed in another thread.
We can support submit based on the flag, and the _submit() version, based on
application preference; see the sketch below.
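
A sketch of the two submission styles this would allow (RTE_DMA_OP_FLAG_SUBMIT
is the flag proposed here, not yet part of this patch; error handling omitted):

static void
submit_two_jobs(uint16_t dev_id, uint16_t vchan,
		rte_iova_t src, rte_iova_t dst, uint32_t len)
{
	/* Style 1: batch the enqueues, then ring the doorbell explicitly. */
	(void)rte_dmadev_copy(dev_id, vchan, src, dst, len, 0);
	(void)rte_dmadev_copy(dev_id, vchan, src + len, dst + len, len, 0);
	(void)rte_dmadev_submit(dev_id, vchan);

	/* Style 2: fold the doorbell into the last enqueue via the flag. */
	(void)rte_dmadev_copy(dev_id, vchan, src, dst, len, 0);
	(void)rte_dmadev_copy(dev_id, vchan, src + len, dst + len, len,
			      RTE_DMA_OP_FLAG_SUBMIT);
}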

> +/**< DMA fence flag
> + * It means the operation with this flag must be processed only after all
> + * previous operations are completed.
> + *
> + * @see rte_dmadev_copy()
> + * @see rte_dmadev_copy_sg()
> + * @see rte_dmadev_fill()
> + * @see rte_dmadev_fill_sg()
> + */
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a copy operation onto the virtual DMA channel.
> + *
> + * This queues up a copy operation to be performed by hardware, but does not
> + * trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param src
> + *   The address of the source buffer.
> + * @param dst
> + *   The address of the destination buffer.
> + * @param length
> + *   The length of the data to be copied.
> + * @param flags
> + *   An flags for this operation.

See RTE_DMA_OP_FLAG_*

> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.
> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_copy(uint16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
> +               uint32_t length, uint64_t flags)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->copy, -ENOTSUP);
> +       if (vchan >= dev->data->dev_conf.max_vchans) {
> +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +               return -EINVAL;
> +       }
> +#endif
> +       return (*dev->copy)(dev, vchan, src, dst, length, flags);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a scatter list copy operation onto the virtual DMA channel.
> + *
> + * This queues up a scatter list copy operation to be performed by hardware,
> + * but does not trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param sg
> + *   The pointer of scatterlist.
> + * @param sg_len
> + *   The number of scatterlist elements.
> + * @param flags
> + *   An flags for this operation.
> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.
> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, const struct rte_dma_sg *sg,
> +                  uint32_t sg_len, uint64_t flags)


As requested earlier, I prefer to have:

rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan,
                   const struct rte_dma_sg *src, uint32_t nb_src,
                   const struct rte_dma_sg *dst, uint32_t nb_dst,
                   uint64_t flags)
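
A call under that signature would look like the following sketch (assuming
the single-address rte_dma_sg element layout suggested earlier; the function
name is hypothetical):

/* Gather two 64B source segments into one 128B destination segment. */
static int
copy_gather(uint16_t dev_id, uint16_t vchan,
	    rte_iova_t s0, rte_iova_t s1, rte_iova_t d0)
{
	struct rte_dma_sg src[2] = {
		{ .addr = s0, .length = 64 },
		{ .addr = s1, .length = 64 },
	};
	struct rte_dma_sg dst[1] = {
		{ .addr = d0, .length = 128 },
	};

	return rte_dmadev_copy_sg(dev_id, vchan, src, 2, dst, 1, 0);
}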


> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(sg, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->copy_sg, -ENOTSUP);
> +       if (vchan >= dev->data->dev_conf.max_vchans) {
> +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +               return -EINVAL;
> +       }
> +#endif
> +       return (*dev->copy_sg)(dev, vchan, sg, sg_len, flags);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a fill operation onto the virtual DMA channel.
> + *
> + * This queues up a fill operation to be performed by hardware, but does not
> + * trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param pattern
> + *   The pattern to populate the destination buffer with.
> + * @param dst
> + *   The address of the destination buffer.
> + * @param length
> + *   The length of the destination buffer.
> + * @param flags
> + *   An flags for this operation.
> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.
> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> +               rte_iova_t dst, uint32_t length, uint64_t flags)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> +       if (vchan >= dev->data->dev_conf.max_vchans) {
> +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +               return -EINVAL;
> +       }
> +#endif
> +       return (*dev->fill)(dev, vchan, pattern, dst, length, flags);

Instead of every driver setting a NOP function: in the common code, if the
CAPA is not set, the common code can install a NOP function for this with a
<0 return value, e.g. as sketched below.
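
A minimal sketch of such a common-code NOP, matching the dmadev_fill_t
typedef proposed in rte_dmadev_core.h (the function name is hypothetical):

#include <errno.h>

static int
dmadev_fill_nop(struct rte_dmadev *dev, uint16_t vchan, uint64_t pattern,
		rte_iova_t dst, uint32_t length, uint64_t flags)
{
	/* Consume all arguments; the op is simply unsupported. */
	RTE_SET_USED(dev);
	RTE_SET_USED(vchan);
	RTE_SET_USED(pattern);
	RTE_SET_USED(dst);
	RTE_SET_USED(length);
	RTE_SET_USED(flags);
	return -ENOTSUP;
}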

> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a scatter list fill operation onto the virtual DMA channel.
> + *
> + * This queues up a scatter list fill operation to be performed by hardware,
> + * but does not trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param pattern
> + *   The pattern to populate the destination buffer with.
> + * @param sg
> + *   The pointer of scatterlist.
> + * @param sg_len
> + *   The number of scatterlist elements.
> + * @param flags
> + *   An flags for this operation.
> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.
> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_fill_sg(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> +                  const struct rte_dma_sg *sg, uint32_t sg_len,
> +                  uint64_t flags)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(sg, -ENOTSUP);
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> +       if (vchan >= dev->data->dev_conf.max_vchans) {
> +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +               return -EINVAL;
> +       }
> +#endif
> +       return (*dev->fill_sg)(dev, vchan, pattern, sg, sg_len, flags);

In order to save 8B in rte_dmadev, this API can be removed, as it looks like
none of the drivers supports fill in SG mode.

> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Trigger hardware to begin performing enqueued operations.
> + *
> + * This API is used to write the "doorbell" to the hardware to trigger it
> + * to begin the operations previously enqueued by rte_dmadev_copy/fill()
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + *
> + * @return
> + *   - =0: Successfully trigger hardware.
> + *   - <0: Failure to trigger hardware.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_submit(uint16_t dev_id, uint16_t vchan)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->submit, -ENOTSUP);
> +       if (vchan >= dev->data->dev_conf.max_vchans) {
> +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +               return -EINVAL;
> +       }
> +#endif
> +       return (*dev->submit)(dev, vchan);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of operations that have been successfully completed.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param nb_cpls
> + *   The maximum number of completed operations that can be processed.
> + * @param[out] last_idx
> + *   The last completed operation's index.
> + *   If not required, NULL can be passed in.
> + * @param[out] has_error
> + *   Indicates if there are transfer error.
> + *   If not required, NULL can be passed in.
> + *
> + * @return
> + *   The number of operations that successfully completed.
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_dmadev_completed(uint16_t dev_id, uint16_t vchan, const uint16_t nb_cpls,
> +                    uint16_t *last_idx, bool *has_error)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +       uint16_t idx;
> +       bool err;
> +
> +#ifdef RTE_DMADEV_DEBUG
> +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->completed, -ENOTSUP);
> +       if (vchan >= dev->data->dev_conf.max_vchans) {
> +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +               return -EINVAL;
> +       }
> +       if (nb_cpls == 0) {
> +               RTE_DMADEV_LOG(ERR, "Invalid nb_cpls\n");
> +               return -EINVAL;
> +       }
> +#endif
> +
> +       /* Ensure the pointer values are non-null to simplify drivers.
> +        * In most cases these should be compile time evaluated, since this is
> +        * an inline function.
> +        * - If NULL is explicitly passed as parameter, then compiler knows the
> +        *   value is NULL
> +        * - If address of local variable is passed as parameter, then compiler
> +        *   can know it's non-NULL.
> +        */
> +       if (last_idx == NULL)
> +               last_idx = &idx;
> +       if (has_error == NULL)
> +               has_error = &err;
> +
> +       *has_error = false;
> +       return (*dev->completed)(dev, vchan, nb_cpls, last_idx, has_error);
> +}
> +
> +/**
> + * DMA transfer status code defines
> + */
> +enum rte_dma_status_code {
> +       /** The operation completed successfully */
> +       RTE_DMA_STATUS_SUCCESSFUL = 0,
> +       /** The operation failed to complete due active drop
> +        * This is mainly used when processing dev_stop, allow outstanding
> +        * requests to be completed as much as possible.
> +        */
> +       RTE_DMA_STATUS_ACTIVE_DROP,
> +       /** The operation failed to complete due invalid source address */
> +       RTE_DMA_STATUS_INVALID_SRC_ADDR,
> +       /** The operation failed to complete due invalid destination address */
> +       RTE_DMA_STATUS_INVALID_DST_ADDR,
> +       /** The operation failed to complete due invalid length */
> +       RTE_DMA_STATUS_INVALID_LENGTH,
> +       /** The operation failed to complete due invalid opcode
> +        * The DMA descriptor could have multiple format, which are
> +        * distinguished by the opcode field.
> +        */
> +       RTE_DMA_STATUS_INVALID_OPCODE,
> +       /** The operation failed to complete due bus err */
> +       RTE_DMA_STATUS_BUS_ERROR,
> +       /** The operation failed to complete due data poison */
> +       RTE_DMA_STATUS_DATA_POISION,
> +       /** The operation failed to complete due descriptor read error */
> +       RTE_DMA_STATUS_DESCRIPTOR_READ_ERROR,
> +       /** The operation failed to complete due device link error
> +        * Used to indicates that the link error in the mem-to-dev/dev-to-mem/
> +        * dev-to-dev transfer scenario.
> +        */
> +       RTE_DMA_STATUS_DEV_LINK_ERROR,
> +       /** Driver specific status code offset
> +        * Start status code for the driver to define its own error code.


Add RTE_DMA_STATUS_UNKNOWN for the ones which are not covered by the public
API spec.


> +        */
> +       RTE_DMA_STATUS_DRV_SPECIFIC_OFFSET = 0x10000,
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of operations that failed to complete.
> + * NOTE: This API was used when rte_dmadev_completed has_error was set.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param nb_status
> + *   Indicates the size of status array.
> + * @param[out] status
> + *   The error code of operations that failed to complete.
> + *   Some standard error code are described in 'enum rte_dma_status_code'
> + *   @see rte_dma_status_code
> + * @param[out] last_idx
> + *   The last failed completed operation's index.
> + *
> + * @return
> + *   The number of operations that failed to complete.
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vchan,
> +                          const uint16_t nb_status, uint32_t *status,

uint32_t -> enum rte_dma_status_code


> +                          uint16_t *last_idx)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(status, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(last_idx, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->completed_fails, -ENOTSUP);
> +       if (vchan >= dev->data->dev_conf.max_vchans) {
> +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +               return -EINVAL;
> +       }
> +       if (nb_status == 0) {
> +               RTE_DMADEV_LOG(ERR, "Invalid nb_status\n");
> +               return -EINVAL;
> +       }
> +#endif
> +       return (*dev->completed_fails)(dev, vchan, nb_status, status, last_idx);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_DMADEV_H_ */
> diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
> new file mode 100644
> index 0000000..410faf0
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev_core.h
> @@ -0,0 +1,159 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2021 HiSilicon Limited.
> + * Copyright(c) 2021 Intel Corporation.
> + */
> +
> +#ifndef _RTE_DMADEV_CORE_H_
> +#define _RTE_DMADEV_CORE_H_
> +
> +/**
> + * @file
> + *
> + * RTE DMA Device internal header.
> + *
> + * This header contains internal data types, that are used by the DMA devices
> + * in order to expose their ops to the class.
> + *
> + * Applications should not use these API directly.
> + *
> + */
> +
> +struct rte_dmadev;
> +
> +/** @internal Used to get device information of a device. */
> +typedef int (*dmadev_info_get_t)(struct rte_dmadev *dev,
> +                                struct rte_dmadev_info *dev_info);

Please change this to rte_dmadev_info_get_t to avoid namespace conflicts, as
this header is exported.

> +
> +/** @internal Used to configure a device. */
> +typedef int (*dmadev_configure_t)(struct rte_dmadev *dev,
> +                                 const struct rte_dmadev_conf *dev_conf);
> +
> +/** @internal Used to start a configured device. */
> +typedef int (*dmadev_start_t)(struct rte_dmadev *dev);
> +
> +/** @internal Used to stop a configured device. */
> +typedef int (*dmadev_stop_t)(struct rte_dmadev *dev);
> +
> +/** @internal Used to close a configured device. */
> +typedef int (*dmadev_close_t)(struct rte_dmadev *dev);
> +
> +/** @internal Used to reset a configured device. */
> +typedef int (*dmadev_reset_t)(struct rte_dmadev *dev);
> +
> +/** @internal Used to allocate and set up a virtual DMA channel. */
> +typedef int (*dmadev_vchan_setup_t)(struct rte_dmadev *dev,
> +                                   const struct rte_dmadev_vchan_conf *conf);
> +
> +/** @internal Used to release a virtual DMA channel. */
> +typedef int (*dmadev_vchan_release_t)(struct rte_dmadev *dev, uint16_t vchan);
> +
> +/** @internal Used to retrieve basic statistics. */
> +typedef int (*dmadev_stats_get_t)(struct rte_dmadev *dev, int vchan,
> +                                 struct rte_dmadev_stats *stats);
> +
> +/** @internal Used to reset basic statistics. */
> +typedef int (*dmadev_stats_reset_t)(struct rte_dmadev *dev, int vchan);
> +
> +/** @internal Used to dump internal information. */
> +typedef int (*dmadev_dump_t)(struct rte_dmadev *dev, FILE *f);
> +
> +/** @internal Used to start dmadev selftest. */
> +typedef int (*dmadev_selftest_t)(uint16_t dev_id);
> +
> +/** @internal Used to enqueue a copy operation. */
> +typedef int (*dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vchan,
> +                            rte_iova_t src, rte_iova_t dst,
> +                            uint32_t length, uint64_t flags);
> +
> +/** @internal Used to enqueue a scatter list copy operation. */
> +typedef int (*dmadev_copy_sg_t)(struct rte_dmadev *dev, uint16_t vchan,
> +                               const struct rte_dma_sg *sg,
> +                               uint32_t sg_len, uint64_t flags);
> +
> +/** @internal Used to enqueue a fill operation. */
> +typedef int (*dmadev_fill_t)(struct rte_dmadev *dev, uint16_t vchan,
> +                            uint64_t pattern, rte_iova_t dst,
> +                            uint32_t length, uint64_t flags);
> +
> +/** @internal Used to enqueue a scatter list fill operation. */
> +typedef int (*dmadev_fill_sg_t)(struct rte_dmadev *dev, uint16_t vchan,
> +                       uint64_t pattern, const struct rte_dma_sg *sg,
> +                       uint32_t sg_len, uint64_t flags);
> +
> +/** @internal Used to trigger hardware to begin working. */
> +typedef int (*dmadev_submit_t)(struct rte_dmadev *dev, uint16_t vchan);
> +
> +/** @internal Used to return number of successful completed operations. */
> +typedef uint16_t (*dmadev_completed_t)(struct rte_dmadev *dev, uint16_t vchan,
> +                                      const uint16_t nb_cpls,
> +                                      uint16_t *last_idx, bool *has_error);
> +
> +/** @internal Used to return number of failed completed operations. */
> +typedef uint16_t (*dmadev_completed_fails_t)(struct rte_dmadev *dev,
> +                       uint16_t vchan, const uint16_t nb_status,
> +                       uint32_t *status, uint16_t *last_idx);
> +
> +/**
> + * DMA device operations function pointer table
> + */
> +struct rte_dmadev_ops {
> +       dmadev_info_get_t dev_info_get;
> +       dmadev_configure_t dev_configure;
> +       dmadev_start_t dev_start;
> +       dmadev_stop_t dev_stop;
> +       dmadev_close_t dev_close;
> +       dmadev_reset_t dev_reset;
> +       dmadev_vchan_setup_t vchan_setup;
> +       dmadev_vchan_release_t vchan_release;
> +       dmadev_stats_get_t stats_get;
> +       dmadev_stats_reset_t stats_reset;
> +       dmadev_dump_t dev_dump;
> +       dmadev_selftest_t dev_selftest;
> +};
> +
> +/**
> + * @internal
> + * The data part, with no function pointers, associated with each DMA device.
> + *
> + * This structure is safe to place in shared memory to be common among different
> + * processes in a multi-process configuration.
> + */
> +struct rte_dmadev_data {
> +       uint16_t dev_id; /**< Device [external] identifier. */
> +       char dev_name[RTE_DMADEV_NAME_MAX_LEN]; /**< Unique identifier name */
> +       void *dev_private; /**< PMD-specific private data. */
> +       struct rte_dmadev_conf dev_conf; /**< DMA device configuration. */
> +       uint8_t dev_started : 1; /**< Device state: STARTED(1)/STOPPED(0). */
> +       uint64_t reserved[4]; /**< Reserved for future fields */
> +} __rte_cache_aligned;
> +
> +/**
> + * @internal
> + * The generic data structure associated with each DMA device.
> + *
> + * The dataplane APIs are located at the beginning of the structure, along
> + * with the pointer to where all the data elements for the particular device
> + * are stored in shared memory. This split scheme allows the function pointer
> + * and driver data to be per-process, while the actual configuration data for
> + * the device is shared.
> + */
> +struct rte_dmadev {
> +       dmadev_copy_t copy;
> +       dmadev_copy_sg_t copy_sg;
> +       dmadev_fill_t fill;
> +       dmadev_fill_sg_t fill_sg;
> +       dmadev_submit_t submit;
> +       dmadev_completed_t completed;

We could add reserved fields here for any future fastpath additions.

> +       dmadev_completed_fails_t completed_fails;


> +       const struct rte_dmadev_ops *dev_ops; /**< Functions exported by PMD. */
> +       /** Flag indicating the device is attached: ATTACHED(1)/DETACHED(0). */
> +       uint8_t attached : 1;
> +       /** Device info which supplied during device initialization. */
> +       struct rte_device *device;
> +       struct rte_dmadev_data *data; /**< Pointer to device data. */
> +       uint64_t reserved[4]; /**< Reserved for future fields */
> +} __rte_cache_aligned;
> +
> +extern struct rte_dmadev rte_dmadevices[];
> +
> +#endif /* _RTE_DMADEV_CORE_H_ */
> diff --git a/lib/dmadev/rte_dmadev_pmd.h b/lib/dmadev/rte_dmadev_pmd.h
> new file mode 100644
> index 0000000..45141f9
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev_pmd.h
> @@ -0,0 +1,72 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2021 HiSilicon Limited.
> + */
> +
> +#ifndef _RTE_DMADEV_PMD_H_
> +#define _RTE_DMADEV_PMD_H_
> +
> +/**
> + * @file
> + *
> + * RTE DMA Device PMD APIs
> + *
> + * Driver facing APIs for a DMA device. These are not to be called directly by
> + * any application.
> + */
> +
> +#include "rte_dmadev.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * @internal
> + * Allocates a new dmadev slot for an DMA device and returns the pointer
> + * to that slot for the driver to use.
> + *
> + * @param name
> + *   DMA device name.
> + *
> + * @return
> + *   A pointer to the DMA device slot case of success,
> + *   NULL otherwise.
> + */
> +__rte_internal
> +struct rte_dmadev *
> +rte_dmadev_pmd_allocate(const char *name);
> +
> +/**
> + * @internal
> + * Release the specified dmadev.
> + *
> + * @param dev
> + *   Device to be released.
> + *
> + * @return
> + *   - 0 on success, negative on error
> + */
> +__rte_internal
> +int
> +rte_dmadev_pmd_release(struct rte_dmadev *dev);
> +
> +/**
> + * @internal
> + * Return the DMA device based on the device name.
> + *
> + * @param name
> + *   DMA device name.
> + *
> + * @return
> + *   A pointer to the DMA device slot case of success,
> + *   NULL otherwise.
> + */
> +__rte_internal
> +struct rte_dmadev *
> +rte_dmadev_get_device_by_name(const char *name);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_DMADEV_PMD_H_ */
> diff --git a/lib/dmadev/version.map b/lib/dmadev/version.map
> new file mode 100644
> index 0000000..0f099e7
> --- /dev/null
> +++ b/lib/dmadev/version.map
> @@ -0,0 +1,40 @@
> +EXPERIMENTAL {
> +       global:
> +
> +       rte_dmadev_count;
> +       rte_dmadev_info_get;
> +       rte_dmadev_configure;
> +       rte_dmadev_start;
> +       rte_dmadev_stop;
> +       rte_dmadev_close;
> +       rte_dmadev_reset;
> +       rte_dmadev_vchan_setup;
> +       rte_dmadev_vchan_release;
> +       rte_dmadev_stats_get;
> +       rte_dmadev_stats_reset;
> +       rte_dmadev_dump;
> +       rte_dmadev_selftest;
> +       rte_dmadev_copy;
> +       rte_dmadev_copy_sg;
> +       rte_dmadev_fill;
> +       rte_dmadev_fill_sg;
> +       rte_dmadev_submit;
> +       rte_dmadev_completed;
> +       rte_dmadev_completed_fails;
> +
> +       local: *;
> +};
> +
> +INTERNAL {
> +        global:
> +
> +       rte_dmadevices;
> +       rte_dmadev_pmd_allocate;
> +       rte_dmadev_pmd_release;
> +       rte_dmadev_get_device_by_name;
> +
> +       local:
> +
> +       rte_dmadev_is_valid_dev;
> +};
> +
> diff --git a/lib/meson.build b/lib/meson.build
> index 1673ca4..68d239f 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -60,6 +60,7 @@ libraries = [
>          'bpf',
>          'graph',
>          'node',
> +        'dmadev',
>  ]
>
>  if is_windows
> --
> 2.8.1
>
  
Morten Brørup July 12, 2021, 7:15 a.m. UTC | #4
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Chengwen Feng
> 
> This patch introduces 'dmadevice', a generic type of DMA
> device.
> 
> The dmadev library APIs expose generic operations that enable
> configuration of, and I/O with, DMA devices.
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>

[snip]

> diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
> new file mode 100644
> index 0000000..8779512
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev.h
> @@ -0,0 +1,1030 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2021 HiSilicon Limited.
> + * Copyright(c) 2021 Intel Corporation.
> + * Copyright(c) 2021 Marvell International Ltd.

If the group of DMA device hardware vendors doesn't oppose, I would appreciate it if my contribution to the definition of the DMA device API were recognized in this file:

+ * Copyright(c) 2021 SmartShare Systems.

> + */
> +

Also remember the other contributors in the other files, where appropriate.

-Morten
  
Morten Brørup July 12, 2021, 7:40 a.m. UTC | #5
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jerin Jacob
> 
> On Sun, Jul 11, 2021 at 3:12 PM fengchengwen <fengchengwen@huawei.com>
> wrote:
> >
> > Note:
> > 1) This patch keeps the dmadev <> vchan layering; I think the vchan can be
> >    cleanly separated, conceptually, from the hw-channel.
> 
> I would like to keep it as channel instead of virtual channel, as the latter
> is implementation-specific.
> No strong opinion on this, though. @Richardson, Bruce @Morten Brørup, thoughts?

Consider using "context" or "ctx" instead. It might help avoid it being mistaken for a DMA hardware feature.

No strong opinion, though. Whatever fits nicely in the documentation is a good choice.

A small anecdote: We once had to name an important parameter in our product, and it came down to the choice of two perfectly good names. We chose the shorter name, only because it fit better into the GUI.

> 
> > 2) I could not understand struct dpi_dma_queue_ctx_s, so in this patch I
> >    define rte_dma_slave_port_parameters with reference to the Kunpeng DMA
> >    implementation.

One more thing:

The DPDK project is aiming to use inclusive code language. (Ref: https://www.dpdk.org/blog/2020/07/22/dpdk-governing-board-update-july-2020/)

Please replace the word "slave" with something politically correct.

-Morten
  
Jerin Jacob July 12, 2021, 9:59 a.m. UTC | #6
On Sun, Jul 11, 2021 at 2:59 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
>
> This patch introduces 'dmadevice', a generic type of DMA
> device.
>
> The dmadev library APIs expose generic operations that enable
> configuration of, and I/O with, DMA devices.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a fill operation onto the virtual DMA channel.
> + *
> + * This queues up a fill operation to be performed by hardware, but does not
> + * trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param pattern
> + *   The pattern to populate the destination buffer with.
> + * @param dst
> + *   The address of the destination buffer.
> + * @param length
> + *   The length of the destination buffer.
> + * @param flags
> + *   An flags for this operation.
> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.

fill job

> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> +               rte_iova_t dst, uint32_t length, uint64_t flags)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> +       if (vchan >= dev->data->dev_conf.max_vchans) {
> +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +               return -EINVAL;
> +       }
> +#endif
> +       return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
> +}
> +

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of operations that have been successfully completed.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param nb_cpls
> + *   The maximum number of completed operations that can be processed.
> + * @param[out] last_idx
> + *   The last completed operation's index.
> + *   If not required, NULL can be passed in.

This means the driver will be tracking the last index.

Does that mean the application needs to call this API periodically to
consume the completion slots?
I.e. up to 64K (UINT16_MAX) outstanding jobs are possible. If the
application fails to call this before more than 64K jobs are outstanding,
then the subsequent enqueue will fail.

If so, we need to document this.

One of the concerns with keeping UINT16_MAX as the limit is that the
completion memory will never be in cache.
On the other hand, if we make this size programmable, it may introduce
complexity in the application.

Thoughts?
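
To make the concern concrete, the application has to keep a polling loop
like the below running (sketch; MAX_BURST is an application-chosen value):

	uint16_t last_idx, nb_done;
	bool has_error;

	nb_done = rte_dmadev_completed(dev_id, vchan, MAX_BURST,
				       &last_idx, &has_error);
	/* if this isn't called often enough, enqueues eventually fail */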


> + * @param[out] has_error
> + *   Indicates if there are transfer error.
> + *   If not required, NULL can be passed in.
> + *
> + * @return
> + *   The number of operations that successfully completed.
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_dmadev_completed(uint16_t dev_id, uint16_t vchan, const uint16_t nb_cpls,
> +                    uint16_t *last_idx, bool *has_error)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +       uint16_t idx;
> +       bool err;
> +
> +#ifdef RTE_DMADEV_DEBUG
> +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +       RTE_FUNC_PTR_OR_ERR_RET(*dev->completed, -ENOTSUP);
> +       if (vchan >= dev->data->dev_conf.max_vchans) {
> +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +               return -EINVAL;
> +       }
> +       if (nb_cpls == 0) {
> +               RTE_DMADEV_LOG(ERR, "Invalid nb_cpls\n");
> +               return -EINVAL;
> +       }
> +#endif
> +
> +       /* Ensure the pointer values are non-null to simplify drivers.
> +        * In most cases these should be compile time evaluated, since this is
> +        * an inline function.
> +        * - If NULL is explicitly passed as parameter, then compiler knows the
> +        *   value is NULL
> +        * - If address of local variable is passed as parameter, then compiler
> +        *   can know it's non-NULL.
> +        */
> +       if (last_idx == NULL)
> +               last_idx = &idx;
> +       if (has_error == NULL)
> +               has_error = &err;
> +
> +       *has_error = false;
> +       return (*dev->completed)(dev, vchan, nb_cpls, last_idx, has_error);
> +}
  
Bruce Richardson July 12, 2021, 12:05 p.m. UTC | #7
On Sun, Jul 11, 2021 at 05:25:56PM +0800, Chengwen Feng wrote:
> This patch introduce 'dmadevice' which is a generic type of DMA
> device.
> 
> The APIs of dmadev library exposes some generic operations which can
> enable configuration and I/O with the DMA devices.
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>

Thanks for this V2.
Some initial (mostly minor) comments on the meson.build and dmadev .c file
below. I'll review the headers in a separate email.

/Bruce

> ---
>  MAINTAINERS                  |    4 +
>  config/rte_config.h          |    3 +
>  lib/dmadev/meson.build       |    6 +
>  lib/dmadev/rte_dmadev.c      |  560 +++++++++++++++++++++++
>  lib/dmadev/rte_dmadev.h      | 1030 ++++++++++++++++++++++++++++++++++++++++++
>  lib/dmadev/rte_dmadev_core.h |  159 +++++++
>  lib/dmadev/rte_dmadev_pmd.h  |   72 +++
>  lib/dmadev/version.map       |   40 ++
>  lib/meson.build              |    1 +
>  9 files changed, 1875 insertions(+)
>  create mode 100644 lib/dmadev/meson.build
>  create mode 100644 lib/dmadev/rte_dmadev.c
>  create mode 100644 lib/dmadev/rte_dmadev.h
>  create mode 100644 lib/dmadev/rte_dmadev_core.h
>  create mode 100644 lib/dmadev/rte_dmadev_pmd.h
>  create mode 100644 lib/dmadev/version.map
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4347555..0595239 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -496,6 +496,10 @@ F: drivers/raw/skeleton/
>  F: app/test/test_rawdev.c
>  F: doc/guides/prog_guide/rawdev.rst
>  
> +DMA device API - EXPERIMENTAL
> +M: Chengwen Feng <fengchengwen@huawei.com>
> +F: lib/dmadev/
> +
>  
>  Memory Pool Drivers
>  -------------------
> diff --git a/config/rte_config.h b/config/rte_config.h
> index 590903c..331a431 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -81,6 +81,9 @@
>  /* rawdev defines */
>  #define RTE_RAWDEV_MAX_DEVS 64
>  
> +/* dmadev defines */
> +#define RTE_DMADEV_MAX_DEVS 64
> +
>  /* ip_fragmentation defines */
>  #define RTE_LIBRTE_IP_FRAG_MAX_FRAG 4
>  #undef RTE_LIBRTE_IP_FRAG_TBL_STAT
> diff --git a/lib/dmadev/meson.build b/lib/dmadev/meson.build
> new file mode 100644
> index 0000000..c918dae
> --- /dev/null
> +++ b/lib/dmadev/meson.build
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2021 HiSilicon Limited.
> +
> +sources = files('rte_dmadev.c')
> +headers = files('rte_dmadev.h', 'rte_dmadev_pmd.h')

If rte_dmadev_pmd.h is only for PMD use, then it should be in
"driver_sdk_headers".

> +indirect_headers += files('rte_dmadev_core.h')
> diff --git a/lib/dmadev/rte_dmadev.c b/lib/dmadev/rte_dmadev.c
> new file mode 100644
> index 0000000..8a29abb
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev.c
> @@ -0,0 +1,560 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2021 HiSilicon Limited.
> + * Copyright(c) 2021 Intel Corporation.
> + */
> +
> +#include <ctype.h>
> +#include <inttypes.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include <rte_debug.h>
> +#include <rte_dev.h>
> +#include <rte_eal.h>
> +#include <rte_errno.h>
> +#include <rte_lcore.h>
> +#include <rte_log.h>
> +#include <rte_memory.h>
> +#include <rte_memzone.h>
> +#include <rte_malloc.h>
> +#include <rte_string_fns.h>
> +
> +#include "rte_dmadev.h"
> +#include "rte_dmadev_pmd.h"
> +
> +RTE_LOG_REGISTER(rte_dmadev_logtype, lib.dmadev, INFO);
> +
> +struct rte_dmadev rte_dmadevices[RTE_DMADEV_MAX_DEVS];
> +
> +static const char *MZ_RTE_DMADEV_DATA = "rte_dmadev_data";
> +/* Shared memory between primary and secondary processes. */
> +static struct {
> +	struct rte_dmadev_data data[RTE_DMADEV_MAX_DEVS];
> +} *dmadev_shared_data;
> +
> +static int
> +dmadev_check_name(const char *name)
> +{
> +	size_t name_len;
> +
> +	if (name == NULL) {
> +		RTE_DMADEV_LOG(ERR, "Name can't be NULL\n");
> +		return -EINVAL;
> +	}
> +
> +	name_len = strnlen(name, RTE_DMADEV_NAME_MAX_LEN);
> +	if (name_len == 0) {
> +		RTE_DMADEV_LOG(ERR, "Zero length DMA device name\n");
> +		return -EINVAL;
> +	}
> +	if (name_len >= RTE_DMADEV_NAME_MAX_LEN) {
> +		RTE_DMADEV_LOG(ERR, "DMA device name is too long\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static uint16_t
> +dmadev_find_free_dev(void)
> +{
> +	uint16_t i;
> +
> +	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
> +		if (dmadev_shared_data->data[i].dev_name[0] == '\0') {
> +			RTE_ASSERT(rte_dmadevices[i].attached == 0);
> +			return i;
> +		}
> +	}
> +
> +	return RTE_DMADEV_MAX_DEVS;
> +}
> +
> +static struct rte_dmadev*
> +dmadev_allocated(const char *name)

The name implies a boolean lookup for whether a particular dmadev has been
allocated or not. Since this returns a pointer, I think a name like
"dmadev_find" or "dmadev_get" would be more appropriate.

> +{
> +	uint16_t i;
> +
> +	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
> +		if ((rte_dmadevices[i].attached == 1) &&
> +		    (!strcmp(name, rte_dmadevices[i].data->dev_name)))
> +			return &rte_dmadevices[i];
> +	}
> +
> +	return NULL;
> +}
> +
> +static int
> +dmadev_shared_data_prepare(void)
> +{
> +	const struct rte_memzone *mz;
> +
> +	if (dmadev_shared_data == NULL) {
> +		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +			/* Allocate port data and ownership shared memory. */
> +			mz = rte_memzone_reserve(MZ_RTE_DMADEV_DATA,
> +					 sizeof(*dmadev_shared_data),
> +					 rte_socket_id(), 0);
> +		} else {
> +			mz = rte_memzone_lookup(MZ_RTE_DMADEV_DATA);
> +		}
> +		if (mz == NULL)
> +			return -ENOMEM;
> +
> +		dmadev_shared_data = mz->addr;
> +		if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +			memset(dmadev_shared_data->data, 0,
> +			       sizeof(dmadev_shared_data->data));
> +	}
> +
> +	return 0;
> +}
> +
> +static struct rte_dmadev *
> +dmadev_allocate(const char *name)
> +{
> +	struct rte_dmadev *dev;
> +	uint16_t dev_id;
> +
> +	dev = dmadev_allocated(name);
> +	if (dev != NULL) {
> +		RTE_DMADEV_LOG(ERR, "DMA device already allocated\n");
> +		return NULL;
> +	}
> +
> +	dev_id = dmadev_find_free_dev();
> +	if (dev_id == RTE_DMADEV_MAX_DEVS) {
> +		RTE_DMADEV_LOG(ERR, "Reached maximum number of DMA devices\n");
> +		return NULL;
> +	}
> +
> +	if (dmadev_shared_data_prepare() != 0) {
> +		RTE_DMADEV_LOG(ERR, "Cannot allocate DMA shared data\n");
> +		return NULL;
> +	}
> +
> +	dev = &rte_dmadevices[dev_id];
> +	dev->data = &dmadev_shared_data->data[dev_id];
> +	dev->data->dev_id = dev_id;
> +	strlcpy(dev->data->dev_name, name, sizeof(dev->data->dev_name));
> +
> +	return dev;
> +}
> +
> +static struct rte_dmadev *
> +dmadev_attach_secondary(const char *name)
> +{
> +	struct rte_dmadev *dev;
> +	uint16_t i;
> +
> +	if (dmadev_shared_data_prepare() != 0) {
> +		RTE_DMADEV_LOG(ERR, "Cannot allocate DMA shared data\n");
> +		return NULL;
> +	}
> +
> +	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
> +		if (!strcmp(dmadev_shared_data->data[i].dev_name, name))
> +			break;
> +	}
> +	if (i == RTE_DMADEV_MAX_DEVS) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %s is not driven by the primary process\n",
> +			name);
> +		return NULL;
> +	}
> +
> +	dev = &rte_dmadevices[i];
> +	dev->data = &dmadev_shared_data->data[i];
> +	RTE_ASSERT(dev->data->dev_id == i);
> +
> +	return dev;
> +}
> +
> +struct rte_dmadev *
> +rte_dmadev_pmd_allocate(const char *name)
> +{
> +	struct rte_dmadev *dev;
> +
> +	if (dmadev_check_name(name) != 0)
> +		return NULL;
> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +		dev = dmadev_allocate(name);
> +	else
> +		dev = dmadev_attach_secondary(name);
> +
> +	if (dev == NULL)
> +		return NULL;
> +	dev->attached = 1;
> +
> +	return dev;
> +}
> +
> +int
> +rte_dmadev_pmd_release(struct rte_dmadev *dev)
> +{
> +	if (dev == NULL)
> +		return -EINVAL;
> +
> +	if (dev->attached == 0)
> +		return 0;
> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		rte_free(dev->data->dev_private);
> +		memset(dev->data, 0, sizeof(struct rte_dmadev_data));
> +	}
> +
> +	memset(dev, 0, sizeof(struct rte_dmadev));
> +	dev->attached = 0;
> +
> +	return 0;
> +}
> +
> +struct rte_dmadev *
> +rte_dmadev_get_device_by_name(const char *name)
> +{
> +	if (dmadev_check_name(name) != 0)
> +		return NULL;
> +	return dmadev_allocated(name);
> +}
> +
> +bool
> +rte_dmadev_is_valid_dev(uint16_t dev_id)
> +{
> +	if (dev_id >= RTE_DMADEV_MAX_DEVS ||
> +	    rte_dmadevices[dev_id].attached == 0)
> +		return false;
> +	return true;
> +}
> +
> +uint16_t
> +rte_dmadev_count(void)
> +{
> +	uint16_t count = 0;
> +	uint16_t i;
> +
> +	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
> +		if (rte_dmadevices[i].attached == 1)
> +			count++;
> +	}
> +
> +	return count;
> +}
> +
> +int
> +rte_dmadev_info_get(uint16_t dev_id, struct rte_dmadev_info *dev_info)
> +{
> +	struct rte_dmadev *dev;
> +	int ret;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(dev_info, -EINVAL);
> +
> +	dev = &rte_dmadevices[dev_id];
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_info_get, -ENOTSUP);
> +	memset(dev_info, 0, sizeof(struct rte_dmadev_info));
> +	ret = (*dev->dev_ops->dev_info_get)(dev, dev_info);
> +	if (ret != 0)
> +		return ret;
> +
> +	dev_info->device = dev->device;
> +
> +	return 0;
> +}

Should the info_get function (and the related info structure) not include
the parameters passed into the configure function? That way, the user
can query a previously set-up configuration. This should be done at the
dmadev level, rather than the driver level, since I see the parameters are
already being saved in configure below.

Also, for ABI purposes, I would strongly suggest passing "sizeof(dev_info)"
to the driver in the "dev_info_get" call. When dev_info changes, we can
version rte_dmadev_info_get, but can't version the functions that it calls
in turn. When we add a new field to the struct, the driver functions that
choose to use that new field can check the size of the struct passed to
determine if it's safe to write that new field or not. [So long as the
field is added at the end, driver functions not updated for the new field
need no changes.]
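
Rough illustration of the mechanism (sketch only; 'new_field' and the
driver function name are hypothetical):

	/* dmadev layer passes its compile-time view of the struct size */
	ret = (*dev->dev_ops->dev_info_get)(dev, dev_info,
					    sizeof(struct rte_dmadev_info));

	/* driver side, after a hypothetical 'new_field' is appended */
	static int
	xxx_info_get(struct rte_dmadev *dev, struct rte_dmadev_info *info,
		     uint32_t info_sz)
	{
		info->max_vchans = 1;
		if (info_sz >= offsetof(struct rte_dmadev_info, new_field) +
				sizeof(info->new_field))
			info->new_field = 0; /* safe to write */
		return 0;
	}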

> +
> +int
> +rte_dmadev_configure(uint16_t dev_id, const struct rte_dmadev_conf *dev_conf)
> +{
> +	struct rte_dmadev_info info;
> +	struct rte_dmadev *dev;
> +	int ret;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(dev_conf, -EINVAL);
> +	dev = &rte_dmadevices[dev_id];
> +
> +	ret = rte_dmadev_info_get(dev_id, &info);
> +	if (ret != 0) {
> +		RTE_DMADEV_LOG(ERR, "Device %u get device info fail\n", dev_id);
> +		return -EINVAL;
> +	}
> +	if (dev_conf->max_vchans > info.max_vchans) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u configure too many vchans\n", dev_id);

We allow up to 100 characters per line for DPDK code, so these don't need
to be wrapped so aggressively.

> +		return -EINVAL;
> +	}
> +	if (dev_conf->enable_mt_vchan &&
> +	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MT_VCHAN)) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u don't support MT-safe vchan\n", dev_id);
> +		return -EINVAL;
> +	}
> +	if (dev_conf->enable_mt_multi_vchan &&
> +	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN)) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u don't support MT-safe multiple vchan\n",
> +			dev_id);
> +		return -EINVAL;
> +	}
> +
> +	if (dev->data->dev_started != 0) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u must be stopped to allow configuration\n",
> +			dev_id);
> +		return -EBUSY;
> +	}
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_configure, -ENOTSUP);
> +	ret = (*dev->dev_ops->dev_configure)(dev, dev_conf);
> +	if (ret == 0)
> +		memcpy(&dev->data->dev_conf, dev_conf, sizeof(*dev_conf));
> +
> +	return ret;
> +}
> +
> +int
> +rte_dmadev_start(uint16_t dev_id)
> +{
> +	struct rte_dmadev *dev;
> +	int ret;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	dev = &rte_dmadevices[dev_id];
> +
> +	if (dev->data->dev_started != 0) {
> +		RTE_DMADEV_LOG(ERR, "Device %u already started\n", dev_id);

Maybe make this a warning rather than error.

> +		return 0;
> +	}
> +
> +	if (dev->dev_ops->dev_start == NULL)
> +		goto mark_started;
> +
> +	ret = (*dev->dev_ops->dev_start)(dev);
> +	if (ret != 0)
> +		return ret;
> +
> +mark_started:
> +	dev->data->dev_started = 1;
> +	return 0;
> +}
> +
> +int
> +rte_dmadev_stop(uint16_t dev_id)
> +{
> +	struct rte_dmadev *dev;
> +	int ret;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	dev = &rte_dmadevices[dev_id];
> +
> +	if (dev->data->dev_started == 0) {
> +		RTE_DMADEV_LOG(ERR, "Device %u already stopped\n", dev_id);

As above, suggest just warning rather than error.

> +		return 0;
> +	}
> +
> +	if (dev->dev_ops->dev_stop == NULL)
> +		goto mark_stopped;
> +
> +	ret = (*dev->dev_ops->dev_stop)(dev);
> +	if (ret != 0)
> +		return ret;
> +
> +mark_stopped:
> +	dev->data->dev_started = 0;
> +	return 0;
> +}
> +
> +int
> +rte_dmadev_close(uint16_t dev_id)
> +{
> +	struct rte_dmadev *dev;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	dev = &rte_dmadevices[dev_id];
> +
> +	/* Device must be stopped before it can be closed */
> +	if (dev->data->dev_started == 1) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u must be stopped before closing\n", dev_id);
> +		return -EBUSY;
> +	}
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_close, -ENOTSUP);
> +	return (*dev->dev_ops->dev_close)(dev);
> +}
> +
> +int
> +rte_dmadev_reset(uint16_t dev_id)
> +{
> +	struct rte_dmadev *dev;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	dev = &rte_dmadevices[dev_id];
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_reset, -ENOTSUP);
> +	/* Reset is not dependent on state of the device */
> +	return (*dev->dev_ops->dev_reset)(dev);
> +}

I would tend to agree with the query as to whether this is needed or not.
Can we perhaps remove it for now, and add it back later if it does prove to
be needed? The less code to review and work with for the first version, the
better IMHO. :-)

> +
> +int
> +rte_dmadev_vchan_setup(uint16_t dev_id,
> +		       const struct rte_dmadev_vchan_conf *conf)
> +{
> +	struct rte_dmadev_info info;
> +	struct rte_dmadev *dev;
> +	int ret;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(conf, -EINVAL);

This is confusing, because you are actually doing a parameter check using a
macro named for checking a function. Better to explicitly just check conf
for null.
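
i.e. something like:

	if (conf == NULL)
		return -EINVAL;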

> +
> +	dev = &rte_dmadevices[dev_id];
> +
> +	ret = rte_dmadev_info_get(dev_id, &info);
> +	if (ret != 0) {
> +		RTE_DMADEV_LOG(ERR, "Device %u get device info fail\n", dev_id);
> +		return -EINVAL;
> +	}
> +	if (conf->direction == 0 ||
> +	    conf->direction & ~RTE_DMA_TRANSFER_DIR_ALL) {
> +		RTE_DMADEV_LOG(ERR, "Device %u direction invalid!\n", dev_id);
> +		return -EINVAL;
> +	}

I wonder, should we allow direction == 0 to mean the same as all bits set,
or all supported bits set?

> +	if (conf->direction & RTE_DMA_MEM_TO_MEM &&
> +	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MEM_TO_MEM)) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u don't support mem2mem transfer\n", dev_id);
> +		return -EINVAL;
> +	}
> +	if (conf->direction & RTE_DMA_MEM_TO_DEV &&
> +	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MEM_TO_DEV)) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u don't support mem2dev transfer\n", dev_id);
> +		return -EINVAL;
> +	}
> +	if (conf->direction & RTE_DMA_DEV_TO_MEM &&
> +	    !(info.dev_capa & RTE_DMA_DEV_CAPA_DEV_TO_MEM)) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u don't support dev2mem transfer\n", dev_id);
> +		return -EINVAL;
> +	}
> +	if (conf->direction & RTE_DMA_DEV_TO_DEV &&
> +	    !(info.dev_capa & RTE_DMA_DEV_CAPA_DEV_TO_DEV)) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u don't support dev2dev transfer\n", dev_id);
> +		return -EINVAL;
> +	}
> +	if (conf->nb_desc < info.min_desc || conf->nb_desc > info.max_desc) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u number of descriptors invalid\n", dev_id);
> +		return -EINVAL;
> +	}
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vchan_setup, -ENOTSUP);
> +	return (*dev->dev_ops->vchan_setup)(dev, conf);
> +}
> +
> +int
> +rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan)
> +{
> +	struct rte_dmadev *dev;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	dev = &rte_dmadevices[dev_id];
> +
> +	if (vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u vchan %u out of range\n", dev_id, vchan);
> +		return -EINVAL;
> +	}
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vchan_release, -ENOTSUP);
> +	return (*dev->dev_ops->vchan_release)(dev, vchan);
> +}
> +
> +int
> +rte_dmadev_stats_get(uint16_t dev_id, int vchan, struct rte_dmadev_stats *stats)
> +{
> +	struct rte_dmadev *dev;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(stats, -EINVAL);
> +
> +	dev = &rte_dmadevices[dev_id];
> +
> +	if (vchan < -1 || vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u vchan %u out of range\n", dev_id, vchan);
> +		return -EINVAL;
> +	}
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_get, -ENOTSUP);
> +	return (*dev->dev_ops->stats_get)(dev, vchan, stats);
> +}
> +
> +int
> +rte_dmadev_stats_reset(uint16_t dev_id, int vchan)
> +{
> +	struct rte_dmadev *dev;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	dev = &rte_dmadevices[dev_id];
> +
> +	if (vchan < -1 || vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR,
> +			"Device %u vchan %u out of range\n", dev_id, vchan);
> +		return -EINVAL;
> +	}
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_reset, -ENOTSUP);
> +	return (*dev->dev_ops->stats_reset)(dev, vchan);
> +}
> +
> +int
> +rte_dmadev_dump(uint16_t dev_id, FILE *f)
> +{
> +	struct rte_dmadev_info info;
> +	struct rte_dmadev *dev;
> +	int ret;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(f, -EINVAL);
> +
> +	ret = rte_dmadev_info_get(dev_id, &info);
> +	if (ret != 0) {
> +		RTE_DMADEV_LOG(ERR, "Device %u get device info fail\n", dev_id);
> +		return -EINVAL;
> +	}
> +
> +	dev = &rte_dmadevices[dev_id];
> +
> +	fprintf(f, "DMA Dev %u, '%s' [%s]\n",
> +		dev->data->dev_id,
> +		dev->data->dev_name,
> +		dev->data->dev_started ? "started" : "stopped");
> +	fprintf(f, "  dev_capa: 0x%" PRIx64 "\n", info.dev_capa);
> +	fprintf(f, "  max_vchans_supported: %u\n", info.max_vchans);
> +	fprintf(f, "  max_vchans_configured: %u\n", info.nb_vchans);
> +	fprintf(f, "  MT-safe-configured: vchans: %u multi-vchans: %u\n",
> +		dev->data->dev_conf.enable_mt_vchan,
> +		dev->data->dev_conf.enable_mt_multi_vchan);
> +
> +	if (dev->dev_ops->dev_dump != NULL)
> +		return (*dev->dev_ops->dev_dump)(dev, f);
> +
> +	return 0;
> +}
> +
> +int
> +rte_dmadev_selftest(uint16_t dev_id)
> +{
> +	struct rte_dmadev *dev;
> +
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	dev = &rte_dmadevices[dev_id];
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_selftest, -ENOTSUP);
> +	return (*dev->dev_ops->dev_selftest)(dev_id);
> +}
  
Bruce Richardson July 12, 2021, 1:32 p.m. UTC | #8
On Mon, Jul 12, 2021 at 03:29:27PM +0530, Jerin Jacob wrote:
> On Sun, Jul 11, 2021 at 2:59 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
> >
> > This patch introduce 'dmadevice' which is a generic type of DMA
> > device.
> >
> > The APIs of dmadev library exposes some generic operations which can
> > enable configuration and I/O with the DMA devices.
> >
> > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > ---
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue a fill operation onto the virtual DMA channel.
> > + *
> > + * This queues up a fill operation to be performed by hardware, but does not
> > + * trigger hardware to begin that operation.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + * @param pattern
> > + *   The pattern to populate the destination buffer with.
> > + * @param dst
> > + *   The address of the destination buffer.
> > + * @param length
> > + *   The length of the destination buffer.
> > + * @param flags
> > + *   An flags for this operation.
> > + *
> > + * @return
> > + *   - 0..UINT16_MAX: index of enqueued copy job.
> 
> fill job
> 
> > + *   - <0: Error code returned by the driver copy function.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> > +               rte_iova_t dst, uint32_t length, uint64_t flags)
> > +{
> > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +#ifdef RTE_DMADEV_DEBUG
> > +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > +       RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> > +       if (vchan >= dev->data->dev_conf.max_vchans) {
> > +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > +               return -EINVAL;
> > +       }
> > +#endif
> > +       return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
> > +}
> > +
> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Returns the number of operations that have been successfully completed.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + * @param nb_cpls
> > + *   The maximum number of completed operations that can be processed.
> > + * @param[out] last_idx
> > + *   The last completed operation's index.
> > + *   If not required, NULL can be passed in.
> 
> This means the driver will be tracking the last index.
> 

Driver will be doing this anyway, no, since it needs to ensure we don't
wrap around?

> Is that mean, the application needs to call this API periodically to
> consume the completion slot.
> I.e up to 64K (UINT16_MAX)  outstanding jobs are possible. If the
> application fails to call this
> >64K outstand job then the subsequence enqueue will fail.

Well, given that there will be a regular enqueue ring which will probably
be <= 64k in size, the completion function will need to be called frequently
anyway. I don't think we need to document this restriction as it's fairly
understood that you can't go beyond the size of the ring without cleanup.

> 
> If so, we need to document this.
> 
> One of the concerns of keeping UINT16_MAX as the limit is the
> completion memory will always not in cache.
> On the other hand, if we make this size programmable. it may introduce
> complexity in the application.
> 
> Thoughts?

The reason for using powers-of-2 sizes, e.g. 0 .. UINT16_MAX, is that the
ring can be any other power-of-2 size and we can index it just by masking.
In the sample app for dmadev, I expect the ring size used to be set the
same as the dmadev enqueue ring size, for simplicity.
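
For example, the app-side tracking could look like (sketch; job_meta and
my_meta are application-defined):

	#define APP_RING_SZ 1024 /* any power-of-2 up to 64K */
	static struct job_meta meta[APP_RING_SZ];

	int idx = rte_dmadev_copy(dev_id, vchan, src, dst, len, flags);
	if (idx >= 0)
		meta[idx & (APP_RING_SZ - 1)] = my_meta;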

In fact, I was thinking that in later versions we may be able to include
some macros to help make this whole process easier, of converting indexes
to arbitrary data structures. [The reason for using macros is so that the
actual rings we are indexing can be of user-defined type, rather than just
a ring of pointers].

/Bruce
  
Bruce Richardson July 12, 2021, 3:50 p.m. UTC | #9
On Sun, Jul 11, 2021 at 05:25:56PM +0800, Chengwen Feng wrote:
> This patch introduce 'dmadevice' which is a generic type of DMA
> device.
> 
> The APIs of dmadev library exposes some generic operations which can
> enable configuration and I/O with the DMA devices.
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>

Hi again,

some further review comments inline.

/Bruce

> ---
>  MAINTAINERS                  |    4 +
>  config/rte_config.h          |    3 +
>  lib/dmadev/meson.build       |    6 +
>  lib/dmadev/rte_dmadev.c      |  560 +++++++++++++++++++++++
>  lib/dmadev/rte_dmadev.h      | 1030 ++++++++++++++++++++++++++++++++++++++++++
>  lib/dmadev/rte_dmadev_core.h |  159 +++++++
>  lib/dmadev/rte_dmadev_pmd.h  |   72 +++
>  lib/dmadev/version.map       |   40 ++
>  lib/meson.build              |    1 +

<snip>

> diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
> new file mode 100644
> index 0000000..8779512
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev.h
> @@ -0,0 +1,1030 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2021 HiSilicon Limited.
> + * Copyright(c) 2021 Intel Corporation.
> + * Copyright(c) 2021 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_DMADEV_H_
> +#define _RTE_DMADEV_H_
> +
> +/**
> + * @file rte_dmadev.h
> + *
> + * RTE DMA (Direct Memory Access) device APIs.
> + *
> + * The DMA framework is built on the following model:
> + *
> + *     ---------------   ---------------       ---------------
> + *     | virtual DMA |   | virtual DMA |       | virtual DMA |
> + *     | channel     |   | channel     |       | channel     |
> + *     ---------------   ---------------       ---------------
> + *            |                |                      |
> + *            ------------------                      |
> + *                     |                              |
> + *               ------------                    ------------
> + *               |  dmadev  |                    |  dmadev  |
> + *               ------------                    ------------
> + *                     |                              |
> + *            ------------------               ------------------
> + *            | HW-DMA-channel |               | HW-DMA-channel |
> + *            ------------------               ------------------
> + *                     |                              |
> + *                     --------------------------------
> + *                                     |
> + *                           ---------------------
> + *                           | HW-DMA-Controller |
> + *                           ---------------------
> + *
> + * The DMA controller could have multilpe HW-DMA-channels (aka. HW-DMA-queues),
> + * each HW-DMA-channel should be represented by a dmadev.
> + *
> + * The dmadev could create multiple virtual DMA channel, each virtual DMA
> + * channel represents a different transfer context. The DMA operation request
> + * must be submitted to the virtual DMA channel.
> + * E.G. Application could create virtual DMA channel 0 for mem-to-mem transfer
> + *      scenario, and create virtual DMA channel 1 for mem-to-dev transfer
> + *      scenario.
> + *
> + * The dmadev are dynamically allocated by rte_dmadev_pmd_allocate() during the
> + * PCI/SoC device probing phase performed at EAL initialization time. And could
> + * be released by rte_dmadev_pmd_release() during the PCI/SoC device removing
> + * phase.
> + *
> + * We use 'uint16_t dev_id' as the device identifier of a dmadev, and
> + * 'uint16_t vchan' as the virtual DMA channel identifier in one dmadev.
> + *
> + * The functions exported by the dmadev API to setup a device designated by its
> + * device identifier must be invoked in the following order:
> + *     - rte_dmadev_configure()
> + *     - rte_dmadev_vchan_setup()
> + *     - rte_dmadev_start()
> + *
> + * Then, the application can invoke dataplane APIs to process jobs.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_dmadev_configure()), it must call rte_dmadev_stop() first to stop the
> + * device and then do the reconfiguration before calling rte_dmadev_start()
> + * again. The dataplane APIs should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a dmadev by invoking the
> + * rte_dmadev_close() function.
> + *
> + * The dataplane APIs include two parts:
> + *   a) The first part is the submission of operation requests:
> + *        - rte_dmadev_copy()
> + *        - rte_dmadev_copy_sg() - scatter-gather form of copy
> + *        - rte_dmadev_fill()
> + *        - rte_dmadev_fill_sg() - scatter-gather form of fill
> + *        - rte_dmadev_perform() - issue doorbell to hardware
> + *      These APIs could work with different virtual DMA channels which have
> + *      different contexts.
> + *      The first four APIs are used to submit the operation request to the
> + *      virtual DMA channel, if the submission is successful, a uint16_t
> + *      ring_idx is returned, otherwise a negative number is returned.
> + *   b) The second part is to obtain the result of requests:
> + *        - rte_dmadev_completed()
> + *            - return the number of operation requests completed successfully.
> + *        - rte_dmadev_completed_fails()
> + *            - return the number of operation requests failed to complete.

Please rename this to "completed_status" to allow the return of information
other than just errors. As I suggested before, I think this should also be
usable as a slower version of "completed" even in the case where there are
no errors, in that it returns status information for each and every job
rather than just returning as soon as it hits a failure.
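
A possible shape for it (sketch only; the per-job status codes would still
need to be defined):

	uint16_t
	rte_dmadev_completed_status(uint16_t dev_id, uint16_t vchan,
				    uint16_t nb_status, uint16_t *last_idx,
				    uint32_t *status);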

> + *
> + * About the ring_idx which rte_dmadev_copy/copy_sg/fill/fill_sg() returned,
> + * the rules are as follows:
> + *   a) ring_idx for each virtual DMA channel are independent.
> + *   b) For a virtual DMA channel, the ring_idx is monotonically incremented,
> + *      when it reach UINT16_MAX, it wraps back to zero.

Based on other feedback, I suggest we put in the detail here that: "This
index can be used by applications to track per-job metadata in an
application-defined circular ring, where the ring is a power-of-2 size, and
the indexes are masked appropriately."

> + *   c) The initial ring_idx of a virtual DMA channel is zero, after the device
> + *      is stopped or reset, the ring_idx needs to be reset to zero.
> + *   Example:
> + *      step-1: start one dmadev
> + *      step-2: enqueue a copy operation, the ring_idx return is 0
> + *      step-3: enqueue a copy operation again, the ring_idx return is 1
> + *      ...
> + *      step-101: stop the dmadev
> + *      step-102: start the dmadev
> + *      step-103: enqueue a copy operation, the cookie return is 0
> + *      ...
> + *      step-x+0: enqueue a fill operation, the ring_idx return is 65535
> + *      step-x+1: enqueue a copy operation, the ring_idx return is 0
> + *      ...
> + *
> + * By default, all the non-dataplane functions of the dmadev API exported by a
> + * PMD are lock-free functions which assume to not be invoked in parallel on
> + * different logical cores to work on the same target object.
> + *
> + * The dataplane functions of the dmadev API exported by a PMD can be MT-safe
> + * only when supported by the driver, generally, the driver will reports two
> + * capabilities:
> + *   a) Whether to support MT-safe for the submit/completion API of the same
> + *      virtual DMA channel.
> + *      E.G. one thread do submit operation, another thread do completion
> + *           operation.
> + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_VCHAN.
> + *      If driver don't support it, it's up to the application to guarantee
> + *      MT-safe.
> + *   b) Whether to support MT-safe for different virtual DMA channels.
> + *      E.G. one thread do operation on virtual DMA channel 0, another thread
> + *           do operation on virtual DMA channel 1.
> + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN.
> + *      If driver don't support it, it's up to the application to guarantee
> + *      MT-safe.
> + *
> + */

Just to check - do we have hardware that currently supports these
capabilities? For Intel HW, we will only support one virtual channel per
device without any MT-safety guarantees, so won't be setting either of
these flags. If any of these flags are unused in all planned drivers, we
should drop them from the spec until they prove necessary. Ideally,
everything in the dmadev definition should be testable, and features unused
by anyone obviously will be untested.

> +
> +#include <rte_common.h>
> +#include <rte_compat.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#define RTE_DMADEV_NAME_MAX_LEN	RTE_DEV_NAME_MAX_LEN
> +
> +extern int rte_dmadev_logtype;
> +
> +#define RTE_DMADEV_LOG(level, ...) \
> +	rte_log(RTE_LOG_ ## level, rte_dmadev_logtype, "" __VA_ARGS__)
> +
> +/* Macros to check for valid port */
> +#define RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, retval) do { \
> +	if (!rte_dmadev_is_valid_dev(dev_id)) { \
> +		RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \
> +		return retval; \
> +	} \
> +} while (0)
> +
> +#define RTE_DMADEV_VALID_DEV_ID_OR_RET(dev_id) do { \
> +	if (!rte_dmadev_is_valid_dev(dev_id)) { \
> +		RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \
> +		return; \
> +	} \
> +} while (0)
> +

Can we avoid using these in the inline functions in this file, and move
them to the _pmd.h which is for internal PMD use only? It would mean we
don't get logging from the key dataplane functions, but I would hope the
return values would provide enough info.

Alternatively, can we keep the logtype definition and first macro, and move
the other two to the _pmd.h file?

> +/**
> + * @internal
> + * Validate if the DMA device index is a valid attached DMA device.
> + *
> + * @param dev_id
> + *   DMA device index.
> + *
> + * @return
> + *   - If the device index is valid (true) or not (false).
> + */
> +__rte_internal
> +bool
> +rte_dmadev_is_valid_dev(uint16_t dev_id);
> +
> +/**
> + * rte_dma_sg - can hold scatter DMA operation request
> + */
> +struct rte_dma_sg {
> +	rte_iova_t src;
> +	rte_iova_t dst;
> +	uint32_t length;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the total number of DMA devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable DMA devices.
> + */
> +__rte_experimental
> +uint16_t
> +rte_dmadev_count(void);
> +
> +/**
> + * The capabilities of a DMA device
> + */
> +#define RTE_DMA_DEV_CAPA_MEM_TO_MEM	(1ull << 0)
> +/**< DMA device support mem-to-mem transfer.

Do we need this? Can we assume that any device appearing as a dmadev can
do mem-to-mem copies, and drop the capability for mem-to-mem and the
capability for copying?

> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_MEM_TO_DEV	(1ull << 1)
> +/**< DMA device support slave mode & mem-to-dev transfer.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_DEV_TO_MEM	(1ull << 2)
> +/**< DMA device support slave mode & dev-to-mem transfer.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_DEV_TO_DEV	(1ull << 3)
> +/**< DMA device support slave mode & dev-to-dev transfer.
> + *

Just to confirm, are there devices currently planned for dmadev that
support only a subset of these flags? Thinking particularly of the
dev-2-mem and mem-2-dev ones here - do any of the devices we are
considering not support using device memory?
[Again, just want to ensure we aren't adding too much stuff that we don't
need yet]

> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_OPS_COPY	(1ull << 4)
> +/**< DMA device support copy ops.
> + *

Suggest dropping this and making copy support a minimum requirement for any
dmadev.

> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_OPS_FILL	(1ull << 5)
> +/**< DMA device support fill ops.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_OPS_SG		(1ull << 6)
> +/**< DMA device support scatter-list ops.
> + * If device support ops_copy and ops_sg, it means supporting copy_sg ops.
> + * If device support ops_fill and ops_sg, it means supporting fill_sg ops.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_FENCE		(1ull << 7)
> +/**< DMA device support fence.
> + * If device support fence, then application could set a fence flags when
> + * enqueue operation by rte_dma_copy/copy_sg/fill/fill_sg.
> + * If a operation has a fence flags, it means the operation must be processed
> + * only after all previous operations are completed.
> + *

Is this needed? As I understand it, the Marvell driver doesn't require
fences so providing one is a no-op. Therefore, this flag is probably
unnecessary.

> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_SVA		(1ull << 8)
> +/**< DMA device support SVA which could use VA as DMA address.
> + * If device support SVA then application could pass any VA address like memory
> + * from rte_malloc(), rte_memzone(), malloc, stack memory.
> + * If device don't support SVA, then application should pass IOVA address which
> + * from rte_malloc(), rte_memzone().
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_MT_VCHAN	(1ull << 9)
> +/**< DMA device support MT-safe of a virtual DMA channel.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */
> +#define RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN	(1ull << 10)
> +/**< DMA device support MT-safe of different virtual DMA channels.
> + *
> + * @see struct rte_dmadev_info::dev_capa
> + */

As with comments above - let's check that these will actually be used
before we add them.

> +
> +/**
> + * A structure used to retrieve the contextual information of
> + * an DMA device
> + */
> +struct rte_dmadev_info {
> +	struct rte_device *device; /**< Generic Device information */
> +	uint64_t dev_capa; /**< Device capabilities (RTE_DMA_DEV_CAPA_) */
> +	/** Maximum number of virtual DMA channels supported */
> +	uint16_t max_vchans;
> +	/** Maximum allowed number of virtual DMA channel descriptors */
> +	uint16_t max_desc;
> +	/** Minimum allowed number of virtual DMA channel descriptors */
> +	uint16_t min_desc;
> +	uint16_t nb_vchans; /**< Number of virtual DMA channel configured */
> +};

Let's add the rte_dmadev_conf struct to this, so it can also return the
configuration settings.

> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Retrieve the contextual information of a DMA device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_dmadev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - =0: Success, driver updates the contextual information of the DMA device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +__rte_experimental
> +int
> +rte_dmadev_info_get(uint16_t dev_id, struct rte_dmadev_info *dev_info);
> +

Should have "const" on second param.

> +/**
> + * A structure used to configure a DMA device.
> + */
> +struct rte_dmadev_conf {
> +	/** Maximum number of virtual DMA channel to use.
> +	 * This value cannot be greater than the field 'max_vchans' of struct
> +	 * rte_dmadev_info which get from rte_dmadev_info_get().
> +	 */
> +	uint16_t max_vchans;
> +	/** Enable bit for MT-safe of a virtual DMA channel.
> +	 * This bit can be enabled only when the device supports
> +	 * RTE_DMA_DEV_CAPA_MT_VCHAN.
> +	 * @see RTE_DMA_DEV_CAPA_MT_VCHAN
> +	 */
> +	uint8_t enable_mt_vchan : 1;
> +	/** Enable bit for MT-safe of different virtual DMA channels.
> +	 * This bit can be enabled only when the device supports
> +	 * RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN.
> +	 * @see RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN
> +	 */
> +	uint8_t enable_mt_multi_vchan : 1;
> +	uint64_t reserved[2]; /**< Reserved for future fields */
> +};

Drop the reserved fields. ABI versioning is a better way to deal with
adding new fields.

> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Configure a DMA device.
> + *
> + * This function must be invoked first before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param dev_conf
> + *   The DMA device configuration structure encapsulated into rte_dmadev_conf
> + *   object.
> + *
> + * @return
> + *   - =0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_configure(uint16_t dev_id, const struct rte_dmadev_conf *dev_conf);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Start a DMA device.
> + *
> + * The device start step is the last one and consists of setting the DMA
> + * to start accepting jobs.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *   - =0: Success, device started.
> + *   - <0: Error code returned by the driver start function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_start(uint16_t dev_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Stop a DMA device.
> + *
> + * The device can be restarted with a call to rte_dmadev_start()
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *   - =0: Success, device stopped.
> + *   - <0: Error code returned by the driver stop function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_stop(uint16_t dev_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Close a DMA device.
> + *
> + * The device cannot be restarted after this call.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *  - =0: Successfully close device
> + *  - <0: Failure to close device
> + */
> +__rte_experimental
> +int
> +rte_dmadev_close(uint16_t dev_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Reset a DMA device.
> + *
> + * This is different from cycle of rte_dmadev_start->rte_dmadev_stop in the
> + * sense similar to hard or soft reset.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *   - =0: Successfully reset device.
> + *   - <0: Failure to reset device.
> + *   - (-ENOTSUP): If the device doesn't support this function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_reset(uint16_t dev_id);
> +
> +/**
> + * DMA transfer direction defines.
> + */
> +#define RTE_DMA_MEM_TO_MEM	(1ull << 0)
> +/**< DMA transfer direction - from memory to memory.
> + *
> + * @see struct rte_dmadev_vchan_conf::direction
> + */
> +#define RTE_DMA_MEM_TO_DEV	(1ull << 1)
> +/**< DMA transfer direction - slave mode & from memory to device.
> + * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
> + * this case, the ARM SoCs works in slave mode, it could initiate a DMA move
> + * request from ARM memory to x86 host memory.

For clarity, it would be good to specify in the scenario described which
memory is the "mem" and which is the "dev" (I assume SoC memory is "mem"
and x86 host memory is "dev"??)

> + *
> + * @see struct rte_dmadev_vchan_conf::direction
> + */
> +#define RTE_DMA_DEV_TO_MEM	(1ull << 2)
> +/**< DMA transfer direction - slave mode & from device to memory.
> + * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
> + * this case, the ARM SoCs works in slave mode, it could initiate a DMA move
> + * request from x86 host memory to ARM memory.
> + *
> + * @see struct rte_dmadev_vchan_conf::direction
> + */
> +#define RTE_DMA_DEV_TO_DEV	(1ull << 3)
> +/**< DMA transfer direction - slave mode & from device to device.
> + * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
> + * this case, the ARM SoCs works in slave mode, it could initiate a DMA move
> + * request from x86 host memory to another x86 host memory.
> + *
> + * @see struct rte_dmadev_vchan_conf::direction
> + */
> +#define RTE_DMA_TRANSFER_DIR_ALL	(RTE_DMA_MEM_TO_MEM | \
> +					 RTE_DMA_MEM_TO_DEV | \
> +					 RTE_DMA_DEV_TO_MEM | \
> +					 RTE_DMA_DEV_TO_DEV)
> +
> +/**
> + * enum rte_dma_slave_port_type - slave mode type defines
> + */
> +enum rte_dma_slave_port_type {
> +	/** The slave port is PCIE. */
> +	RTE_DMA_SLAVE_PORT_PCIE = 1,
> +};
> +

As previously mentioned, this needs to be updated to use other terms.
For some suggested alternatives see:
https://doc.dpdk.org/guides-21.05/contributing/coding_style.html#naming

> +/**
> + * A structure used to descript slave port parameters.
> + */
> +struct rte_dma_slave_port_parameters {
> +	enum rte_dma_slave_port_type port_type;
> +	union {
> +		/** For PCIE port */
> +		struct {
> +			/** The physical function number which to use */
> +			uint64_t pf_number : 6;
> +			/** Virtual function enable bit */
> +			uint64_t vf_enable : 1;
> +			/** The virtual function number which to use */
> +			uint64_t vf_number : 8;
> +			uint64_t pasid : 20;
> +			/** The attributes filed in TLP packet */
> +			uint64_t tlp_attr : 3;
> +		};
> +	};
> +};
> +
> +/**
> + * A structure used to configure a virtual DMA channel.
> + */
> +struct rte_dmadev_vchan_conf {
> +	uint8_t direction; /**< Set of supported transfer directions */
> +	/** Number of descriptor for the virtual DMA channel */
> +	uint16_t nb_desc;
> +	/** 1) Used to describes the dev parameter in the mem-to-dev/dev-to-mem
> +	 * transfer scenario.
> +	 * 2) Used to describes the src dev parameter in the dev-to-dev
> +	 * transfer scenario.
> +	 */
> +	struct rte_dma_slave_port_parameters port;
> +	/** Used to describes the dst dev parameters in the dev-to-dev
> +	 * transfer scenario.
> +	 */
> +	struct rte_dma_slave_port_parameters peer_port;
> +	uint64_t reserved[2]; /**< Reserved for future fields */
> +};

Let's drop the reserved fields and use ABI versioning if necessary in
future.

> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Allocate and set up a virtual DMA channel.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param conf
> + *   The virtual DMA channel configuration structure encapsulated into
> + *   rte_dmadev_vchan_conf object.
> + *
> + * @return
> + *   - >=0: Allocate success, it is the virtual DMA channel id. This value must
> + *          be less than the field 'max_vchans' of struct rte_dmadev_conf
> +	    which configured by rte_dmadev_configure().

nit: whitespace error here.

> + *   - <0: Error code returned by the driver virtual channel setup function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_vchan_setup(uint16_t dev_id,
> +		       const struct rte_dmadev_vchan_conf *conf);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Release a virtual DMA channel.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel which return by vchan setup.
> + *
> + * @return
> + *   - =0: Successfully release the virtual DMA channel.
> + *   - <0: Error code returned by the driver virtual channel release function.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan);
> +
> +/**
> + * rte_dmadev_stats - running statistics.
> + */
> +struct rte_dmadev_stats {
> +	/** Count of operations which were successfully enqueued */
> +	uint64_t enqueued_count;
> +	/** Count of operations which were submitted to hardware */
> +	uint64_t submitted_count;
> +	/** Count of operations which failed to complete */
> +	uint64_t completed_fail_count;
> +	/** Count of operations which successfully complete */
> +	uint64_t completed_count;
> +	uint64_t reserved[4]; /**< Reserved for future fields */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Retrieve basic statistics of a or all virtual DMA channel(s).
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel, -1 means all channels.
> + * @param[out] stats
> + *   The basic statistics structure encapsulated into rte_dmadev_stats
> + *   object.
> + *
> + * @return
> + *   - =0: Successfully retrieve stats.
> + *   - <0: Failure to retrieve stats.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_stats_get(uint16_t dev_id, int vchan,

vchan as uint16_t rather than int, I think. This would apply to all
dataplane functions. There is no need for a signed vchan value.

> +		     struct rte_dmadev_stats *stats);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Reset basic statistics of a or all virtual DMA channel(s).
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel, -1 means all channels.
> + *
> + * @return
> + *   - =0: Successfully reset stats.
> + *   - <0: Failure to reset stats.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_stats_reset(uint16_t dev_id, int vchan);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Dump DMA device info.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param f
> + *   The file to write the output to.
> + *
> + * @return
> + *   0 on success. Non-zero otherwise.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_dump(uint16_t dev_id, FILE *f);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Trigger the dmadev self test.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @return
> + *   - 0: Selftest successful.
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +__rte_experimental
> +int
> +rte_dmadev_selftest(uint16_t dev_id);

I don't think this needs to be in the public API, since it should only be
for the autotest app to use. Maybe move the prototype to the _pmd.h (since
we don't have a separate internal header), and then the autotest app can
pick it up from there.

> +
> +#include "rte_dmadev_core.h"
> +
> +/**
> + *  DMA flags to augment operation preparation.
> + *  Used as the 'flags' parameter of rte_dmadev_copy/copy_sg/fill/fill_sg.
> + */
> +#define RTE_DMA_FLAG_FENCE	(1ull << 0)
> +/**< DMA fence flag
> + * It means the operation with this flag must be processed only after all
> + * previous operations are completed.
> + *
> + * @see rte_dmadev_copy()
> + * @see rte_dmadev_copy_sg()
> + * @see rte_dmadev_fill()
> + * @see rte_dmadev_fill_sg()
> + */

As a general comment, I think all these multi-line comments should go
before the item they describe. Comments after should only be used in the
case where the comment fits on the rest of the line after a value.

We also should define the SUBMIT flag as suggested by Jerin, to allow apps
to automatically submit jobs after enqueue.
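
e.g. something like the following (name and bit position just a suggestion):

	#define RTE_DMA_FLAG_SUBMIT	(1ull << 1)
	/**< Ring the doorbell as part of this enqueue, so that no separate
	 * rte_dmadev_perform() call is needed afterwards.
	 */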

> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a copy operation onto the virtual DMA channel.
> + *
> + * This queues up a copy operation to be performed by hardware, but does not
> + * trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param src
> + *   The address of the source buffer.
> + * @param dst
> + *   The address of the destination buffer.
> + * @param length
> + *   The length of the data to be copied.
> + * @param flags
> + *   An flags for this operation.
> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.
> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_copy(uint16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
> +		uint32_t length, uint64_t flags)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->copy, -ENOTSUP);
> +	if (vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +		return -EINVAL;
> +	}
> +#endif
> +	return (*dev->copy)(dev, vchan, src, dst, length, flags);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a scatter list copy operation onto the virtual DMA channel.
> + *
> + * This queues up a scatter list copy operation to be performed by hardware,
> + * but does not trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param sg
> + *   The pointer of scatterlist.
> + * @param sg_len
> + *   The number of scatterlist elements.
> + * @param flags
> + *   An flags for this operation.
> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.
> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, const struct rte_dma_sg *sg,
> +		   uint32_t sg_len, uint64_t flags)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(sg, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->copy_sg, -ENOTSUP);
> +	if (vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +		return -EINVAL;
> +	}
> +#endif
> +	return (*dev->copy_sg)(dev, vchan, sg, sg_len, flags);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a fill operation onto the virtual DMA channel.
> + *
> + * This queues up a fill operation to be performed by hardware, but does not
> + * trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param pattern
> + *   The pattern to populate the destination buffer with.
> + * @param dst
> + *   The address of the destination buffer.
> + * @param length
> + *   The length of the destination buffer.
> + * @param flags
> + *   Flags for this operation.
> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.
> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> +		rte_iova_t dst, uint32_t length, uint64_t flags)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> +	if (vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +		return -EINVAL;
> +	}
> +#endif
> +	return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a scatter list fill operation onto the virtual DMA channel.
> + *
> + * This queues up a scatter list fill operation to be performed by hardware,
> + * but does not trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param pattern
> + *   The pattern to populate the destination buffer with.
> + * @param sg
> + *   The pointer to the scatterlist.
> + * @param sg_len
> + *   The number of scatterlist elements.
> + * @param flags
> + *   Flags for this operation.
> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.
> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_fill_sg(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> +		   const struct rte_dma_sg *sg, uint32_t sg_len,
> +		   uint64_t flags)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(sg, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->fill_sg, -ENOTSUP);
> +	if (vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +		return -EINVAL;
> +	}
> +#endif
> +	return (*dev->fill_sg)(dev, vchan, pattern, sg, sg_len, flags);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Trigger hardware to begin performing enqueued operations.
> + *
> + * This API is used to write the "doorbell" to the hardware to trigger it
> + * to begin the operations previously enqueued by rte_dmadev_copy/fill()
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + *
> + * @return
> + *   - =0: Successfully trigger hardware.
> + *   - <0: Failure to trigger hardware.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_submit(uint16_t dev_id, uint16_t vchan)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->submit, -ENOTSUP);
> +	if (vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +		return -EINVAL;
> +	}
> +#endif
> +	return (*dev->submit)(dev, vchan);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of operations that have been successfully completed.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param nb_cpls
> + *   The maximum number of completed operations that can be processed.
> + * @param[out] last_idx
> + *   The last completed operation's index.
> + *   If not required, NULL can be passed in.
> + * @param[out] has_error
> + *   Indicates if there is a transfer error.
> + *   If not required, NULL can be passed in.
> + *
> + * @return
> + *   The number of operations that successfully completed.
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_dmadev_completed(uint16_t dev_id, uint16_t vchan, const uint16_t nb_cpls,
> +		     uint16_t *last_idx, bool *has_error)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +	uint16_t idx;
> +	bool err;
> +
> +#ifdef RTE_DMADEV_DEBUG
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->completed, -ENOTSUP);
> +	if (vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +		return -EINVAL;
> +	}
> +	if (nb_cpls == 0) {
> +		RTE_DMADEV_LOG(ERR, "Invalid nb_cpls\n");
> +		return -EINVAL;
> +	}
> +#endif
> +
> +	/* Ensure the pointer values are non-null to simplify drivers.
> +	 * In most cases these should be compile time evaluated, since this is
> +	 * an inline function.
> +	 * - If NULL is explicitly passed as parameter, then compiler knows the
> +	 *   value is NULL
> +	 * - If address of local variable is passed as parameter, then compiler
> +	 *   can know it's non-NULL.
> +	 */
> +	if (last_idx == NULL)
> +		last_idx = &idx;
> +	if (has_error == NULL)
> +		has_error = &err;
> +
> +	*has_error = false;
> +	return (*dev->completed)(dev, vchan, nb_cpls, last_idx, has_error);
> +}
> +
> +/**
> + * DMA transfer status code defines
> + */
> +enum rte_dma_status_code {
> +	/** The operation completed successfully */
> +	RTE_DMA_STATUS_SUCCESSFUL = 0,
> +	/** The operation failed to complete due to an active drop.
> +	 * This is mainly used when processing dev_stop, allowing outstanding
> +	 * requests to be completed as much as possible.
> +	 */
> +	RTE_DMA_STATUS_ACTIVE_DROP,
> +	/** The operation failed to complete due to an invalid source address */
> +	RTE_DMA_STATUS_INVALID_SRC_ADDR,
> +	/** The operation failed to complete due to an invalid destination address */
> +	RTE_DMA_STATUS_INVALID_DST_ADDR,
> +	/** The operation failed to complete due to an invalid length */
> +	RTE_DMA_STATUS_INVALID_LENGTH,
> +	/** The operation failed to complete due to an invalid opcode.
> +	 * The DMA descriptor could have multiple formats, which are
> +	 * distinguished by the opcode field.
> +	 */
> +	RTE_DMA_STATUS_INVALID_OPCODE,
> +	/** The operation failed to complete due to a bus error */
> +	RTE_DMA_STATUS_BUS_ERROR,
> +	/** The operation failed to complete due to data poisoning */
> +	RTE_DMA_STATUS_DATA_POISION,
> +	/** The operation failed to complete due to a descriptor read error */
> +	RTE_DMA_STATUS_DESCRIPTOR_READ_ERROR,
> +	/** The operation failed to complete due to a device link error.
> +	 * Used to indicate a link error in the mem-to-dev/dev-to-mem/
> +	 * dev-to-dev transfer scenario.
> +	 */
> +	RTE_DMA_STATUS_DEV_LINK_ERROR,
> +	/** Driver-specific status code offset.
> +	 * Starting status code for drivers to define their own error codes.
> +	 */
> +	RTE_DMA_STATUS_DRV_SPECIFIC_OFFSET = 0x10000,
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of operations that failed to complete.
> + * NOTE: This API should be used after rte_dmadev_completed() sets has_error.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param nb_status
> + *   Indicates the size of status array.
> + * @param[out] status
> + *   The error code of operations that failed to complete.
> + *   Some standard error codes are described in 'enum rte_dma_status_code'
> + *   @see rte_dma_status_code
> + * @param[out] last_idx
> + *   The last failed completed operation's index.
> + *
> + * @return
> + *   The number of operations that failed to complete.
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vchan,
> +			   const uint16_t nb_status, uint32_t *status,
> +			   uint16_t *last_idx)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +#ifdef RTE_DMADEV_DEBUG
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(status, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(last_idx, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->completed_fails, -ENOTSUP);
> +	if (vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +		return -EINVAL;
> +	}
> +	if (nb_status == 0) {
> +		RTE_DMADEV_LOG(ERR, "Invalid nb_status\n");
> +		return -EINVAL;
> +	}
> +#endif
> +	return (*dev->completed_fails)(dev, vchan, nb_status, status, last_idx);
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_DMADEV_H_ */
> diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
> new file mode 100644
> index 0000000..410faf0
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev_core.h
> @@ -0,0 +1,159 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2021 HiSilicon Limited.
> + * Copyright(c) 2021 Intel Corporation.
> + */
> +
> +#ifndef _RTE_DMADEV_CORE_H_
> +#define _RTE_DMADEV_CORE_H_
> +
> +/**
> + * @file
> + *
> + * RTE DMA Device internal header.
> + *
> + * This header contains internal data types that are used by DMA devices
> + * in order to expose their ops to the class.
> + *
> + * Applications should not use these APIs directly.
> + *
> + */
> +
> +struct rte_dmadev;
> +
> +/** @internal Used to get device information of a device. */
> +typedef int (*dmadev_info_get_t)(struct rte_dmadev *dev,
> +				 struct rte_dmadev_info *dev_info);

First parameter can be "const"
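
i.e. (sketch):

	typedef int (*dmadev_info_get_t)(const struct rte_dmadev *dev,
					 struct rte_dmadev_info *dev_info);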

> +/** @internal Used to configure a device. */
> +typedef int (*dmadev_configure_t)(struct rte_dmadev *dev,
> +				  const struct rte_dmadev_conf *dev_conf);
> +
> +/** @internal Used to start a configured device. */
> +typedef int (*dmadev_start_t)(struct rte_dmadev *dev);
> +
> +/** @internal Used to stop a configured device. */
> +typedef int (*dmadev_stop_t)(struct rte_dmadev *dev);
> +
> +/** @internal Used to close a configured device. */
> +typedef int (*dmadev_close_t)(struct rte_dmadev *dev);
> +
> +/** @internal Used to reset a configured device. */
> +typedef int (*dmadev_reset_t)(struct rte_dmadev *dev);
> +
> +/** @internal Used to allocate and set up a virtual DMA channel. */
> +typedef int (*dmadev_vchan_setup_t)(struct rte_dmadev *dev,
> +				    const struct rte_dmadev_vchan_conf *conf);
> +
> +/** @internal Used to release a virtual DMA channel. */
> +typedef int (*dmadev_vchan_release_t)(struct rte_dmadev *dev, uint16_t vchan);
> +
> +/** @internal Used to retrieve basic statistics. */
> +typedef int (*dmadev_stats_get_t)(struct rte_dmadev *dev, int vchan,
> +				  struct rte_dmadev_stats *stats);

First parameter can be "const"

> +
> +/** @internal Used to reset basic statistics. */
> +typedef int (*dmadev_stats_reset_t)(struct rte_dmadev *dev, int vchan);
> +
> +/** @internal Used to dump internal information. */
> +typedef int (*dmadev_dump_t)(struct rte_dmadev *dev, FILE *f);
> +

First param "const"

> +/** @internal Used to start dmadev selftest. */
> +typedef int (*dmadev_selftest_t)(uint16_t dev_id);
> +

This looks like an outlier in taking a dev_id. It should take a dmadev
parameter like the other ops. Most drivers should not need to implement this
anyway, as the main unit tests should be in "test_dmadev.c" in the autotest
app.
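
i.e. the op could become (sketch):

	typedef int (*dmadev_selftest_t)(struct rte_dmadev *dev);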

> +/** @internal Used to enqueue a copy operation. */
> +typedef int (*dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vchan,
> +			     rte_iova_t src, rte_iova_t dst,
> +			     uint32_t length, uint64_t flags);
> +
> +/** @internal Used to enqueue a scatter list copy operation. */
> +typedef int (*dmadev_copy_sg_t)(struct rte_dmadev *dev, uint16_t vchan,
> +				const struct rte_dma_sg *sg,
> +				uint32_t sg_len, uint64_t flags);
> +
> +/** @internal Used to enqueue a fill operation. */
> +typedef int (*dmadev_fill_t)(struct rte_dmadev *dev, uint16_t vchan,
> +			     uint64_t pattern, rte_iova_t dst,
> +			     uint32_t length, uint64_t flags);
> +
> +/** @internal Used to enqueue a scatter list fill operation. */
> +typedef int (*dmadev_fill_sg_t)(struct rte_dmadev *dev, uint16_t vchan,
> +			uint64_t pattern, const struct rte_dma_sg *sg,
> +			uint32_t sg_len, uint64_t flags);
> +
> +/** @internal Used to trigger hardware to begin working. */
> +typedef int (*dmadev_submit_t)(struct rte_dmadev *dev, uint16_t vchan);
> +
> +/** @internal Used to return number of successful completed operations. */
> +typedef uint16_t (*dmadev_completed_t)(struct rte_dmadev *dev, uint16_t vchan,
> +				       const uint16_t nb_cpls,
> +				       uint16_t *last_idx, bool *has_error);
> +
> +/** @internal Used to return number of failed completed operations. */
> +typedef uint16_t (*dmadev_completed_fails_t)(struct rte_dmadev *dev,
> +			uint16_t vchan, const uint16_t nb_status,
> +			uint32_t *status, uint16_t *last_idx);
> +
> +/**
> + * DMA device operations function pointer table
> + */
> +struct rte_dmadev_ops {
> +	dmadev_info_get_t dev_info_get;
> +	dmadev_configure_t dev_configure;
> +	dmadev_start_t dev_start;
> +	dmadev_stop_t dev_stop;
> +	dmadev_close_t dev_close;
> +	dmadev_reset_t dev_reset;
> +	dmadev_vchan_setup_t vchan_setup;
> +	dmadev_vchan_release_t vchan_release;
> +	dmadev_stats_get_t stats_get;
> +	dmadev_stats_reset_t stats_reset;
> +	dmadev_dump_t dev_dump;
> +	dmadev_selftest_t dev_selftest;
> +};
> +
> +/**
> + * @internal
> + * The data part, with no function pointers, associated with each DMA device.
> + *
> + * This structure is safe to place in shared memory to be common among different
> + * processes in a multi-process configuration.
> + */
> +struct rte_dmadev_data {
> +	uint16_t dev_id; /**< Device [external] identifier. */
> +	char dev_name[RTE_DMADEV_NAME_MAX_LEN]; /**< Unique identifier name */
> +	void *dev_private; /**< PMD-specific private data. */
> +	struct rte_dmadev_conf dev_conf; /**< DMA device configuration. */
> +	uint8_t dev_started : 1; /**< Device state: STARTED(1)/STOPPED(0). */
> +	uint64_t reserved[4]; /**< Reserved for future fields */
> +} __rte_cache_aligned;
> +

While I generally don't like having reserved space, this is one place where
it makes sense, so +1 for it here.

> +/**
> + * @internal
> + * The generic data structure associated with each DMA device.
> + *
> + * The dataplane APIs are located at the beginning of the structure, along
> + * with the pointer to where all the data elements for the particular device
> + * are stored in shared memory. This split scheme allows the function pointer
> + * and driver data to be per-process, while the actual configuration data for
> + * the device is shared.
> + */
> +struct rte_dmadev {
> +	dmadev_copy_t copy;
> +	dmadev_copy_sg_t copy_sg;
> +	dmadev_fill_t fill;
> +	dmadev_fill_sg_t fill_sg;
> +	dmadev_submit_t submit;
> +	dmadev_completed_t completed;
> +	dmadev_completed_fails_t completed_fails;
> +	const struct rte_dmadev_ops *dev_ops; /**< Functions exported by PMD. */
> +	/** Flag indicating the device is attached: ATTACHED(1)/DETACHED(0). */
> +	uint8_t attached : 1;

Since it's in the midst of a series of pointers, this 1-bit flag is
actually using 8 bytes of space. Is it needed? Can we use dev_ops == NULL
or data == NULL instead to indicate whether this is a valid entry?
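
e.g. a hypothetical helper like this (sketch) would make the flag redundant:

	/* an entry is in use iff it has been allocated with device data */
	static inline bool
	dmadev_is_attached(const struct rte_dmadev *dev)
	{
		return dev->data != NULL;
	}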

> +	/** Device info which was supplied during device initialization. */
> +	struct rte_device *device;
> +	struct rte_dmadev_data *data; /**< Pointer to device data. */

If we are to try and minimise cacheline access, we should put this data
pointer - or even better a copy of data->private pointer - at the top of
the structure on the same cacheline as datapath operations. For dataplane,
I can't see any elements of data, except the private pointer being
accessed, so we would probably get most benefit for having a copy put there
on init of the dmadev struct.
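
Roughly like this (a sketch of the suggested layout; dev_private here would
be a copy of data->dev_private made at init time):

	struct rte_dmadev {
		void *dev_private; /* hot-path copy of data->dev_private */
		dmadev_copy_t copy;
		dmadev_copy_sg_t copy_sg;
		dmadev_fill_t fill;
		dmadev_fill_sg_t fill_sg;
		dmadev_submit_t submit;
		dmadev_completed_t completed;
		dmadev_completed_fails_t completed_fails;
		/* ... control-path fields on later cachelines ... */
	} __rte_cache_aligned;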

> +	uint64_t reserved[4]; /**< Reserved for future fields */
> +} __rte_cache_aligned;
> +
> +extern struct rte_dmadev rte_dmadevices[];
> +
> +#endif /* _RTE_DMADEV_CORE_H_ */
> diff --git a/lib/dmadev/rte_dmadev_pmd.h b/lib/dmadev/rte_dmadev_pmd.h
> new file mode 100644
> index 0000000..45141f9
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev_pmd.h
> @@ -0,0 +1,72 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2021 HiSilicon Limited.
> + */
> +
> +#ifndef _RTE_DMADEV_PMD_H_
> +#define _RTE_DMADEV_PMD_H_
> +
> +/**
> + * @file
> + *
> + * RTE DMA Device PMD APIs
> + *
> + * Driver facing APIs for a DMA device. These are not to be called directly by
> + * any application.
> + */
> +
> +#include "rte_dmadev.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * @internal
> + * Allocates a new dmadev slot for a DMA device and returns the pointer
> + * to that slot for the driver to use.
> + *
> + * @param name
> + *   DMA device name.
> + *
> + * @return
> + *   A pointer to the DMA device slot in case of success,
> + *   NULL otherwise.
> + */
> +__rte_internal
> +struct rte_dmadev *
> +rte_dmadev_pmd_allocate(const char *name);
> +
> +/**
> + * @internal
> + * Release the specified dmadev.
> + *
> + * @param dev
> + *   Device to be released.
> + *
> + * @return
> + *   - 0 on success, negative on error
> + */
> +__rte_internal
> +int
> +rte_dmadev_pmd_release(struct rte_dmadev *dev);
> +
> +/**
> + * @internal
> + * Return the DMA device based on the device name.
> + *
> + * @param name
> + *   DMA device name.
> + *
> + * @return
> + *   A pointer to the DMA device slot in case of success,
> + *   NULL otherwise.
> + */
> +__rte_internal
> +struct rte_dmadev *
> +rte_dmadev_get_device_by_name(const char *name);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_DMADEV_PMD_H_ */
> diff --git a/lib/dmadev/version.map b/lib/dmadev/version.map
> new file mode 100644
> index 0000000..0f099e7
> --- /dev/null
> +++ b/lib/dmadev/version.map
> @@ -0,0 +1,40 @@
> +EXPERIMENTAL {
> +	global:
> +
> +	rte_dmadev_count;
> +	rte_dmadev_info_get;
> +	rte_dmadev_configure;
> +	rte_dmadev_start;
> +	rte_dmadev_stop;
> +	rte_dmadev_close;
> +	rte_dmadev_reset;
> +	rte_dmadev_vchan_setup;
> +	rte_dmadev_vchan_release;
> +	rte_dmadev_stats_get;
> +	rte_dmadev_stats_reset;
> +	rte_dmadev_dump;
> +	rte_dmadev_selftest;
> +	rte_dmadev_copy;
> +	rte_dmadev_copy_sg;
> +	rte_dmadev_fill;
> +	rte_dmadev_fill_sg;
> +	rte_dmadev_submit;
> +	rte_dmadev_completed;
> +	rte_dmadev_completed_fails;
> +
> +	local: *;
> +};

The elements in the version.map file blocks should be sorted alphabetically.
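
i.e.

	rte_dmadev_close;
	rte_dmadev_completed;
	rte_dmadev_completed_fails;
	rte_dmadev_configure;
	rte_dmadev_copy;
	rte_dmadev_copy_sg;
	rte_dmadev_count;
	rte_dmadev_dump;
	rte_dmadev_fill;
	rte_dmadev_fill_sg;
	rte_dmadev_info_get;
	rte_dmadev_reset;
	rte_dmadev_selftest;
	rte_dmadev_start;
	rte_dmadev_stats_get;
	rte_dmadev_stats_reset;
	rte_dmadev_stop;
	rte_dmadev_submit;
	rte_dmadev_vchan_release;
	rte_dmadev_vchan_setup;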

> +
> +INTERNAL {
> +	global:
> +
> +	rte_dmadevices;
> +	rte_dmadev_pmd_allocate;
> +	rte_dmadev_pmd_release;
> +	rte_dmadev_get_device_by_name;
> +
> +	local:
> +
> +	rte_dmadev_is_valid_dev;
> +};
> +
> diff --git a/lib/meson.build b/lib/meson.build
> index 1673ca4..68d239f 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -60,6 +60,7 @@ libraries = [
>          'bpf',
>          'graph',
>          'node',
> +        'dmadev',
>  ]
>  
>  if is_windows
> -- 
> 2.8.1
>
  
Jerin Jacob July 12, 2021, 4:34 p.m. UTC | #10
On Mon, Jul 12, 2021 at 7:02 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Mon, Jul 12, 2021 at 03:29:27PM +0530, Jerin Jacob wrote:
> > On Sun, Jul 11, 2021 at 2:59 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
> > >
> > > This patch introduce 'dmadevice' which is a generic type of DMA
> > > device.
> > >
> > > The APIs of dmadev library exposes some generic operations which can
> > > enable configuration and I/O with the DMA devices.
> > >
> > > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > > ---
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Enqueue a fill operation onto the virtual DMA channel.
> > > + *
> > > + * This queues up a fill operation to be performed by hardware, but does not
> > > + * trigger hardware to begin that operation.
> > > + *
> > > + * @param dev_id
> > > + *   The identifier of the device.
> > > + * @param vchan
> > > + *   The identifier of virtual DMA channel.
> > > + * @param pattern
> > > + *   The pattern to populate the destination buffer with.
> > > + * @param dst
> > > + *   The address of the destination buffer.
> > > + * @param length
> > > + *   The length of the destination buffer.
> > > + * @param flags
> > > + *   Flags for this operation.
> > > + *
> > > + * @return
> > > + *   - 0..UINT16_MAX: index of enqueued copy job.
> >
> > fill job
> >
> > > + *   - <0: Error code returned by the driver copy function.
> > > + */
> > > +__rte_experimental
> > > +static inline int
> > > +rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> > > +               rte_iova_t dst, uint32_t length, uint64_t flags)
> > > +{
> > > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > > +#ifdef RTE_DMADEV_DEBUG
> > > +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > > +       RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> > > +       if (vchan >= dev->data->dev_conf.max_vchans) {
> > > +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > > +               return -EINVAL;
> > > +       }
> > > +#endif
> > > +       return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
> > > +}
> > > +
> >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Returns the number of operations that have been successfully completed.
> > > + *
> > > + * @param dev_id
> > > + *   The identifier of the device.
> > > + * @param vchan
> > > + *   The identifier of virtual DMA channel.
> > > + * @param nb_cpls
> > > + *   The maximum number of completed operations that can be processed.
> > > + * @param[out] last_idx
> > > + *   The last completed operation's index.
> > > + *   If not required, NULL can be passed in.
> >
> > This means the driver will be tracking the last index.
> >
>
> Driver will be doing this anyway, no, since it needs to ensure we don't

Yes.

> wrap around?


>
> > Does that mean the application needs to call this API periodically to
> > consume the completion slots?
> > I.e. up to 64K (UINT16_MAX) outstanding jobs are possible. If the
> > application fails to call this and there are
> > >64K outstanding jobs, then the subsequent enqueue will fail.
>
> Well, given that there will be a regular enqueue ring which will probably
> be <= 64k in size, the completion call will need to be called frequently
> anyway. I don't think we need to document this restriction as it's fairly
> understood that you can't go beyond the size of the ring without cleanup.


See below.

>
> >
> > If so, we need to document this.
> >
> > One of the concerns with keeping UINT16_MAX as the limit is that the
> > completion memory will never be in cache.
> > On the other hand, if we make this size programmable, it may introduce
> > complexity in the application.
> >
> > Thoughts?
>
> The reason for using powers-of-2 sizes, e.g. 0 .. UINT16_MAX, is that the
> ring can be any other power-of-2 size and we can index it just by masking.
> In the sample app for dmadev, I expect the ring size used to be set the
> same as the dmadev enqueue ring size, for simplicity.

No question on not using power of 2. Aligned on that.

At least in our HW, the size of the ring is rte_dmadev_vchan_conf::nb_desc.
But completion happens in a _different_ memory space. Currently, we are
allocating UINT16_MAX entries to hold that. That's where the cache-miss
aspect of completions comes from.

In your case, does completion happen in the same ring memory (it looks like
one bit in the job descriptor represents whether the job completed or not)?
And when the application calls rte_dmadev_completed(), you are converting
the UINT16_MAX-based index to
rte_dmadev_vchan_conf::nb_desc. Right?


>
> In fact, I was thinking that in later versions we may be able to include
> some macros to help make this whole process easier, of converting indexes
> to arbitrary data structures. [The reason for using macros is so that the
> actual rings we are indexing can be of user-defined type, rather than just
> a ring of pointers].
>
> /Bruce
  
Bruce Richardson July 12, 2021, 5 p.m. UTC | #11
On Mon, Jul 12, 2021 at 10:04:07PM +0530, Jerin Jacob wrote:
> On Mon, Jul 12, 2021 at 7:02 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > On Mon, Jul 12, 2021 at 03:29:27PM +0530, Jerin Jacob wrote:
> > > On Sun, Jul 11, 2021 at 2:59 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
> > > >
> > > > This patch introduce 'dmadevice' which is a generic type of DMA
> > > > device.
> > > >
> > > > The APIs of dmadev library exposes some generic operations which can
> > > > enable configuration and I/O with the DMA devices.
> > > >
> > > > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > > > ---
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > + *
> > > > + * Enqueue a fill operation onto the virtual DMA channel.
> > > > + *
> > > > + * This queues up a fill operation to be performed by hardware, but does not
> > > > + * trigger hardware to begin that operation.
> > > > + *
> > > > + * @param dev_id
> > > > + *   The identifier of the device.
> > > > + * @param vchan
> > > > + *   The identifier of virtual DMA channel.
> > > > + * @param pattern
> > > > + *   The pattern to populate the destination buffer with.
> > > > + * @param dst
> > > > + *   The address of the destination buffer.
> > > > + * @param length
> > > > + *   The length of the destination buffer.
> > > > + * @param flags
> > > > + *   Flags for this operation.
> > > > + *
> > > > + * @return
> > > > + *   - 0..UINT16_MAX: index of enqueued copy job.
> > >
> > > fill job
> > >
> > > > + *   - <0: Error code returned by the driver copy function.
> > > > + */
> > > > +__rte_experimental
> > > > +static inline int
> > > > +rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> > > > +               rte_iova_t dst, uint32_t length, uint64_t flags)
> > > > +{
> > > > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > > > +#ifdef RTE_DMADEV_DEBUG
> > > > +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > > > +       RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> > > > +       if (vchan >= dev->data->dev_conf.max_vchans) {
> > > > +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > > > +               return -EINVAL;
> > > > +       }
> > > > +#endif
> > > > +       return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
> > > > +}
> > > > +
> > >
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > + *
> > > > + * Returns the number of operations that have been successfully completed.
> > > > + *
> > > > + * @param dev_id
> > > > + *   The identifier of the device.
> > > > + * @param vchan
> > > > + *   The identifier of virtual DMA channel.
> > > > + * @param nb_cpls
> > > > + *   The maximum number of completed operations that can be processed.
> > > > + * @param[out] last_idx
> > > > + *   The last completed operation's index.
> > > > + *   If not required, NULL can be passed in.
> > >
> > > This means the driver will be tracking the last index.
> > >
> >
> > Driver will be doing this anyway, no, since it needs to ensure we don't
> 
> Yes.
> 
> > wrap around?
> 
> 
> >
> > > Does that mean the application needs to call this API periodically to
> > > consume the completion slots?
> > > I.e. up to 64K (UINT16_MAX) outstanding jobs are possible. If the
> > > application fails to call this and there are
> > > >64K outstanding jobs, then the subsequent enqueue will fail.
> >
> > Well, given that there will be a regular enqueue ring which will probably
> > be <= 64k in size, the completion call will need to be called frequently
> > anyway. I don't think we need to document this restriction as it's fairly
> > understood that you can't go beyond the size of the ring without cleanup.
> 
> 
> See below.
> 
> >
> > >
> > > If so, we need to document this.
> > >
> > > One of the concerns with keeping UINT16_MAX as the limit is that the
> > > completion memory will never be in cache.
> > > On the other hand, if we make this size programmable, it may introduce
> > > complexity in the application.
> > >
> > > Thoughts?
> >
> > The reason for using powers-of-2 sizes, e.g. 0 .. UINT16_MAX, is that the
> > ring can be any other power-of-2 size and we can index it just by masking.
> > In the sample app for dmadev, I expect the ring size used to be set the
> > same as the dmadev enqueue ring size, for simplicity.
> 
> No question on not using power of 2. Aligned on that.
> 
> At least in our HW, the size of the ring is rte_dmadev_vchan_conf::nb_desc.
> But completion happens in a _different_ memory space. Currently, we are
> allocating UINT16_MAX entries to hold that. That's where the cache-miss
> aspect of completions comes from.

Depending on HW, our completions can be written back to a separate memory
area - a completion ring, if you will - but I've generally found it works
as well to reuse the enqueue ring for that purpose. However, with a
separate memory area for completions, why do you need to allocate 64K
entries for the completions? Would nb_desc entries not be enough? Is that
to allow the user to have more than nb_desc jobs outstanding before calling
"get_completions" API?

> 
> In your case, does completion happen in the same ring memory (it looks like
> one bit in the job descriptor represents whether the job completed or not)?
> And when the application calls rte_dmadev_completed(), you are converting
> the UINT16_MAX-based index to
> rte_dmadev_vchan_conf::nb_desc. Right?

Yes, we are masking to do that. Actually, for simplicity and perf we should
only allow power-of-2 ring sizes. Having to use modulus instead of masking
could be a problem. [Alternatively, I suppose we can allow drivers to round
up the ring sizes to the next power of 2, but I prefer just documenting it
as a limitation].
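
For illustration, the per-op conversion is then just (a minimal sketch,
assuming nb_desc is a power of two):

	/* map the free-running 16-bit job index onto a ring slot */
	static inline uint16_t
	ring_pos(uint16_t job_idx, uint16_t nb_desc)
	{
		return job_idx & (nb_desc - 1);
	}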

/Bruce
  
Jerin Jacob July 13, 2021, 8:59 a.m. UTC | #12
On Mon, Jul 12, 2021 at 10:30 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Mon, Jul 12, 2021 at 10:04:07PM +0530, Jerin Jacob wrote:
> > On Mon, Jul 12, 2021 at 7:02 PM Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > >
> > > On Mon, Jul 12, 2021 at 03:29:27PM +0530, Jerin Jacob wrote:
> > > > On Sun, Jul 11, 2021 at 2:59 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
> > > > >
> > > > > This patch introduce 'dmadevice' which is a generic type of DMA
> > > > > device.
> > > > >
> > > > > The APIs of dmadev library exposes some generic operations which can
> > > > > enable configuration and I/O with the DMA devices.
> > > > >
> > > > > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > > > > ---
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > > + *
> > > > > + * Enqueue a fill operation onto the virtual DMA channel.
> > > > > + *
> > > > > + * This queues up a fill operation to be performed by hardware, but does not
> > > > > + * trigger hardware to begin that operation.
> > > > > + *
> > > > > + * @param dev_id
> > > > > + *   The identifier of the device.
> > > > > + * @param vchan
> > > > > + *   The identifier of virtual DMA channel.
> > > > > + * @param pattern
> > > > > + *   The pattern to populate the destination buffer with.
> > > > > + * @param dst
> > > > > + *   The address of the destination buffer.
> > > > > + * @param length
> > > > > + *   The length of the destination buffer.
> > > > > + * @param flags
> > > > > + *   Flags for this operation.
> > > > > + *
> > > > > + * @return
> > > > > + *   - 0..UINT16_MAX: index of enqueued copy job.
> > > >
> > > > fill job
> > > >
> > > > > + *   - <0: Error code returned by the driver copy function.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +static inline int
> > > > > +rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> > > > > +               rte_iova_t dst, uint32_t length, uint64_t flags)
> > > > > +{
> > > > > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > > > > +#ifdef RTE_DMADEV_DEBUG
> > > > > +       RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > > > > +       RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> > > > > +       if (vchan >= dev->data->dev_conf.max_vchans) {
> > > > > +               RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +#endif
> > > > > +       return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
> > > > > +}
> > > > > +
> > > >
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > > + *
> > > > > + * Returns the number of operations that have been successfully completed.
> > > > > + *
> > > > > + * @param dev_id
> > > > > + *   The identifier of the device.
> > > > > + * @param vchan
> > > > > + *   The identifier of virtual DMA channel.
> > > > > + * @param nb_cpls
> > > > > + *   The maximum number of completed operations that can be processed.
> > > > > + * @param[out] last_idx
> > > > > + *   The last completed operation's index.
> > > > > + *   If not required, NULL can be passed in.
> > > >
> > > > This means the driver will be tracking the last index.
> > > >
> > >
> > > Driver will be doing this anyway, no, since it needs to ensure we don't
> >
> > Yes.
> >
> > > wrap around?
> >
> >
> > >
> > > > Does that mean the application needs to call this API periodically to
> > > > consume the completion slots?
> > > > I.e. up to 64K (UINT16_MAX) outstanding jobs are possible. If the
> > > > application fails to call this and there are
> > > > >64K outstanding jobs, then the subsequent enqueue will fail.
> > >
> > > Well, given that there will be a regular enqueue ring which will probably
> > > be <= 64k in size, the completion call will need to be called frequently
> > > anyway. I don't think we need to document this restriction as it's fairly
> > > understood that you can't go beyond the size of the ring without cleanup.
> >
> >
> > See below.
> >
> > >
> > > >
> > > > If so, we need to document this.
> > > >
> > > > One of the concerns with keeping UINT16_MAX as the limit is that the
> > > > completion memory will never be in cache.
> > > > On the other hand, if we make this size programmable, it may introduce
> > > > complexity in the application.
> > > >
> > > > Thoughts?
> > >
> > > The reason for using powers-of-2 sizes, e.g. 0 .. UINT16_MAX, is that the
> > > ring can be any other power-of-2 size and we can index it just by masking.
> > > In the sample app for dmadev, I expect the ring size used to be set the
> > > same as the dmadev enqueue ring size, for simplicity.
> >
> > No question on not using power of 2. Aligned on that.
> >
> > At least in our HW, the size of the ring is rte_dmadev_vchan_conf::nb_desc.
> > But completion happens in a _different_ memory space. Currently, we are
> > allocating UINT16_MAX entries to hold that. That's where the cache-miss
> > aspect of completions comes from.
>
> Depending on HW, our completions can be written back to a separate memory
> area - a completion ring, if you will - but I've generally found it works
> as well to reuse the enqueue ring for that purpose. However, with a
> separate memory area for completions, why do you need to allocate 64K
> entries for the completions? Would nb_desc entries not be enough? Is that
> to allow the user to have more than nb_desc jobs outstanding before calling
> "get_completions" API?

Yes. That's what I thought. That's where my question on the max number of
outstanding completions came from. I thought it could be up to 64K. Agree to
keep it implementation-specific and not highlight this in the documentation.


>
> >
> > In your case, does completion happen in the same ring memory (it looks like
> > one bit in the job descriptor represents whether the job completed or not)?
> > And when the application calls rte_dmadev_completed(), you are converting
> > the UINT16_MAX-based index to
> > rte_dmadev_vchan_conf::nb_desc. Right?
>
> Yes, we are masking to do that. Actually, for simplicity and perf we should
> only allow power-of-2 ring sizes. Having to use modulus instead of masking
> could be a problem. [Alternatively, I suppose we can allow drivers to round
> up the ring sizes to the next power of 2, but I prefer just documenting it
> as a limitation].

OK.

>
> /Bruce
  
Jerin Jacob July 13, 2021, 9:07 a.m. UTC | #13
On Mon, Jul 12, 2021 at 9:21 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Sun, Jul 11, 2021 at 05:25:56PM +0800, Chengwen Feng wrote:
> > This patch introduce 'dmadevice' which is a generic type of DMA
> > device.
> >
> > The APIs of dmadev library exposes some generic operations which can
> > enable configuration and I/O with the DMA devices.
> >
> > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>
> Hi again,
>
> some further review comments inline.
>
> /Bruce
>
> > ---
> >  MAINTAINERS                  |    4 +
> >  config/rte_config.h          |    3 +
> >  lib/dmadev/meson.build       |    6 +
> >  lib/dmadev/rte_dmadev.c      |  560 +++++++++++++++++++++++
> >  lib/dmadev/rte_dmadev.h      | 1030 ++++++++++++++++++++++++++++++++++++++++++
> >  lib/dmadev/rte_dmadev_core.h |  159 +++++++
> >  lib/dmadev/rte_dmadev_pmd.h  |   72 +++
> >  lib/dmadev/version.map       |   40 ++
> >  lib/meson.build              |    1 +
>
> <snip>
>
> > diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
> > new file mode 100644
> > index 0000000..8779512
> > --- /dev/null
> > +++ b/lib/dmadev/rte_dmadev.h
> > @@ -0,0 +1,1030 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2021 HiSilicon Limited.
> > + * Copyright(c) 2021 Intel Corporation.
> > + * Copyright(c) 2021 Marvell International Ltd.
> > + */
> > +
> > +#ifndef _RTE_DMADEV_H_
> > +#define _RTE_DMADEV_H_
> > +
> > +/**
> > + * @file rte_dmadev.h
> > + *
> > + * RTE DMA (Direct Memory Access) device APIs.
> > + *
> > + * The DMA framework is built on the following model:
> > + *
> > + *     ---------------   ---------------       ---------------
> > + *     | virtual DMA |   | virtual DMA |       | virtual DMA |
> > + *     | channel     |   | channel     |       | channel     |
> > + *     ---------------   ---------------       ---------------
> > + *            |                |                      |
> > + *            ------------------                      |
> > + *                     |                              |
> > + *               ------------                    ------------
> > + *               |  dmadev  |                    |  dmadev  |
> > + *               ------------                    ------------
> > + *                     |                              |
> > + *            ------------------               ------------------
> > + *            | HW-DMA-channel |               | HW-DMA-channel |
> > + *            ------------------               ------------------
> > + *                     |                              |
> > + *                     --------------------------------
> > + *                                     |
> > + *                           ---------------------
> > + *                           | HW-DMA-Controller |
> > + *                           ---------------------
> > + *
> > + * The DMA controller could have multiple HW-DMA-channels (aka. HW-DMA-queues),
> > + * each HW-DMA-channel should be represented by a dmadev.
> > + *
> > + * The dmadev could create multiple virtual DMA channels; each virtual DMA
> > + * channel represents a different transfer context. The DMA operation request
> > + * must be submitted to the virtual DMA channel.
> > + * E.G. Application could create virtual DMA channel 0 for mem-to-mem transfer
> > + *      scenario, and create virtual DMA channel 1 for mem-to-dev transfer
> > + *      scenario.
> > + *
> > + * A dmadev is dynamically allocated by rte_dmadev_pmd_allocate() during the
> > + * PCI/SoC device probing phase performed at EAL initialization time, and can
> > + * be released by rte_dmadev_pmd_release() during the PCI/SoC device removal
> > + * phase.
> > + *
> > + * We use 'uint16_t dev_id' as the device identifier of a dmadev, and
> > + * 'uint16_t vchan' as the virtual DMA channel identifier in one dmadev.
> > + *
> > + * The functions exported by the dmadev API to setup a device designated by its
> > + * device identifier must be invoked in the following order:
> > + *     - rte_dmadev_configure()
> > + *     - rte_dmadev_vchan_setup()
> > + *     - rte_dmadev_start()
> > + *
> > + * Then, the application can invoke dataplane APIs to process jobs.
> > + *
> > + * If the application wants to change the configuration (i.e. call
> > + * rte_dmadev_configure()), it must call rte_dmadev_stop() first to stop the
> > + * device and then do the reconfiguration before calling rte_dmadev_start()
> > + * again. The dataplane APIs should not be invoked when the device is stopped.
> > + *
> > + * Finally, an application can close a dmadev by invoking the
> > + * rte_dmadev_close() function.
> > + *
> > + * The dataplane APIs include two parts:
> > + *   a) The first part is the submission of operation requests:
> > + *        - rte_dmadev_copy()
> > + *        - rte_dmadev_copy_sg() - scatter-gather form of copy
> > + *        - rte_dmadev_fill()
> > + *        - rte_dmadev_fill_sg() - scatter-gather form of fill
> > + *        - rte_dmadev_submit() - issue doorbell to hardware
> > + *      These APIs could work with different virtual DMA channels which have
> > + *      different contexts.
> > + *      The first four APIs are used to submit the operation request to the
> > + *      virtual DMA channel, if the submission is successful, a uint16_t
> > + *      ring_idx is returned, otherwise a negative number is returned.
> > + *   b) The second part is to obtain the result of requests:
> > + *        - rte_dmadev_completed()
> > + *            - return the number of operation requests completed successfully.
> > + *        - rte_dmadev_completed_fails()
> > + *            - return the number of operation requests failed to complete.
>
> Please rename this to "completed_status" to allow the return of information
> other than just errors. As I suggested before, I think this should also be
> usable as a slower version of "completed" even in the case where there are
> no errors, in that it returns status information for each and every job
> rather than just returning as soon as it hits a failure.
>
> > + *
> > + * About the ring_idx which rte_dmadev_copy/copy_sg/fill/fill_sg() returned,
> > + * the rules are as follows:
> > + *   a) ring_idx for each virtual DMA channel are independent.
> > + *   b) For a virtual DMA channel, the ring_idx is monotonically incremented,
> > + *      when it reaches UINT16_MAX, it wraps back to zero.
>
> Based on other feedback, I suggest we put in the detail here that: "This
> index can be used by applications to track per-job metadata in an
> application-defined circular ring, where the ring is a power-of-2 size, and
> the indexes are masked appropriately."
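
For illustration, such an application-defined ring could look like this
(a sketch; meta_ring, RING_SZ, BURST and udata are made-up application names):

	#define RING_SZ 1024 /* power of 2, >= dmadev ring size */
	static void *meta_ring[RING_SZ]; /* app per-job metadata */

	/* at enqueue time */
	int idx = rte_dmadev_copy(dev_id, vchan, src, dst, len, flags);
	if (idx >= 0)
		meta_ring[idx & (RING_SZ - 1)] = udata;

	/* at completion time: jobs (last_idx - n, last_idx] are done */
	uint16_t last_idx;
	bool err;
	uint16_t n = rte_dmadev_completed(dev_id, vchan, BURST, &last_idx, &err);
	for (uint16_t i = 0; i < n; i++) {
		uint16_t job = (uint16_t)(last_idx - n + 1 + i);
		/* ... process meta_ring[job & (RING_SZ - 1)] ... */
	}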
>
> > + *   c) The initial ring_idx of a virtual DMA channel is zero, after the device
> > + *      is stopped or reset, the ring_idx needs to be reset to zero.
> > + *   Example:
> > + *      step-1: start one dmadev
> > + *      step-2: enqueue a copy operation, the ring_idx return is 0
> > + *      step-3: enqueue a copy operation again, the ring_idx return is 1
> > + *      ...
> > + *      step-101: stop the dmadev
> > + *      step-102: start the dmadev
> > + *      step-103: enqueue a copy operation, the ring_idx return is 0
> > + *      ...
> > + *      step-x+0: enqueue a fill operation, the ring_idx return is 65535
> > + *      step-x+1: enqueue a copy operation, the ring_idx return is 0
> > + *      ...
> > + *
> > + * By default, all the non-dataplane functions of the dmadev API exported by a
> > + * PMD are lock-free functions which are assumed not to be invoked in parallel
> > + * on different logical cores to work on the same target object.
> > + *
> > + * The dataplane functions of the dmadev API exported by a PMD can be MT-safe
> > + * only when supported by the driver; generally, the driver will report two
> > + * capabilities:
> > + *   a) Whether to support MT-safe for the submit/completion API of the same
> > + *      virtual DMA channel.
> > + *      E.G. one thread does the submit operation, another thread does the
> > + *           completion operation.
> > + *      If the driver supports it, then declare RTE_DMA_DEV_CAPA_MT_VCHAN.
> > + *      If the driver doesn't support it, it's up to the application to
> > + *      guarantee MT-safety.
> > + *   b) Whether to support MT-safe for different virtual DMA channels.
> > + *      E.G. one thread does operations on virtual DMA channel 0, another
> > + *           thread does operations on virtual DMA channel 1.
> > + *      If the driver supports it, then declare RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN.
> > + *      If the driver doesn't support it, it's up to the application to
> > + *      guarantee MT-safety.
> > + *
> > + */
>
> Just to check - do we have hardware that currently supports these
> capabilities? For Intel HW, we will only support one virtual channel per
> device without any MT-safety guarantees, so won't be setting either of
> these flags. If any of these flags are unused in all planned drivers, we
> should drop them from the spec until they prove necessary. Ideally,
> everything in the dmadev definition should be testable, and features unused
> by anyone obviously will be untested.
>
> > +
> > +#include <rte_common.h>
> > +#include <rte_compat.h>
> > +#include <rte_errno.h>
> > +#include <rte_memory.h>
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#define RTE_DMADEV_NAME_MAX_LEN      RTE_DEV_NAME_MAX_LEN
> > +
> > +extern int rte_dmadev_logtype;
> > +
> > +#define RTE_DMADEV_LOG(level, ...) \
> > +     rte_log(RTE_LOG_ ## level, rte_dmadev_logtype, "" __VA_ARGS__)
> > +
> > +/* Macros to check for valid port */
> > +#define RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, retval) do { \
> > +     if (!rte_dmadev_is_valid_dev(dev_id)) { \
> > +             RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \
> > +             return retval; \
> > +     } \
> > +} while (0)
> > +
> > +#define RTE_DMADEV_VALID_DEV_ID_OR_RET(dev_id) do { \
> > +     if (!rte_dmadev_is_valid_dev(dev_id)) { \
> > +             RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \
> > +             return; \
> > +     } \
> > +} while (0)
> > +
>
> Can we avoid using these in the inline functions in this file, and move
> them to the _pmd.h which is for internal PMD use only? It would mean we
> don't get logging from the key dataplane functions, but I would hope the
> return values would provide enough info.
>
> Alternatively, can we keep the logtype definition and first macro and move
> the other two to the _pmd.h file.
>
> > +/**
> > + * @internal
> > + * Validate if the DMA device index is a valid attached DMA device.
> > + *
> > + * @param dev_id
> > + *   DMA device index.
> > + *
> > + * @return
> > + *   - If the device index is valid (true) or not (false).
> > + */
> > +__rte_internal
> > +bool
> > +rte_dmadev_is_valid_dev(uint16_t dev_id);
> > +
> > +/**
> > + * rte_dma_sg - can hold scatter DMA operation request
> > + */
> > +struct rte_dma_sg {
> > +     rte_iova_t src;
> > +     rte_iova_t dst;
> > +     uint32_t length;
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Get the total number of DMA devices that have been successfully
> > + * initialised.
> > + *
> > + * @return
> > + *   The total number of usable DMA devices.
> > + */
> > +__rte_experimental
> > +uint16_t
> > +rte_dmadev_count(void);
> > +
> > +/**
> > + * The capabilities of a DMA device
> > + */
> > +#define RTE_DMA_DEV_CAPA_MEM_TO_MEM  (1ull << 0)
> > +/**< DMA device support mem-to-mem transfer.
>
> Do we need this? Can we assume that any device appearing as a dmadev can
> do mem-to-mem copies, and drop the capability for mem-to-mem and the
> capability for copying?
>
> > + *
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_MEM_TO_DEV  (1ull << 1)
> > +/**< DMA device support slave mode & mem-to-dev transfer.
> > + *
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_DEV_TO_MEM  (1ull << 2)
> > +/**< DMA device support slave mode & dev-to-mem transfer.
> > + *
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_DEV_TO_DEV  (1ull << 3)
> > +/**< DMA device support slave mode & dev-to-dev transfer.
> > + *
>
> Just to confirm, are there devices currently planned for dmadev that

We are planning to use this support as our existing raw driver has this.

> supports only a subset of these flags? Thinking particularly of the
> dev-2-mem and mem-2-dev ones here - do any of the devices we are
> considering not support using device memory?
> [Again, just want to ensure we aren't adding too much stuff that we don't
> need yet]



>
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_OPS_COPY    (1ull << 4)
> > +/**< DMA device support copy ops.
> > + *
>
> Suggest dropping this and making it min for dmadev.
>
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_OPS_FILL    (1ull << 5)
> > +/**< DMA device support fill ops.
> > + *
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_OPS_SG              (1ull << 6)
> > +/**< DMA device support scatter-list ops.
> > + * If the device supports ops_copy and ops_sg, it means it supports copy_sg ops.
> > + * If the device supports ops_fill and ops_sg, it means it supports fill_sg ops.
> > + *
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_FENCE               (1ull << 7)
> > +/**< DMA device support fence.
> > + * If the device supports fence, then the application could set a fence flag
> > + * when enqueuing an operation via rte_dma_copy/copy_sg/fill/fill_sg.
> > + * If an operation has the fence flag set, the operation must be processed
> > + * only after all previous operations are completed.
> > + *
>
> Is this needed? As I understand it, the Marvell driver doesn't require
> fences so providing one is a no-op. Therefore, this flag is probably
> unnecessary.

+1

>
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_SVA         (1ull << 8)
> > +/**< DMA device support SVA which could use VA as DMA address.
> > + * If the device supports SVA then the application could pass any VA address,
> > + * e.g. memory from rte_malloc(), rte_memzone(), malloc, or stack memory.
> > + * If the device doesn't support SVA, then the application should pass an IOVA
> > + * address obtained from rte_malloc() or rte_memzone().
> > + *
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_MT_VCHAN    (1ull << 9)
> > +/**< DMA device support MT-safe of a virtual DMA channel.
> > + *
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
> > +#define RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN      (1ull << 10)
> > +/**< DMA device support MT-safe of different virtual DMA channels.
> > + *
> > + * @see struct rte_dmadev_info::dev_capa
> > + */
>
> As with comments above - let's check that these will actually be used
> before we add them.
>
> > +
> > +/**
> > + * A structure used to retrieve the contextual information of
> > + * a DMA device
> > + */
> > +struct rte_dmadev_info {
> > +     struct rte_device *device; /**< Generic Device information */
> > +     uint64_t dev_capa; /**< Device capabilities (RTE_DMA_DEV_CAPA_) */
> > +     /** Maximum number of virtual DMA channels supported */
> > +     uint16_t max_vchans;
> > +     /** Maximum allowed number of virtual DMA channel descriptors */
> > +     uint16_t max_desc;
> > +     /** Minimum allowed number of virtual DMA channel descriptors */
> > +     uint16_t min_desc;
> > +     uint16_t nb_vchans; /**< Number of virtual DMA channel configured */
> > +};
>
> Let's add rte_dmadev_conf struct into this to return the configuration
> settings.
>
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Retrieve the contextual information of a DMA device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param[out] dev_info
> > + *   A pointer to a structure of type *rte_dmadev_info* to be filled with the
> > + *   contextual information of the device.
> > + *
> > + * @return
> > + *   - =0: Success, driver updates the contextual information of the DMA device
> > + *   - <0: Error code returned by the driver info get function.
> > + *
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_info_get(uint16_t dev_id, struct rte_dmadev_info *dev_info);
> > +
>
> Should have "const" on second param.
>
> > +/**
> > + * A structure used to configure a DMA device.
> > + */
> > +struct rte_dmadev_conf {
> > +     /** Maximum number of virtual DMA channel to use.
> > +      * This value cannot be greater than the field 'max_vchans' of struct
> > +      * rte_dmadev_info which get from rte_dmadev_info_get().
> > +      */
> > +     uint16_t max_vchans;
> > +     /** Enable bit for MT-safe of a virtual DMA channel.
> > +      * This bit can be enabled only when the device supports
> > +      * RTE_DMA_DEV_CAPA_MT_VCHAN.
> > +      * @see RTE_DMA_DEV_CAPA_MT_VCHAN
> > +      */
> > +     uint8_t enable_mt_vchan : 1;
> > +     /** Enable bit for MT-safe of different virtual DMA channels.
> > +      * This bit can be enabled only when the device supports
> > +      * RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN.
> > +      * @see RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN
> > +      */
> > +     uint8_t enable_mt_multi_vchan : 1;
> > +     uint64_t reserved[2]; /**< Reserved for future fields */
> > +};
>
> Drop the reserved fields. ABI versioning is a better way to deal with
> adding new fields.

+1

>
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Configure a DMA device.
> > + *
> > + * This function must be invoked first before any other function in the
> > + * API. This function can also be re-invoked when a device is in the
> > + * stopped state.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device to configure.
> > + * @param dev_conf
> > + *   The DMA device configuration structure encapsulated into rte_dmadev_conf
> > + *   object.
> > + *
> > + * @return
> > + *   - =0: Success, device configured.
> > + *   - <0: Error code returned by the driver configuration function.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_configure(uint16_t dev_id, const struct rte_dmadev_conf *dev_conf);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Start a DMA device.
> > + *
> > + * The device start step is the last one and consists of setting the DMA
> > + * to start accepting jobs.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @return
> > + *   - =0: Success, device started.
> > + *   - <0: Error code returned by the driver start function.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_start(uint16_t dev_id);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Stop a DMA device.
> > + *
> > + * The device can be restarted with a call to rte_dmadev_start()
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @return
> > + *   - =0: Success, device stopped.
> > + *   - <0: Error code returned by the driver stop function.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_stop(uint16_t dev_id);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Close a DMA device.
> > + *
> > + * The device cannot be restarted after this call.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @return
> > + *  - =0: Successfully close device
> > + *  - <0: Failure to close device
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_close(uint16_t dev_id);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Reset a DMA device.
> > + *
> > + * This is different from cycle of rte_dmadev_start->rte_dmadev_stop in the
> > + * sense similar to hard or soft reset.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @return
> > + *   - =0: Successfully reset device.
> > + *   - <0: Failure to reset device.
> > + *   - (-ENOTSUP): If the device doesn't support this function.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_reset(uint16_t dev_id);
> > +
> > +/**
> > + * DMA transfer direction defines.
> > + */
> > +#define RTE_DMA_MEM_TO_MEM   (1ull << 0)
> > +/**< DMA transfer direction - from memory to memory.
> > + *
> > + * @see struct rte_dmadev_vchan_conf::direction
> > + */
> > +#define RTE_DMA_MEM_TO_DEV   (1ull << 1)
> > +/**< DMA transfer direction - slave mode & from memory to device.
> > + * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
> > + * this case, the ARM SoC works in slave mode and could initiate a DMA move
> > + * request from ARM memory to x86 host memory.
>
> For clarity, it would be good to specify in the scenario described which
> memory is the "mem" and which is the "dev" (I assume SoC memory is "mem"
> and x86 host memory is "dev"??)
>
> > + *
> > + * @see struct rte_dmadev_vchan_conf::direction
> > + */
> > +#define RTE_DMA_DEV_TO_MEM   (1ull << 2)
> > +/**< DMA transfer direction - slave mode & from device to memory.
> > + * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
> > + * this case, the ARM SoC works in slave mode and could initiate a DMA move
> > + * request from x86 host memory to ARM memory.
> > + *
> > + * @see struct rte_dmadev_vchan_conf::direction
> > + */
> > +#define RTE_DMA_DEV_TO_DEV   (1ull << 3)
> > +/**< DMA transfer direction - slave mode & from device to device.
> > + * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
> > + * this case, the ARM SoC works in slave mode and could initiate a DMA move
> > + * request from one x86 host memory region to another.
> > + *
> > + * @see struct rte_dmadev_vchan_conf::direction
> > + */
> > +#define RTE_DMA_TRANSFER_DIR_ALL     (RTE_DMA_MEM_TO_MEM | \
> > +                                      RTE_DMA_MEM_TO_DEV | \
> > +                                      RTE_DMA_DEV_TO_MEM | \
> > +                                      RTE_DMA_DEV_TO_DEV)
> > +
> > +/**
> > + * enum rte_dma_slave_port_type - slave mode type defines
> > + */
> > +enum rte_dma_slave_port_type {
> > +     /** The slave port is PCIE. */
> > +     RTE_DMA_SLAVE_PORT_PCIE = 1,
> > +};
> > +
>
> As previously mentioned, this needs to be updated to use other terms.
> For some suggested alternatives see:
> https://doc.dpdk.org/guides-21.05/contributing/coding_style.html#naming
>
> > +/**
> > + * A structure used to describe slave port parameters.
> > + */
> > +struct rte_dma_slave_port_parameters {
> > +     enum rte_dma_slave_port_type port_type;
> > +     union {
> > +             /** For PCIE port */
> > +             struct {
> > +                     /** The physical function number to use */
> > +                     uint64_t pf_number : 6;
> > +                     /** Virtual function enable bit */
> > +                     uint64_t vf_enable : 1;
> > +                     /** The virtual function number to use */
> > +                     uint64_t vf_number : 8;
> > +                     uint64_t pasid : 20;
> > +                     /** The attributes field in the TLP packet */
> > +                     uint64_t tlp_attr : 3;
> > +             };
> > +     };
> > +};
> > +
> > +/**
> > + * A structure used to configure a virtual DMA channel.
> > + */
> > +struct rte_dmadev_vchan_conf {
> > +     uint8_t direction; /**< Set of supported transfer directions */
> > +     /** Number of descriptor for the virtual DMA channel */
> > +     uint16_t nb_desc;
> > +     /** 1) Used to describe the dev parameter in the mem-to-dev/dev-to-mem
> > +      * transfer scenario.
> > +      * 2) Used to describe the src dev parameter in the dev-to-dev
> > +      * transfer scenario.
> > +      */
> > +     struct rte_dma_slave_port_parameters port;
> > +     /** Used to describe the dst dev parameters in the dev-to-dev
> > +      * transfer scenario.
> > +      */
> > +     struct rte_dma_slave_port_parameters peer_port;
> > +     uint64_t reserved[2]; /**< Reserved for future fields */
> > +};
>
> Let's drop the reserved fields and use ABI versioning if necessary in
> future.
>
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Allocate and set up a virtual DMA channel.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param conf
> > + *   The virtual DMA channel configuration structure encapsulated into
> > + *   rte_dmadev_vchan_conf object.
> > + *
> > + * @return
> > + *   - >=0: Allocate success, it is the virtual DMA channel id. This value must
> > + *          be less than the field 'max_vchans' of struct rte_dmadev_conf
> > +         which configured by rte_dmadev_configure().
>
> nit: whitespace error here.
>
> > + *   - <0: Error code returned by the driver virtual channel setup function.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_vchan_setup(uint16_t dev_id,
> > +                    const struct rte_dmadev_vchan_conf *conf);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Release a virtual DMA channel.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of the virtual DMA channel which is returned by vchan setup.
> > + *
> > + * @return
> > + *   - =0: Successfully release the virtual DMA channel.
> > + *   - <0: Error code returned by the driver virtual channel release function.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan);
> > +
> > +/**
> > + * rte_dmadev_stats - running statistics.
> > + */
> > +struct rte_dmadev_stats {
> > +     /** Count of operations which were successfully enqueued */
> > +     uint64_t enqueued_count;
> > +     /** Count of operations which were submitted to hardware */
> > +     uint64_t submitted_count;
> > +     /** Count of operations which failed to complete */
> > +     uint64_t completed_fail_count;
> > +     /** Count of operations which completed successfully */
> > +     uint64_t completed_count;
> > +     uint64_t reserved[4]; /**< Reserved for future fields */
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Retrieve basic statistics of one or all virtual DMA channels.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel, -1 means all channels.
> > + * @param[out] stats
> > + *   The basic statistics structure encapsulated into rte_dmadev_stats
> > + *   object.
> > + *
> > + * @return
> > + *   - =0: Successfully retrieve stats.
> > + *   - <0: Failure to retrieve stats.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_stats_get(uint16_t dev_id, int vchan,
>
> vchan as uint16_t rather than int, I think. This would apply to all
> dataplane functions. There is no need for a signed vchan value.
>
> > +                  struct rte_dmadev_stats *stats);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Reset basic statistics of one or all virtual DMA channels.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel, -1 means all channels.
> > + *
> > + * @return
> > + *   - =0: Successfully reset stats.
> > + *   - <0: Failure to reset stats.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_stats_reset(uint16_t dev_id, int vchan);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Dump DMA device info.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param f
> > + *   The file to write the output to.
> > + *
> > + * @return
> > + *   0 on success. Non-zero otherwise.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_dump(uint16_t dev_id, FILE *f);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Trigger the dmadev self test.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @return
> > + *   - 0: Selftest successful.
> > + *   - -ENOTSUP if the device doesn't support selftest
> > + *   - other values < 0 on failure.
> > + */
> > +__rte_experimental
> > +int
> > +rte_dmadev_selftest(uint16_t dev_id);
>
> I don't think this needs to be in the public API, since it should only be
> for the autotest app to use. Maybe move the prototype to the _pmd.h (since
> we don't have a separate internal header), and then the autotest app can
> pick it up from there.
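> 
> e.g. (sketch only; keeping the same prototype, just relocated):
> 
> 	/* rte_dmadev_pmd.h */
> 	__rte_internal
> 	int
> 	rte_dmadev_selftest(uint16_t dev_id);
> 
> so the symbol stays available to the autotest app without being part of
> the public application API.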
>
> > +
> > +#include "rte_dmadev_core.h"
> > +
> > +/**
> > + *  DMA flags to augment operation preparation.
> > + *  Used as the 'flags' parameter of rte_dmadev_copy/copy_sg/fill/fill_sg.
> > + */
> > +#define RTE_DMA_FLAG_FENCE   (1ull << 0)
> > +/**< DMA fence flag
> > + * It means the operation with this flag must be processed only after all
> > + * previous operations are completed.
> > + *
> > + * @see rte_dmadev_copy()
> > + * @see rte_dmadev_copy_sg()
> > + * @see rte_dmadev_fill()
> > + * @see rte_dmadev_fill_sg()
> > + */
>
> As a general comment, I think all these multi-line comments should go
> before the item they describe. Comments after should only be used in the
> case where the comment fits on the rest of the line after a value.
>
> We also should define the SUBMIT flag as suggested by Jerin, to allow apps
> to automatically submit jobs after enqueue.
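> 
> e.g. something like (name and bit position purely illustrative):
> 
> 	#define RTE_DMA_FLAG_SUBMIT	(1ull << 1)
> 	/**< DMA submit flag.
> 	 * Ring the doorbell implicitly after enqueuing this operation, so no
> 	 * separate rte_dmadev_submit() call is needed for it.
> 	 */
> 
> so that apps doing one-off copies can avoid the extra function call.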
>
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue a copy operation onto the virtual DMA channel.
> > + *
> > + * This queues up a copy operation to be performed by hardware, but does not
> > + * trigger hardware to begin that operation.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + * @param src
> > + *   The address of the source buffer.
> > + * @param dst
> > + *   The address of the destination buffer.
> > + * @param length
> > + *   The length of the data to be copied.
> > + * @param flags
> > + *   Flags for this operation.
> > + *
> > + * @return
> > + *   - 0..UINT16_MAX: index of enqueued copy job.
> > + *   - <0: Error code returned by the driver copy function.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_dmadev_copy(uint16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
> > +             uint32_t length, uint64_t flags)
> > +{
> > +     struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +#ifdef RTE_DMADEV_DEBUG
> > +     RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(*dev->copy, -ENOTSUP);
> > +     if (vchan >= dev->data->dev_conf.max_vchans) {
> > +             RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > +             return -EINVAL;
> > +     }
> > +#endif
> > +     return (*dev->copy)(dev, vchan, src, dst, length, flags);
> > +}
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue a scatter list copy operation onto the virtual DMA channel.
> > + *
> > + * This queues up a scatter list copy operation to be performed by hardware,
> > + * but does not trigger hardware to begin that operation.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + * @param sg
> > + *   The pointer of scatterlist.
> > + * @param sg_len
> > + *   The number of scatterlist elements.
> > + * @param flags
> > + *   Flags for this operation.
> > + *
> > + * @return
> > + *   - 0..UINT16_MAX: index of enqueued copy job.
> > + *   - <0: Error code returned by the driver copy function.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, const struct rte_dma_sg *sg,
> > +                uint32_t sg_len, uint64_t flags)
> > +{
> > +     struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +#ifdef RTE_DMADEV_DEBUG
> > +     RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(sg, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(*dev->copy_sg, -ENOTSUP);
> > +     if (vchan >= dev->data->dev_conf.max_vchans) {
> > +             RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > +             return -EINVAL;
> > +     }
> > +#endif
> > +     return (*dev->copy_sg)(dev, vchan, sg, sg_len, flags);
> > +}
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue a fill operation onto the virtual DMA channel.
> > + *
> > + * This queues up a fill operation to be performed by hardware, but does not
> > + * trigger hardware to begin that operation.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + * @param pattern
> > + *   The pattern to populate the destination buffer with.
> > + * @param dst
> > + *   The address of the destination buffer.
> > + * @param length
> > + *   The length of the destination buffer.
> > + * @param flags
> > + *   Flags for this operation.
> > + *
> > + * @return
> > + *   - 0..UINT16_MAX: index of enqueued fill job.
> > + *   - <0: Error code returned by the driver fill function.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> > +             rte_iova_t dst, uint32_t length, uint64_t flags)
> > +{
> > +     struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +#ifdef RTE_DMADEV_DEBUG
> > +     RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
> > +     if (vchan >= dev->data->dev_conf.max_vchans) {
> > +             RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > +             return -EINVAL;
> > +     }
> > +#endif
> > +     return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
> > +}
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue a scatter list fill operation onto the virtual DMA channel.
> > + *
> > + * This queues up a scatter list fill operation to be performed by hardware,
> > + * but does not trigger hardware to begin that operation.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + * @param pattern
> > + *   The pattern to populate the destination buffer with.
> > + * @param sg
> > + *   The pointer of scatterlist.
> > + * @param sg_len
> > + *   The number of scatterlist elements.
> > + * @param flags
> > + *   Flags for this operation.
> > + *
> > + * @return
> > + *   - 0..UINT16_MAX: index of enqueued fill job.
> > + *   - <0: Error code returned by the driver fill function.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_dmadev_fill_sg(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
> > +                const struct rte_dma_sg *sg, uint32_t sg_len,
> > +                uint64_t flags)
> > +{
> > +     struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +#ifdef RTE_DMADEV_DEBUG
> > +     RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(sg, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(*dev->fill_sg, -ENOTSUP);
> > +     if (vchan >= dev->data->dev_conf.max_vchans) {
> > +             RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > +             return -EINVAL;
> > +     }
> > +#endif
> > +     return (*dev->fill_sg)(dev, vchan, pattern, sg, sg_len, flags);
> > +}
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Trigger hardware to begin performing enqueued operations.
> > + *
> > + * This API is used to write the "doorbell" to the hardware to trigger it
> > + * to begin the operations previously enqueued by rte_dmadev_copy/fill()
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + *
> > + * @return
> > + *   - =0: Successfully trigger hardware.
> > + *   - <0: Failure to trigger hardware.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_dmadev_submit(uint16_t dev_id, uint16_t vchan)
> > +{
> > +     struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +#ifdef RTE_DMADEV_DEBUG
> > +     RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(*dev->submit, -ENOTSUP);
> > +     if (vchan >= dev->data->dev_conf.max_vchans) {
> > +             RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > +             return -EINVAL;
> > +     }
> > +#endif
> > +     return (*dev->submit)(dev, vchan);
> > +}
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Returns the number of operations that have been successfully completed.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + * @param nb_cpls
> > + *   The maximum number of completed operations that can be processed.
> > + * @param[out] last_idx
> > + *   The last completed operation's index.
> > + *   If not required, NULL can be passed in.
> > + * @param[out] has_error
> > + *   Indicates if there are transfer errors.
> > + *   If not required, NULL can be passed in.
> > + *
> > + * @return
> > + *   The number of operations that successfully completed.
> > + */
> > +__rte_experimental
> > +static inline uint16_t
> > +rte_dmadev_completed(uint16_t dev_id, uint16_t vchan, const uint16_t nb_cpls,
> > +                  uint16_t *last_idx, bool *has_error)
> > +{
> > +     struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +     uint16_t idx;
> > +     bool err;
> > +
> > +#ifdef RTE_DMADEV_DEBUG
> > +     RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(*dev->completed, -ENOTSUP);
> > +     if (vchan >= dev->data->dev_conf.max_vchans) {
> > +             RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > +             return -EINVAL;
> > +     }
> > +     if (nb_cpls == 0) {
> > +             RTE_DMADEV_LOG(ERR, "Invalid nb_cpls\n");
> > +             return -EINVAL;
> > +     }
> > +#endif
> > +
> > +     /* Ensure the pointer values are non-null to simplify drivers.
> > +      * In most cases these should be compile time evaluated, since this is
> > +      * an inline function.
> > +      * - If NULL is explicitly passed as parameter, then compiler knows the
> > +      *   value is NULL
> > +      * - If address of local variable is passed as parameter, then compiler
> > +      *   can know it's non-NULL.
> > +      */
> > +     if (last_idx == NULL)
> > +             last_idx = &idx;
> > +     if (has_error == NULL)
> > +             has_error = &err;
> > +
> > +     *has_error = false;
> > +     return (*dev->completed)(dev, vchan, nb_cpls, last_idx, has_error);
> > +}
> > +
> > +/**
> > + * DMA transfer status code defines
> > + */
> > +enum rte_dma_status_code {
> > +     /** The operation completed successfully */
> > +     RTE_DMA_STATUS_SUCCESSFUL = 0,
> > +     /** The operation failed to complete due to an active drop.
> > +      * This is mainly used when processing dev_stop, allowing outstanding
> > +      * requests to be completed as much as possible.
> > +      */
> > +     RTE_DMA_STATUS_ACTIVE_DROP,
> > +     /** The operation failed to complete due to an invalid source address */
> > +     RTE_DMA_STATUS_INVALID_SRC_ADDR,
> > +     /** The operation failed to complete due to an invalid destination address */
> > +     RTE_DMA_STATUS_INVALID_DST_ADDR,
> > +     /** The operation failed to complete due to an invalid length */
> > +     RTE_DMA_STATUS_INVALID_LENGTH,
> > +     /** The operation failed to complete due to an invalid opcode.
> > +      * The DMA descriptor could have multiple formats, which are
> > +      * distinguished by the opcode field.
> > +      */
> > +     RTE_DMA_STATUS_INVALID_OPCODE,
> > +     /** The operation failed to complete due to a bus error */
> > +     RTE_DMA_STATUS_BUS_ERROR,
> > +     /** The operation failed to complete due to data poison */
> > +     RTE_DMA_STATUS_DATA_POISON,
> > +     /** The operation failed to complete due to a descriptor read error */
> > +     RTE_DMA_STATUS_DESCRIPTOR_READ_ERROR,
> > +     /** The operation failed to complete due to a device link error.
> > +      * Used to indicate a link error in the mem-to-dev/dev-to-mem/
> > +      * dev-to-dev transfer scenario.
> > +      */
> > +     RTE_DMA_STATUS_DEV_LINK_ERROR,
> > +     /** Driver-specific status code offset.
> > +      * The first status code available for drivers to define their own
> > +      * error codes.
> > +      */
> > +     RTE_DMA_STATUS_DRV_SPECIFIC_OFFSET = 0x10000,
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Returns the number of operations that failed to complete.
> > + * NOTE: This API should be used after rte_dmadev_completed() indicates that has_error was set.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + * @param nb_status
> > + *   Indicates the size of status array.
> > + * @param[out] status
> > + *   The error code of operations that failed to complete.
> > + *   Some standard error code are described in 'enum rte_dma_status_code'
> > + *   @see rte_dma_status_code
> > + * @param[out] last_idx
> > + *   The last failed completed operation's index.
> > + *
> > + * @return
> > + *   The number of operations that failed to complete.
> > + */
> > +__rte_experimental
> > +static inline uint16_t
> > +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vchan,
> > +                        const uint16_t nb_status, uint32_t *status,
> > +                        uint16_t *last_idx)
> > +{
> > +     struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +#ifdef RTE_DMADEV_DEBUG
> > +     RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(status, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(last_idx, -EINVAL);
> > +     RTE_FUNC_PTR_OR_ERR_RET(*dev->completed_fails, -ENOTSUP);
> > +     if (vchan >= dev->data->dev_conf.max_vchans) {
> > +             RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> > +             return -EINVAL;
> > +     }
> > +     if (nb_status == 0) {
> > +             RTE_DMADEV_LOG(ERR, "Invalid nb_status\n");
> > +             return -EINVAL;
> > +     }
> > +#endif
> > +     return (*dev->completed_fails)(dev, vchan, nb_status, status, last_idx);
> > +}
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_DMADEV_H_ */
> > diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
> > new file mode 100644
> > index 0000000..410faf0
> > --- /dev/null
> > +++ b/lib/dmadev/rte_dmadev_core.h
> > @@ -0,0 +1,159 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2021 HiSilicon Limited.
> > + * Copyright(c) 2021 Intel Corporation.
> > + */
> > +
> > +#ifndef _RTE_DMADEV_CORE_H_
> > +#define _RTE_DMADEV_CORE_H_
> > +
> > +/**
> > + * @file
> > + *
> > + * RTE DMA Device internal header.
> > + *
> > + * This header contains internal data types that are used by the DMA devices
> > + * in order to expose their ops to the class.
> > + *
> > + * Applications should not use these APIs directly.
> > + *
> > + */
> > +
> > +struct rte_dmadev;
> > +
> > +/** @internal Used to get device information of a device. */
> > +typedef int (*dmadev_info_get_t)(struct rte_dmadev *dev,
> > +                              struct rte_dmadev_info *dev_info);
>
> First parameter can be "const"
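> 
> i.e. something like (sketch, same typedef with the suggested const):
> 
> 	typedef int (*dmadev_info_get_t)(const struct rte_dmadev *dev,
> 					 struct rte_dmadev_info *dev_info);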
>
> > +/** @internal Used to configure a device. */
> > +typedef int (*dmadev_configure_t)(struct rte_dmadev *dev,
> > +                               const struct rte_dmadev_conf *dev_conf);
> > +
> > +/** @internal Used to start a configured device. */
> > +typedef int (*dmadev_start_t)(struct rte_dmadev *dev);
> > +
> > +/** @internal Used to stop a configured device. */
> > +typedef int (*dmadev_stop_t)(struct rte_dmadev *dev);
> > +
> > +/** @internal Used to close a configured device. */
> > +typedef int (*dmadev_close_t)(struct rte_dmadev *dev);
> > +
> > +/** @internal Used to reset a configured device. */
> > +typedef int (*dmadev_reset_t)(struct rte_dmadev *dev);
> > +
> > +/** @internal Used to allocate and set up a virtual DMA channel. */
> > +typedef int (*dmadev_vchan_setup_t)(struct rte_dmadev *dev,
> > +                                 const struct rte_dmadev_vchan_conf *conf);
> > +
> > +/** @internal Used to release a virtual DMA channel. */
> > +typedef int (*dmadev_vchan_release_t)(struct rte_dmadev *dev, uint16_t vchan);
> > +
> > +/** @internal Used to retrieve basic statistics. */
> > +typedef int (*dmadev_stats_get_t)(struct rte_dmadev *dev, int vchan,
> > +                               struct rte_dmadev_stats *stats);
>
> First parameter can be "const"
>
> > +
> > +/** @internal Used to reset basic statistics. */
> > +typedef int (*dmadev_stats_reset_t)(struct rte_dmadev *dev, int vchan);
> > +
> > +/** @internal Used to dump internal information. */
> > +typedef int (*dmadev_dump_t)(struct rte_dmadev *dev, FILE *f);
> > +
>
> First param "const"
>
> > +/** @internal Used to start dmadev selftest. */
> > +typedef int (*dmadev_selftest_t)(uint16_t dev_id);
> > +
>
> This looks like an outlier taking a dev_id. It should take a dmadev parameter.
> Most drivers should not need to implement this anyway, as the main unit
> tests should be in "test_dmadev.c" in the autotest app.
>
> > +/** @internal Used to enqueue a copy operation. */
> > +typedef int (*dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vchan,
> > +                          rte_iova_t src, rte_iova_t dst,
> > +                          uint32_t length, uint64_t flags);
> > +
> > +/** @internal Used to enqueue a scatter list copy operation. */
> > +typedef int (*dmadev_copy_sg_t)(struct rte_dmadev *dev, uint16_t vchan,
> > +                             const struct rte_dma_sg *sg,
> > +                             uint32_t sg_len, uint64_t flags);
> > +
> > +/** @internal Used to enqueue a fill operation. */
> > +typedef int (*dmadev_fill_t)(struct rte_dmadev *dev, uint16_t vchan,
> > +                          uint64_t pattern, rte_iova_t dst,
> > +                          uint32_t length, uint64_t flags);
> > +
> > +/** @internal Used to enqueue a scatter list fill operation. */
> > +typedef int (*dmadev_fill_sg_t)(struct rte_dmadev *dev, uint16_t vchan,
> > +                     uint64_t pattern, const struct rte_dma_sg *sg,
> > +                     uint32_t sg_len, uint64_t flags);
> > +
> > +/** @internal Used to trigger hardware to begin working. */
> > +typedef int (*dmadev_submit_t)(struct rte_dmadev *dev, uint16_t vchan);
> > +
> > +/** @internal Used to return the number of successfully completed operations. */
> > +typedef uint16_t (*dmadev_completed_t)(struct rte_dmadev *dev, uint16_t vchan,
> > +                                    const uint16_t nb_cpls,
> > +                                    uint16_t *last_idx, bool *has_error);
> > +
> > +/** @internal Used to return number of failed completed operations. */
> > +typedef uint16_t (*dmadev_completed_fails_t)(struct rte_dmadev *dev,
> > +                     uint16_t vchan, const uint16_t nb_status,
> > +                     uint32_t *status, uint16_t *last_idx);
> > +
> > +/**
> > + * DMA device operations function pointer table
> > + */
> > +struct rte_dmadev_ops {
> > +     dmadev_info_get_t dev_info_get;
> > +     dmadev_configure_t dev_configure;
> > +     dmadev_start_t dev_start;
> > +     dmadev_stop_t dev_stop;
> > +     dmadev_close_t dev_close;
> > +     dmadev_reset_t dev_reset;
> > +     dmadev_vchan_setup_t vchan_setup;
> > +     dmadev_vchan_release_t vchan_release;
> > +     dmadev_stats_get_t stats_get;
> > +     dmadev_stats_reset_t stats_reset;
> > +     dmadev_dump_t dev_dump;
> > +     dmadev_selftest_t dev_selftest;
> > +};
> > +
> > +/**
> > + * @internal
> > + * The data part, with no function pointers, associated with each DMA device.
> > + *
> > + * This structure is safe to place in shared memory to be common among different
> > + * processes in a multi-process configuration.
> > + */
> > +struct rte_dmadev_data {
> > +     uint16_t dev_id; /**< Device [external] identifier. */
> > +     char dev_name[RTE_DMADEV_NAME_MAX_LEN]; /**< Unique identifier name */
> > +     void *dev_private; /**< PMD-specific private data. */
> > +     struct rte_dmadev_conf dev_conf; /**< DMA device configuration. */
> > +     uint8_t dev_started : 1; /**< Device state: STARTED(1)/STOPPED(0). */
> > +     uint64_t reserved[4]; /**< Reserved for future fields */
> > +} __rte_cache_aligned;
> > +
>
> While I generally don't like having reserved space, this is one place where
> it makes sense, so +1 for it here.
>
> > +/**
> > + * @internal
> > + * The generic data structure associated with each DMA device.
> > + *
> > + * The dataplane APIs are located at the beginning of the structure, along
> > + * with the pointer to where all the data elements for the particular device
> > + * are stored in shared memory. This split scheme allows the function pointer
> > + * and driver data to be per-process, while the actual configuration data for
> > + * the device is shared.
> > + */
> > +struct rte_dmadev {
> > +     dmadev_copy_t copy;
> > +     dmadev_copy_sg_t copy_sg;
> > +     dmadev_fill_t fill;
> > +     dmadev_fill_sg_t fill_sg;
> > +     dmadev_submit_t submit;
> > +     dmadev_completed_t completed;
> > +     dmadev_completed_fails_t completed_fails;
> > +     const struct rte_dmadev_ops *dev_ops; /**< Functions exported by PMD. */
> > +     /** Flag indicating the device is attached: ATTACHED(1)/DETACHED(0). */
> > +     uint8_t attached : 1;
>
> Since it's in the midst of a series of pointers, this 1-bit flag is
> actually using 8 bytes of space. Is it needed? Can we use dev_ops == NULL
> or data == NULL instead to indicate this is a valid entry?
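> 
> e.g. (sketch only, using the field names from this patch):
> 
> 	/* a slot is in use iff its shared data pointer has been assigned */
> 	static inline bool
> 	dmadev_is_attached(const struct rte_dmadev *dev)
> 	{
> 		return dev->data != NULL;
> 	}
> 
> which would free up those 8 bytes in the hot part of the struct.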
>
> > +     /** Device info which supplied during device initialization. */
> > +     struct rte_device *device;
> > +     struct rte_dmadev_data *data; /**< Pointer to device data. */
>
> If we are to try and minimise cacheline access, we should put this data
> pointer - or even better a copy of data->private pointer - at the top of
> the structure on the same cacheline as datapath operations. For dataplane,
> I can't see any elements of data, except the private pointer being
> accessed, so we would probably get most benefit from having a copy put
> there at init of the dmadev struct.
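> 
> i.e. roughly (illustrative layout only, not a concrete proposal):
> 
> 	struct rte_dmadev {
> 		void *dev_private; /* copy of data->dev_private, hot path */
> 		dmadev_copy_t copy;
> 		dmadev_copy_sg_t copy_sg;
> 		dmadev_fill_t fill;
> 		dmadev_fill_sg_t fill_sg;
> 		dmadev_submit_t submit;
> 		dmadev_completed_t completed;
> 		dmadev_completed_fails_t completed_fails;
> 		/* slow-path members follow on later cachelines */
> 		const struct rte_dmadev_ops *dev_ops;
> 		struct rte_device *device;
> 		struct rte_dmadev_data *data;
> 	} __rte_cache_aligned;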
>
> > +     uint64_t reserved[4]; /**< Reserved for future fields */
> > +} __rte_cache_aligned;
> > +
> > +extern struct rte_dmadev rte_dmadevices[];
> > +
> > +#endif /* _RTE_DMADEV_CORE_H_ */
> > diff --git a/lib/dmadev/rte_dmadev_pmd.h b/lib/dmadev/rte_dmadev_pmd.h
> > new file mode 100644
> > index 0000000..45141f9
> > --- /dev/null
> > +++ b/lib/dmadev/rte_dmadev_pmd.h
> > @@ -0,0 +1,72 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2021 HiSilicon Limited.
> > + */
> > +
> > +#ifndef _RTE_DMADEV_PMD_H_
> > +#define _RTE_DMADEV_PMD_H_
> > +
> > +/**
> > + * @file
> > + *
> > + * RTE DMA Device PMD APIs
> > + *
> > + * Driver facing APIs for a DMA device. These are not to be called directly by
> > + * any application.
> > + */
> > +
> > +#include "rte_dmadev.h"
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +/**
> > + * @internal
> > + * Allocates a new dmadev slot for a DMA device and returns the pointer
> > + * to that slot for the driver to use.
> > + *
> > + * @param name
> > + *   DMA device name.
> > + *
> > + * @return
> > + *   A pointer to the DMA device slot in case of success,
> > + *   NULL otherwise.
> > + */
> > +__rte_internal
> > +struct rte_dmadev *
> > +rte_dmadev_pmd_allocate(const char *name);
> > +
> > +/**
> > + * @internal
> > + * Release the specified dmadev.
> > + *
> > + * @param dev
> > + *   Device to be released.
> > + *
> > + * @return
> > + *   - 0 on success, negative on error
> > + */
> > +__rte_internal
> > +int
> > +rte_dmadev_pmd_release(struct rte_dmadev *dev);
> > +
> > +/**
> > + * @internal
> > + * Return the DMA device based on the device name.
> > + *
> > + * @param name
> > + *   DMA device name.
> > + *
> > + * @return
> > + *   A pointer to the DMA device slot in case of success,
> > + *   NULL otherwise.
> > + */
> > +__rte_internal
> > +struct rte_dmadev *
> > +rte_dmadev_get_device_by_name(const char *name);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_DMADEV_PMD_H_ */
> > diff --git a/lib/dmadev/version.map b/lib/dmadev/version.map
> > new file mode 100644
> > index 0000000..0f099e7
> > --- /dev/null
> > +++ b/lib/dmadev/version.map
> > @@ -0,0 +1,40 @@
> > +EXPERIMENTAL {
> > +     global:
> > +
> > +     rte_dmadev_count;
> > +     rte_dmadev_info_get;
> > +     rte_dmadev_configure;
> > +     rte_dmadev_start;
> > +     rte_dmadev_stop;
> > +     rte_dmadev_close;
> > +     rte_dmadev_reset;
> > +     rte_dmadev_vchan_setup;
> > +     rte_dmadev_vchan_release;
> > +     rte_dmadev_stats_get;
> > +     rte_dmadev_stats_reset;
> > +     rte_dmadev_dump;
> > +     rte_dmadev_selftest;
> > +     rte_dmadev_copy;
> > +     rte_dmadev_copy_sg;
> > +     rte_dmadev_fill;
> > +     rte_dmadev_fill_sg;
> > +     rte_dmadev_submit;
> > +     rte_dmadev_completed;
> > +     rte_dmadev_completed_fails;
> > +
> > +     local: *;
> > +};
>
> The elements in the version.map file blocks should be sorted alphabetically.
>
> > +
> > +INTERNAL {
> > +        global:
> > +
> > +     rte_dmadevices;
> > +     rte_dmadev_pmd_allocate;
> > +     rte_dmadev_pmd_release;
> > +     rte_dmadev_get_device_by_name;
> > +
> > +     local:
> > +
> > +     rte_dmadev_is_valid_dev;
> > +};
> > +
> > diff --git a/lib/meson.build b/lib/meson.build
> > index 1673ca4..68d239f 100644
> > --- a/lib/meson.build
> > +++ b/lib/meson.build
> > @@ -60,6 +60,7 @@ libraries = [
> >          'bpf',
> >          'graph',
> >          'node',
> > +        'dmadev',
> >  ]
> >
> >  if is_windows
> > --
> > 2.8.1
> >
  
Ananyev, Konstantin July 13, 2021, 2:19 p.m. UTC | #14
> +#include "rte_dmadev_core.h"
> +
> +/**
> + *  DMA flags to augment operation preparation.
> + *  Used as the 'flags' parameter of rte_dmadev_copy/copy_sg/fill/fill_sg.
> + */
> +#define RTE_DMA_FLAG_FENCE	(1ull << 0)
> +/**< DMA fence flag
> + * It means the operation with this flag must be processed only after all
> + * previous operations are completed.
> + *
> + * @see rte_dmadev_copy()
> + * @see rte_dmadev_copy_sg()
> + * @see rte_dmadev_fill()
> + * @see rte_dmadev_fill_sg()
> + */
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a copy operation onto the virtual DMA channel.
> + *
> + * This queues up a copy operation to be performed by hardware, but does not
> + * trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vchan
> + *   The identifier of virtual DMA channel.
> + * @param src
> + *   The address of the source buffer.
> + * @param dst
> + *   The address of the destination buffer.
> + * @param length
> + *   The length of the data to be copied.
> + * @param flags
> + *   Flags for this operation.
> + *
> + * @return
> + *   - 0..UINT16_MAX: index of enqueued copy job.
> + *   - <0: Error code returned by the driver copy function.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_copy(uint16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
> +		uint32_t length, uint64_t flags)
> +{
> +	struct rte_dmadev *dev = &rte_dmadevices[dev_id];

One question I have - did you guys consider hiding definitions of struct rte_dmadev 
and  rte_dmadevices[] into .c straight from the start?
Probably no point to repeat our famous ABI ethdev/cryptodev/... pitfalls here.  

> +#ifdef RTE_DMADEV_DEBUG
> +	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->copy, -ENOTSUP);
> +	if (vchan >= dev->data->dev_conf.max_vchans) {
> +		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
> +		return -EINVAL;
> +	}
> +#endif
> +	return (*dev->copy)(dev, vchan, src, dst, length, flags);
> +}
> +
> +/**
  
Bruce Richardson July 13, 2021, 2:28 p.m. UTC | #15
On Tue, Jul 13, 2021 at 03:19:39PM +0100, Ananyev, Konstantin wrote:
> 
> > +#include "rte_dmadev_core.h"
> > +
> > +/**
> > + *  DMA flags to augment operation preparation.
> > + *  Used as the 'flags' parameter of rte_dmadev_copy/copy_sg/fill/fill_sg.
> > + */
> > +#define RTE_DMA_FLAG_FENCE   (1ull << 0)
> > +/**< DMA fence flag
> > + * It means the operation with this flag must be processed only after all
> > + * previous operations are completed.
> > + *
> > + * @see rte_dmadev_copy()
> > + * @see rte_dmadev_copy_sg()
> > + * @see rte_dmadev_fill()
> > + * @see rte_dmadev_fill_sg()
> > + */
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue a copy operation onto the virtual DMA channel.
> > + *
> > + * This queues up a copy operation to be performed by hardware, but does not
> > + * trigger hardware to begin that operation.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vchan
> > + *   The identifier of virtual DMA channel.
> > + * @param src
> > + *   The address of the source buffer.
> > + * @param dst
> > + *   The address of the destination buffer.
> > + * @param length
> > + *   The length of the data to be copied.
> > + * @param flags
> > + *   Flags for this operation.
> > + *
> > + * @return
> > + *   - 0..UINT16_MAX: index of enqueued copy job.
> > + *   - <0: Error code returned by the driver copy function.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_dmadev_copy(uint16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
> > +             uint32_t length, uint64_t flags)
> > +{
> > +     struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> 
> One question I have - did you guys consider hiding definitions of struct rte_dmadev
> and  rte_dmadevices[] into .c straight from the start?
> Probably no point to repeat our famous ABI ethdev/cryptodev/... pitfalls here.
> 
I considered it, but I found even moving one operation (the doorbell one)
to be non-inline made a small but noticeable perf drop. Until we get all the
drivers done and more testing in various scenarios, I'd rather err on the
side of getting the best performance.
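
For reference, the non-inline alternative under discussion would look roughly
like this (sketch only, not what this patch does):

	/* rte_dmadev.c - struct rte_dmadev stays private to the library */
	int
	rte_dmadev_copy(uint16_t dev_id, uint16_t vchan, rte_iova_t src,
			rte_iova_t dst, uint32_t length, uint64_t flags)
	{
		struct rte_dmadev *dev = &rte_dmadevices[dev_id];

		return (*dev->copy)(dev, vchan, src, dst, length, flags);
	}

trading one extra function call per enqueue for the freedom to change the
struct layout later without ABI breakage.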
  

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 4347555..0595239 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -496,6 +496,10 @@  F: drivers/raw/skeleton/
 F: app/test/test_rawdev.c
 F: doc/guides/prog_guide/rawdev.rst
 
+DMA device API - EXPERIMENTAL
+M: Chengwen Feng <fengchengwen@huawei.com>
+F: lib/dmadev/
+
 
 Memory Pool Drivers
 -------------------
diff --git a/config/rte_config.h b/config/rte_config.h
index 590903c..331a431 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -81,6 +81,9 @@ 
 /* rawdev defines */
 #define RTE_RAWDEV_MAX_DEVS 64
 
+/* dmadev defines */
+#define RTE_DMADEV_MAX_DEVS 64
+
 /* ip_fragmentation defines */
 #define RTE_LIBRTE_IP_FRAG_MAX_FRAG 4
 #undef RTE_LIBRTE_IP_FRAG_TBL_STAT
diff --git a/lib/dmadev/meson.build b/lib/dmadev/meson.build
new file mode 100644
index 0000000..c918dae
--- /dev/null
+++ b/lib/dmadev/meson.build
@@ -0,0 +1,6 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 HiSilicon Limited.
+
+sources = files('rte_dmadev.c')
+headers = files('rte_dmadev.h', 'rte_dmadev_pmd.h')
+indirect_headers += files('rte_dmadev_core.h')
diff --git a/lib/dmadev/rte_dmadev.c b/lib/dmadev/rte_dmadev.c
new file mode 100644
index 0000000..8a29abb
--- /dev/null
+++ b/lib/dmadev/rte_dmadev.c
@@ -0,0 +1,560 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited.
+ * Copyright(c) 2021 Intel Corporation.
+ */
+
+#include <ctype.h>
+#include <inttypes.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_debug.h>
+#include <rte_dev.h>
+#include <rte_eal.h>
+#include <rte_errno.h>
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memzone.h>
+#include <rte_malloc.h>
+#include <rte_string_fns.h>
+
+#include "rte_dmadev.h"
+#include "rte_dmadev_pmd.h"
+
+RTE_LOG_REGISTER(rte_dmadev_logtype, lib.dmadev, INFO);
+
+struct rte_dmadev rte_dmadevices[RTE_DMADEV_MAX_DEVS];
+
+static const char *MZ_RTE_DMADEV_DATA = "rte_dmadev_data";
+/* Shared memory between primary and secondary processes. */
+static struct {
+	struct rte_dmadev_data data[RTE_DMADEV_MAX_DEVS];
+} *dmadev_shared_data;
+
+static int
+dmadev_check_name(const char *name)
+{
+	size_t name_len;
+
+	if (name == NULL) {
+		RTE_DMADEV_LOG(ERR, "Name can't be NULL\n");
+		return -EINVAL;
+	}
+
+	name_len = strnlen(name, RTE_DMADEV_NAME_MAX_LEN);
+	if (name_len == 0) {
+		RTE_DMADEV_LOG(ERR, "Zero length DMA device name\n");
+		return -EINVAL;
+	}
+	if (name_len >= RTE_DMADEV_NAME_MAX_LEN) {
+		RTE_DMADEV_LOG(ERR, "DMA device name is too long\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static uint16_t
+dmadev_find_free_dev(void)
+{
+	uint16_t i;
+
+	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
+		if (dmadev_shared_data->data[i].dev_name[0] == '\0') {
+			RTE_ASSERT(rte_dmadevices[i].attached == 0);
+			return i;
+		}
+	}
+
+	return RTE_DMADEV_MAX_DEVS;
+}
+
+static struct rte_dmadev*
+dmadev_allocated(const char *name)
+{
+	uint16_t i;
+
+	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
+		if ((rte_dmadevices[i].attached == 1) &&
+		    (!strcmp(name, rte_dmadevices[i].data->dev_name)))
+			return &rte_dmadevices[i];
+	}
+
+	return NULL;
+}
+
+static int
+dmadev_shared_data_prepare(void)
+{
+	const struct rte_memzone *mz;
+
+	if (dmadev_shared_data == NULL) {
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			/* Allocate shared memory for device data. */
+			mz = rte_memzone_reserve(MZ_RTE_DMADEV_DATA,
+					 sizeof(*dmadev_shared_data),
+					 rte_socket_id(), 0);
+		} else {
+			mz = rte_memzone_lookup(MZ_RTE_DMADEV_DATA);
+		}
+		if (mz == NULL)
+			return -ENOMEM;
+
+		dmadev_shared_data = mz->addr;
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+			memset(dmadev_shared_data->data, 0,
+			       sizeof(dmadev_shared_data->data));
+	}
+
+	return 0;
+}
+
+static struct rte_dmadev *
+dmadev_allocate(const char *name)
+{
+	struct rte_dmadev *dev;
+	uint16_t dev_id;
+
+	dev = dmadev_allocated(name);
+	if (dev != NULL) {
+		RTE_DMADEV_LOG(ERR, "DMA device already allocated\n");
+		return NULL;
+	}
+
+	dev_id = dmadev_find_free_dev();
+	if (dev_id == RTE_DMADEV_MAX_DEVS) {
+		RTE_DMADEV_LOG(ERR, "Reached maximum number of DMA devices\n");
+		return NULL;
+	}
+
+	if (dmadev_shared_data_prepare() != 0) {
+		RTE_DMADEV_LOG(ERR, "Cannot allocate DMA shared data\n");
+		return NULL;
+	}
+
+	dev = &rte_dmadevices[dev_id];
+	dev->data = &dmadev_shared_data->data[dev_id];
+	dev->data->dev_id = dev_id;
+	strlcpy(dev->data->dev_name, name, sizeof(dev->data->dev_name));
+
+	return dev;
+}
+
+static struct rte_dmadev *
+dmadev_attach_secondary(const char *name)
+{
+	struct rte_dmadev *dev;
+	uint16_t i;
+
+	if (dmadev_shared_data_prepare() != 0) {
+		RTE_DMADEV_LOG(ERR, "Cannot allocate DMA shared data\n");
+		return NULL;
+	}
+
+	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
+		if (!strcmp(dmadev_shared_data->data[i].dev_name, name))
+			break;
+	}
+	if (i == RTE_DMADEV_MAX_DEVS) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %s is not driven by the primary process\n",
+			name);
+		return NULL;
+	}
+
+	dev = &rte_dmadevices[i];
+	dev->data = &dmadev_shared_data->data[i];
+	RTE_ASSERT(dev->data->dev_id == i);
+
+	return dev;
+}
+
+struct rte_dmadev *
+rte_dmadev_pmd_allocate(const char *name)
+{
+	struct rte_dmadev *dev;
+
+	if (dmadev_check_name(name) != 0)
+		return NULL;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		dev = dmadev_allocate(name);
+	else
+		dev = dmadev_attach_secondary(name);
+
+	if (dev == NULL)
+		return NULL;
+	dev->attached = 1;
+
+	return dev;
+}
+
+int
+rte_dmadev_pmd_release(struct rte_dmadev *dev)
+{
+	if (dev == NULL)
+		return -EINVAL;
+
+	if (dev->attached == 0)
+		return 0;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		rte_free(dev->data->dev_private);
+		memset(dev->data, 0, sizeof(struct rte_dmadev_data));
+	}
+
+	memset(dev, 0, sizeof(struct rte_dmadev));
+	dev->attached = 0;
+
+	return 0;
+}
+
+struct rte_dmadev *
+rte_dmadev_get_device_by_name(const char *name)
+{
+	if (dmadev_check_name(name) != 0)
+		return NULL;
+	return dmadev_allocated(name);
+}
+
+bool
+rte_dmadev_is_valid_dev(uint16_t dev_id)
+{
+	if (dev_id >= RTE_DMADEV_MAX_DEVS ||
+	    rte_dmadevices[dev_id].attached == 0)
+		return false;
+	return true;
+}
+
+uint16_t
+rte_dmadev_count(void)
+{
+	uint16_t count = 0;
+	uint16_t i;
+
+	for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) {
+		if (rte_dmadevices[i].attached == 1)
+			count++;
+	}
+
+	return count;
+}
+
+int
+rte_dmadev_info_get(uint16_t dev_id, struct rte_dmadev_info *dev_info)
+{
+	struct rte_dmadev *dev;
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(dev_info, -EINVAL);
+
+	dev = &rte_dmadevices[dev_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_info_get, -ENOTSUP);
+	memset(dev_info, 0, sizeof(struct rte_dmadev_info));
+	ret = (*dev->dev_ops->dev_info_get)(dev, dev_info);
+	if (ret != 0)
+		return ret;
+
+	dev_info->device = dev->device;
+
+	return 0;
+}
+
+int
+rte_dmadev_configure(uint16_t dev_id, const struct rte_dmadev_conf *dev_conf)
+{
+	struct rte_dmadev_info info;
+	struct rte_dmadev *dev;
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(dev_conf, -EINVAL);
+	dev = &rte_dmadevices[dev_id];
+
+	ret = rte_dmadev_info_get(dev_id, &info);
+	if (ret != 0) {
+		RTE_DMADEV_LOG(ERR, "Device %u failed to get device info\n", dev_id);
+		return -EINVAL;
+	}
+	if (dev_conf->max_vchans > info.max_vchans) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u configured with too many vchans\n", dev_id);
+		return -EINVAL;
+	}
+	if (dev_conf->enable_mt_vchan &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MT_VCHAN)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u doesn't support MT-safe vchan\n", dev_id);
+		return -EINVAL;
+	}
+	if (dev_conf->enable_mt_multi_vchan &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u doesn't support MT-safe multiple vchan\n",
+			dev_id);
+		return -EINVAL;
+	}
+
+	if (dev->data->dev_started != 0) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u must be stopped to allow configuration\n",
+			dev_id);
+		return -EBUSY;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_configure, -ENOTSUP);
+	ret = (*dev->dev_ops->dev_configure)(dev, dev_conf);
+	if (ret == 0)
+		memcpy(&dev->data->dev_conf, dev_conf, sizeof(*dev_conf));
+
+	return ret;
+}
+
+int
+rte_dmadev_start(uint16_t dev_id)
+{
+	struct rte_dmadev *dev;
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_dmadevices[dev_id];
+
+	if (dev->data->dev_started != 0) {
+		RTE_DMADEV_LOG(ERR, "Device %u already started\n", dev_id);
+		return 0;
+	}
+
+	if (dev->dev_ops->dev_start == NULL)
+		goto mark_started;
+
+	ret = (*dev->dev_ops->dev_start)(dev);
+	if (ret != 0)
+		return ret;
+
+mark_started:
+	dev->data->dev_started = 1;
+	return 0;
+}
+
+int
+rte_dmadev_stop(uint16_t dev_id)
+{
+	struct rte_dmadev *dev;
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_dmadevices[dev_id];
+
+	if (dev->data->dev_started == 0) {
+		RTE_DMADEV_LOG(ERR, "Device %u already stopped\n", dev_id);
+		return 0;
+	}
+
+	if (dev->dev_ops->dev_stop == NULL)
+		goto mark_stopped;
+
+	ret = (*dev->dev_ops->dev_stop)(dev);
+	if (ret != 0)
+		return ret;
+
+mark_stopped:
+	dev->data->dev_started = 0;
+	return 0;
+}
+
+int
+rte_dmadev_close(uint16_t dev_id)
+{
+	struct rte_dmadev *dev;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_dmadevices[dev_id];
+
+	/* Device must be stopped before it can be closed */
+	if (dev->data->dev_started == 1) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u must be stopped before closing\n", dev_id);
+		return -EBUSY;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_close, -ENOTSUP);
+	return (*dev->dev_ops->dev_close)(dev);
+}
+
+int
+rte_dmadev_reset(uint16_t dev_id)
+{
+	struct rte_dmadev *dev;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_dmadevices[dev_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_reset, -ENOTSUP);
+	/* Reset is not dependent on state of the device */
+	return (*dev->dev_ops->dev_reset)(dev);
+}
+
+int
+rte_dmadev_vchan_setup(uint16_t dev_id,
+		       const struct rte_dmadev_vchan_conf *conf)
+{
+	struct rte_dmadev_info info;
+	struct rte_dmadev *dev;
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(conf, -EINVAL);
+
+	dev = &rte_dmadevices[dev_id];
+
+	ret = rte_dmadev_info_get(dev_id, &info);
+	if (ret != 0) {
+		RTE_DMADEV_LOG(ERR, "Device %u failed to get device info\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction == 0 ||
+	    conf->direction & ~RTE_DMA_TRANSFER_DIR_ALL) {
+		RTE_DMADEV_LOG(ERR, "Device %u direction invalid!\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction & RTE_DMA_MEM_TO_MEM &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MEM_TO_MEM)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u doesn't support mem2mem transfer\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction & RTE_DMA_MEM_TO_DEV &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_MEM_TO_DEV)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u doesn't support mem2dev transfer\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction & RTE_DMA_DEV_TO_MEM &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_DEV_TO_MEM)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u doesn't support dev2mem transfer\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->direction & RTE_DMA_DEV_TO_DEV &&
+	    !(info.dev_capa & RTE_DMA_DEV_CAPA_DEV_TO_DEV)) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u doesn't support dev2dev transfer\n", dev_id);
+		return -EINVAL;
+	}
+	if (conf->nb_desc < info.min_desc || conf->nb_desc > info.max_desc) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u number of descriptors invalid\n", dev_id);
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vchan_setup, -ENOTSUP);
+	return (*dev->dev_ops->vchan_setup)(dev, conf);
+}
+
+int
+rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan)
+{
+	struct rte_dmadev *dev;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_dmadevices[dev_id];
+
+	if (vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u vchan %u out of range\n", dev_id, vchan);
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->vchan_release, -ENOTSUP);
+	return (*dev->dev_ops->vchan_release)(dev, vchan);
+}
+
+int
+rte_dmadev_stats_get(uint16_t dev_id, int vchan, struct rte_dmadev_stats *stats)
+{
+	struct rte_dmadev *dev;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(stats, -EINVAL);
+
+	dev = &rte_dmadevices[dev_id];
+
+	if (vchan < -1 || vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u vchan %d out of range\n", dev_id, vchan);
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_get, -ENOTSUP);
+	return (*dev->dev_ops->stats_get)(dev, vchan, stats);
+}
+
+int
+rte_dmadev_stats_reset(uint16_t dev_id, int vchan)
+{
+	struct rte_dmadev *dev;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_dmadevices[dev_id];
+
+	if (vchan < -1 || vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR,
+			"Device %u vchan %d out of range\n", dev_id, vchan);
+		return -EINVAL;
+	}
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_reset, -ENOTSUP);
+	return (*dev->dev_ops->stats_reset)(dev, vchan);
+}
+
+int
+rte_dmadev_dump(uint16_t dev_id, FILE *f)
+{
+	struct rte_dmadev_info info;
+	struct rte_dmadev *dev;
+	int ret;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(f, -EINVAL);
+
+	ret = rte_dmadev_info_get(dev_id, &info);
+	if (ret != 0) {
+		RTE_DMADEV_LOG(ERR, "Device %u failed to get device info\n", dev_id);
+		return -EINVAL;
+	}
+
+	dev = &rte_dmadevices[dev_id];
+
+	fprintf(f, "DMA Dev %u, '%s' [%s]\n",
+		dev->data->dev_id,
+		dev->data->dev_name,
+		dev->data->dev_started ? "started" : "stopped");
+	fprintf(f, "  dev_capa: 0x%" PRIx64 "\n", info.dev_capa);
+	fprintf(f, "  max_vchans_supported: %u\n", info.max_vchans);
+	fprintf(f, "  max_vchans_configured: %u\n", info.nb_vchans);
+	fprintf(f, "  MT-safe-configured: vchans: %u multi-vchans: %u\n",
+		dev->data->dev_conf.enable_mt_vchan,
+		dev->data->dev_conf.enable_mt_multi_vchan);
+
+	if (dev->dev_ops->dev_dump != NULL)
+		return (*dev->dev_ops->dev_dump)(dev, f);
+
+	return 0;
+}
+
+int
+rte_dmadev_selftest(uint16_t dev_id)
+{
+	struct rte_dmadev *dev;
+
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_dmadevices[dev_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_selftest, -ENOTSUP);
+	return (*dev->dev_ops->dev_selftest)(dev_id);
+}
diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
new file mode 100644
index 0000000..8779512
--- /dev/null
+++ b/lib/dmadev/rte_dmadev.h
@@ -0,0 +1,1030 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited.
+ * Copyright(c) 2021 Intel Corporation.
+ * Copyright(c) 2021 Marvell International Ltd.
+ */
+
+#ifndef _RTE_DMADEV_H_
+#define _RTE_DMADEV_H_
+
+/**
+ * @file rte_dmadev.h
+ *
+ * RTE DMA (Direct Memory Access) device APIs.
+ *
+ * The DMA framework is built on the following model:
+ *
+ *     ---------------   ---------------       ---------------
+ *     | virtual DMA |   | virtual DMA |       | virtual DMA |
+ *     | channel     |   | channel     |       | channel     |
+ *     ---------------   ---------------       ---------------
+ *            |                |                      |
+ *            ------------------                      |
+ *                     |                              |
+ *               ------------                    ------------
+ *               |  dmadev  |                    |  dmadev  |
+ *               ------------                    ------------
+ *                     |                              |
+ *            ------------------               ------------------
+ *            | HW-DMA-channel |               | HW-DMA-channel |
+ *            ------------------               ------------------
+ *                     |                              |
+ *                     --------------------------------
+ *                                     |
+ *                           ---------------------
+ *                           | HW-DMA-Controller |
+ *                           ---------------------
+ *
+ * The DMA controller could have multiple HW-DMA-channels (aka. HW-DMA-queues),
+ * and each HW-DMA-channel should be represented by a dmadev.
+ *
+ * The dmadev could create multiple virtual DMA channels; each virtual DMA
+ * channel represents a different transfer context. The DMA operation request
+ * must be submitted to the virtual DMA channel.
+ * E.g. an application could create virtual DMA channel 0 for the mem-to-mem
+ *      transfer scenario, and virtual DMA channel 1 for the mem-to-dev
+ *      transfer scenario.
+ *
+ * A dmadev is dynamically allocated by rte_dmadev_pmd_allocate() during the
+ * PCI/SoC device probing phase performed at EAL initialization time, and is
+ * released by rte_dmadev_pmd_release() during the PCI/SoC device removal
+ * phase.
+ *
+ * We use 'uint16_t dev_id' as the device identifier of a dmadev, and
+ * 'uint16_t vchan' as the virtual DMA channel identifier in one dmadev.
+ *
+ * The functions exported by the dmadev API to setup a device designated by its
+ * device identifier must be invoked in the following order:
+ *     - rte_dmadev_configure()
+ *     - rte_dmadev_vchan_setup()
+ *     - rte_dmadev_start()
+ *
+ * Then, the application can invoke dataplane APIs to process jobs.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_dmadev_configure()), it must call rte_dmadev_stop() first to stop the
+ * device and then do the reconfiguration before calling rte_dmadev_start()
+ * again. The dataplane APIs should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a dmadev by invoking the
+ * rte_dmadev_close() function.
+ *
+ * The dataplane APIs include two parts:
+ *   a) The first part is the submission of operation requests:
+ *        - rte_dmadev_copy()
+ *        - rte_dmadev_copy_sg() - scatter-gather form of copy
+ *        - rte_dmadev_fill()
+ *        - rte_dmadev_fill_sg() - scatter-gather form of fill
+ *        - rte_dmadev_submit() - issue doorbell to hardware
+ *      These APIs can work with different virtual DMA channels which have
+ *      different contexts.
+ *      The first four APIs are used to submit an operation request to a
+ *      virtual DMA channel; if the submission is successful, a uint16_t
+ *      ring_idx is returned, otherwise a negative number is returned.
+ *   b) The second part is to obtain the result of requests:
+ *        - rte_dmadev_completed()
+ *            - return the number of operation requests completed successfully.
+ *        - rte_dmadev_completed_fails()
+ *            - return the number of operation requests which failed to
+ *              complete.
+ *
+ * About the ring_idx returned by rte_dmadev_copy/copy_sg/fill/fill_sg(),
+ * the rules are as follows:
+ *   a) ring_idx for each virtual DMA channel are independent.
+ *   b) For a virtual DMA channel, the ring_idx is monotonically incremented;
+ *      when it reaches UINT16_MAX, it wraps back to zero.
+ *   c) The initial ring_idx of a virtual DMA channel is zero; after the
+ *      device is stopped or reset, the ring_idx is reset to zero.
+ *   Example:
+ *      step-1: start one dmadev
+ *      step-2: enqueue a copy operation, the ring_idx returned is 0
+ *      step-3: enqueue a copy operation again, the ring_idx returned is 1
+ *      ...
+ *      step-101: stop the dmadev
+ *      step-102: start the dmadev
+ *      step-103: enqueue a copy operation, the ring_idx returned is 0
+ *      ...
+ *      step-x+0: enqueue a fill operation, the ring_idx returned is 65535
+ *      step-x+1: enqueue a copy operation, the ring_idx returned is 0
+ *      ...
+ *
+ * By default, all the non-dataplane functions of the dmadev API exported by
+ * a PMD are lock-free functions which are assumed not to be invoked in
+ * parallel on different logical cores to work on the same target object.
+ *
+ * The dataplane functions of the dmadev API exported by a PMD can be MT-safe
+ * only when supported by the driver; generally, the driver reports two
+ * capabilities:
+ *   a) Whether the submit/completion APIs of the same virtual DMA channel
+ *      are MT-safe.
+ *      E.g. one thread does submit operations while another thread does
+ *           completion operations.
+ *      If the driver supports it, it declares RTE_DMA_DEV_CAPA_MT_VCHAN.
+ *      If the driver doesn't support it, it's up to the application to
+ *      guarantee MT-safety.
+ *   b) Whether operations on different virtual DMA channels are MT-safe.
+ *      E.g. one thread operates on virtual DMA channel 0 while another
+ *           thread operates on virtual DMA channel 1.
+ *      If the driver supports it, it declares
+ *      RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN.
+ *      If the driver doesn't support it, it's up to the application to
+ *      guarantee MT-safety.
+ *
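+ * A minimal end-to-end usage sketch (illustrative only, not part of this
+ * patch; error handling is omitted and dev_id 0 is assumed to exist):
+ * @code
+ *    struct rte_dmadev_conf dev_conf = { .max_vchans = 1 };
+ *    struct rte_dmadev_vchan_conf vchan_conf = {
+ *        .direction = RTE_DMA_MEM_TO_MEM,
+ *        .nb_desc = 1024,
+ *    };
+ *    uint16_t dev_id = 0;
+ *    int vchan;
+ *
+ *    rte_dmadev_configure(dev_id, &dev_conf);
+ *    vchan = rte_dmadev_vchan_setup(dev_id, &vchan_conf);
+ *    rte_dmadev_start(dev_id);
+ *    // ... dataplane: copy/fill, submit, completed ...
+ *    rte_dmadev_stop(dev_id);
+ *    rte_dmadev_close(dev_id);
+ * @endcode
+ *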
+ */
+
+#include <rte_common.h>
+#include <rte_compat.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_DMADEV_NAME_MAX_LEN	RTE_DEV_NAME_MAX_LEN
+
+extern int rte_dmadev_logtype;
+
+#define RTE_DMADEV_LOG(level, ...) \
+	rte_log(RTE_LOG_ ## level, rte_dmadev_logtype, "" __VA_ARGS__)
+
+/* Macros to check for valid port */
+#define RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, retval) do { \
+	if (!rte_dmadev_is_valid_dev(dev_id)) { \
+		RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \
+		return retval; \
+	} \
+} while (0)
+
+#define RTE_DMADEV_VALID_DEV_ID_OR_RET(dev_id) do { \
+	if (!rte_dmadev_is_valid_dev(dev_id)) { \
+		RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \
+		return; \
+	} \
+} while (0)
+
+/**
+ * @internal
+ * Validate if the DMA device index is a valid attached DMA device.
+ *
+ * @param dev_id
+ *   DMA device index.
+ *
+ * @return
+ *   - If the device index is valid (true) or not (false).
+ */
+__rte_internal
+bool
+rte_dmadev_is_valid_dev(uint16_t dev_id);
+
+/**
+ * rte_dma_sg - holds one element of a scatter-gather DMA operation request
+ */
+struct rte_dma_sg {
+	rte_iova_t src;
+	rte_iova_t dst;
+	uint32_t length;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the total number of DMA devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable DMA devices.
+ */
+__rte_experimental
+uint16_t
+rte_dmadev_count(void);
+
+/**
+ * The capabilities of a DMA device
+ */
+#define RTE_DMA_DEV_CAPA_MEM_TO_MEM	(1ull << 0)
+/**< DMA device supports mem-to-mem transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_MEM_TO_DEV	(1ull << 1)
+/**< DMA device support slave mode & mem-to-dev transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_DEV_TO_MEM	(1ull << 2)
+/**< DMA device support slave mode & dev-to-mem transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_DEV_TO_DEV	(1ull << 3)
+/**< DMA device support slave mode & dev-to-dev transfer.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_OPS_COPY	(1ull << 4)
+/**< DMA device supports copy ops.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_OPS_FILL	(1ull << 5)
+/**< DMA device supports fill ops.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_OPS_SG		(1ull << 6)
+/**< DMA device supports scatter-gather list ops.
+ * If a device supports OPS_COPY and OPS_SG, it supports copy_sg ops.
+ * If a device supports OPS_FILL and OPS_SG, it supports fill_sg ops.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_FENCE		(1ull << 7)
+/**< DMA device supports fence.
+ * If a device supports fence, the application can set the fence flag when
+ * enqueuing an operation via rte_dmadev_copy/copy_sg/fill/fill_sg.
+ * If an operation has the fence flag set, it must be processed only after
+ * all previous operations are completed.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_SVA		(1ull << 8)
+/**< DMA device supports SVA, i.e. it can use a VA as the DMA address.
+ * If a device supports SVA, the application can pass any VA, e.g. memory
+ * from rte_malloc(), rte_memzone(), malloc() or the stack.
+ * If a device doesn't support SVA, the application must pass an IOVA, e.g.
+ * obtained from rte_malloc() or rte_memzone().
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_MT_VCHAN	(1ull << 9)
+/**< DMA device supports MT-safe access to a virtual DMA channel.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+#define RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN	(1ull << 10)
+/**< DMA device supports MT-safe access to different virtual DMA channels.
+ *
+ * @see struct rte_dmadev_info::dev_capa
+ */
+
+/**
+ * A structure used to retrieve the contextual information of
+ * a DMA device.
+ */
+struct rte_dmadev_info {
+	struct rte_device *device; /**< Generic Device information */
+	uint64_t dev_capa; /**< Device capabilities (RTE_DMA_DEV_CAPA_) */
+	/** Maximum number of virtual DMA channels supported */
+	uint16_t max_vchans;
+	/** Maximum allowed number of virtual DMA channel descriptors */
+	uint16_t max_desc;
+	/** Minimum allowed number of virtual DMA channel descriptors */
+	uint16_t min_desc;
+	uint16_t nb_vchans; /**< Number of virtual DMA channels configured */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the contextual information of a DMA device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_dmadev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - =0: Success, driver updates the contextual information of the DMA device
+ *   - <0: Error code returned by the driver info get function.
+ *
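+ * A minimal capability-check sketch (illustrative only; dev_id is assumed
+ * to be valid):
+ * @code
+ *    struct rte_dmadev_info info;
+ *
+ *    if (rte_dmadev_info_get(dev_id, &info) == 0 &&
+ *        (info.dev_capa & RTE_DMA_DEV_CAPA_OPS_COPY) != 0)
+ *        printf("copy ops supported, up to %u vchans\n", info.max_vchans);
+ * @endcode
+ *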
+ */
+__rte_experimental
+int
+rte_dmadev_info_get(uint16_t dev_id, struct rte_dmadev_info *dev_info);
+
+/**
+ * A structure used to configure a DMA device.
+ */
+struct rte_dmadev_conf {
+	/** Maximum number of virtual DMA channels to use.
+	 * This value cannot be greater than the field 'max_vchans' of struct
+	 * rte_dmadev_info obtained from rte_dmadev_info_get().
+	 */
+	uint16_t max_vchans;
+	/** Enable bit for MT-safe access to a virtual DMA channel.
+	 * This bit can be enabled only when the device supports
+	 * RTE_DMA_DEV_CAPA_MT_VCHAN.
+	 * @see RTE_DMA_DEV_CAPA_MT_VCHAN
+	 */
+	uint8_t enable_mt_vchan : 1;
+	/** Enable bit for MT-safe access to different virtual DMA channels.
+	 * This bit can be enabled only when the device supports
+	 * RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN.
+	 * @see RTE_DMA_DEV_CAPA_MT_MULTI_VCHAN
+	 */
+	uint8_t enable_mt_multi_vchan : 1;
+	uint64_t reserved[2]; /**< Reserved for future fields */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Configure a DMA device.
+ *
+ * This function must be invoked first before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param dev_conf
+ *   The DMA device configuration structure encapsulated into rte_dmadev_conf
+ *   object.
+ *
+ * @return
+ *   - =0: Success, device configured.
+ *   - <0: Error code returned by the driver configuration function.
+ */
+__rte_experimental
+int
+rte_dmadev_configure(uint16_t dev_id, const struct rte_dmadev_conf *dev_conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Start a DMA device.
+ *
+ * The device start step is the last one and consists of setting the DMA
+ * to start accepting jobs.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   - =0: Success, device started.
+ *   - <0: Error code returned by the driver start function.
+ */
+__rte_experimental
+int
+rte_dmadev_start(uint16_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Stop a DMA device.
+ *
+ * The device can be restarted with a call to rte_dmadev_start()
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   - =0: Success, device stopped.
+ *   - <0: Error code returned by the driver stop function.
+ */
+__rte_experimental
+int
+rte_dmadev_stop(uint16_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Close a DMA device.
+ *
+ * The device cannot be restarted after this call.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *  - =0: Successfully close device
+ *  - <0: Failure to close device
+ */
+__rte_experimental
+int
+rte_dmadev_close(uint16_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset a DMA device.
+ *
+ * This differs from the rte_dmadev_start->rte_dmadev_stop cycle; it is
+ * similar to a hard or soft reset.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   - =0: Successfully reset device.
+ *   - <0: Failure to reset device.
+ *   - (-ENOTSUP): If the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_dmadev_reset(uint16_t dev_id);
+
+/**
+ * DMA transfer direction defines.
+ */
+#define RTE_DMA_MEM_TO_MEM	(1ull << 0)
+/**< DMA transfer direction - from memory to memory.
+ *
+ * @see struct rte_dmadev_vchan_conf::direction
+ */
+#define RTE_DMA_MEM_TO_DEV	(1ull << 1)
+/**< DMA transfer direction - slave mode & from memory to device.
+ * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
+ * this case, the ARM SoCs works in slave mode, it could initiate a DMA move
+ * request from ARM memory to x86 host memory.
+ *
+ * @see struct rte_dmadev_vchan_conf::direction
+ */
+#define RTE_DMA_DEV_TO_MEM	(1ull << 2)
+/**< DMA transfer direction - slave mode & from device to memory.
+ * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
+ * this case, the ARM SoCs works in slave mode, it could initiate a DMA move
+ * request from x86 host memory to ARM memory.
+ *
+ * @see struct rte_dmadev_vchan_conf::direction
+ */
+#define RTE_DMA_DEV_TO_DEV	(1ull << 3)
+/**< DMA transfer direction - slave mode & from device to device.
+ * In a typical scenario, ARM SoCs are installed on x86 servers as iNICs. In
+ * this case, the ARM SoCs works in slave mode, it could initiate a DMA move
+ * request from x86 host memory to another x86 host memory.
+ *
+ * @see struct rte_dmadev_vchan_conf::direction
+ */
+#define RTE_DMA_TRANSFER_DIR_ALL	(RTE_DMA_MEM_TO_MEM | \
+					 RTE_DMA_MEM_TO_DEV | \
+					 RTE_DMA_DEV_TO_MEM | \
+					 RTE_DMA_DEV_TO_DEV)
+
+/**
+ * enum rte_dma_slave_port_type - slave mode type defines
+ */
+enum rte_dma_slave_port_type {
+	/** The slave port is PCIE. */
+	RTE_DMA_SLAVE_PORT_PCIE = 1,
+};
+
+/**
+ * A structure used to describe slave port parameters.
+ */
+struct rte_dma_slave_port_parameters {
+	enum rte_dma_slave_port_type port_type;
+	union {
+		/** For PCIE port */
+		struct {
+			/** The physical function number which to use */
+			uint64_t pf_number : 6;
+			/** Virtual function enable bit */
+			uint64_t vf_enable : 1;
+			/** The virtual function number which to use */
+			uint64_t vf_number : 8;
+			uint64_t pasid : 20;
+			/** The attributes field in the TLP packet */
+			uint64_t tlp_attr : 3;
+		};
+	};
+};
+
+/**
+ * A structure used to configure a virtual DMA channel.
+ */
+struct rte_dmadev_vchan_conf {
+	uint8_t direction; /**< Set of supported transfer directions */
+	/** Number of descriptors for the virtual DMA channel */
+	uint16_t nb_desc;
+	/** 1) Used to describe the dev parameter in the mem-to-dev/dev-to-mem
+	 * transfer scenarios.
+	 * 2) Used to describe the src dev parameter in the dev-to-dev
+	 * transfer scenario.
+	 */
+	struct rte_dma_slave_port_parameters port;
+	/** Used to describe the dst dev parameter in the dev-to-dev
+	 * transfer scenario.
+	 */
+	struct rte_dma_slave_port_parameters peer_port;
+	uint64_t reserved[2]; /**< Reserved for future fields */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate and set up a virtual DMA channel.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param conf
+ *   The virtual DMA channel configuration structure encapsulated into
+ *   rte_dmadev_vchan_conf object.
+ *
+ * @return
+ *   - >=0: Allocate success, it is the virtual DMA channel id. This value must
+ *          be less than the field 'max_vchans' of struct rte_dmadev_conf
+	    which configured by rte_dmadev_configure().
+ *   - <0: Error code returned by the driver virtual channel setup function.
+ */
+__rte_experimental
+int
+rte_dmadev_vchan_setup(uint16_t dev_id,
+		       const struct rte_dmadev_vchan_conf *conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Release a virtual DMA channel.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel returned by vchan setup.
+ *
+ * @return
+ *   - =0: Successfully release the virtual DMA channel.
+ *   - <0: Error code returned by the driver virtual channel release function.
+ */
+__rte_experimental
+int
+rte_dmadev_vchan_release(uint16_t dev_id, uint16_t vchan);
+
+/**
+ * rte_dmadev_stats - running statistics.
+ */
+struct rte_dmadev_stats {
+	/** Count of operations which were successfully enqueued */
+	uint64_t enqueued_count;
+	/** Count of operations which were submitted to hardware */
+	uint64_t submitted_count;
+	/** Count of operations which failed to complete */
+	uint64_t completed_fail_count;
+	/** Count of operations which completed successfully */
+	uint64_t completed_count;
+	uint64_t reserved[4]; /**< Reserved for future fields */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve basic statistics of one or all virtual DMA channels.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel; -1 means all channels.
+ * @param[out] stats
+ *   The basic statistics structure encapsulated into rte_dmadev_stats
+ *   object.
+ *
+ * @return
+ *   - =0: Successfully retrieve stats.
+ *   - <0: Failure to retrieve stats.
+ */
+__rte_experimental
+int
+rte_dmadev_stats_get(uint16_t dev_id, int vchan,
+		     struct rte_dmadev_stats *stats);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset basic statistics of one or all virtual DMA channels.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel; -1 means all channels.
+ *
+ * @return
+ *   - =0: Successfully reset stats.
+ *   - <0: Failure to reset stats.
+ */
+__rte_experimental
+int
+rte_dmadev_stats_reset(uint16_t dev_id, int vchan);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dump DMA device info.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param f
+ *   The file to write the output to.
+ *
+ * @return
+ *   0 on success. Non-zero otherwise.
+ */
+__rte_experimental
+int
+rte_dmadev_dump(uint16_t dev_id, FILE *f);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger the dmadev self test.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @return
+ *   - 0: Selftest successful.
+ *   - -ENOTSUP if the device doesn't support selftest
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_dmadev_selftest(uint16_t dev_id);
+
+#include "rte_dmadev_core.h"
+
+/**
+ *  DMA flags to augment operation preparation.
+ *  Used as the 'flags' parameter of rte_dmadev_copy/copy_sg/fill/fill_sg.
+ */
+#define RTE_DMA_FLAG_FENCE	(1ull << 0)
+/**< DMA fence flag
+ * It means the operation with this flag must be processed only after all
+ * previous operations are completed.
+ *
+ * @see rte_dmadev_copy()
+ * @see rte_dmadev_copy_sg()
+ * @see rte_dmadev_fill()
+ * @see rte_dmadev_fill_sg()
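+ *
+ * A minimal ordering sketch (illustrative only; the fenced second copy is
+ * processed only after the first copy has completed):
+ * @code
+ *    rte_dmadev_copy(dev_id, vchan, src1, dst1, len1, 0);
+ *    rte_dmadev_copy(dev_id, vchan, src2, dst2, len2, RTE_DMA_FLAG_FENCE);
+ *    rte_dmadev_submit(dev_id, vchan);
+ * @endcode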
+ */
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a copy operation onto the virtual DMA channel.
+ *
+ * This queues up a copy operation to be performed by hardware, but does not
+ * trigger hardware to begin that operation.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param src
+ *   The address of the source buffer.
+ * @param dst
+ *   The address of the destination buffer.
+ * @param length
+ *   The length of the data to be copied.
+ * @param flags
+ *   Flags for this operation.
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued copy job.
+ *   - <0: Error code returned by the driver copy function.
+ */
+__rte_experimental
+static inline int
+rte_dmadev_copy(uint16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
+		uint32_t length, uint64_t flags)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->copy, -ENOTSUP);
+	if (vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
+		return -EINVAL;
+	}
+#endif
+	return (*dev->copy)(dev, vchan, src, dst, length, flags);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a scatter list copy operation onto the virtual DMA channel.
+ *
+ * This queues up a scatter list copy operation to be performed by hardware,
+ * but does not trigger hardware to begin that operation.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param sg
+ *   Pointer to the scatterlist.
+ * @param sg_len
+ *   The number of scatterlist elements.
+ * @param flags
+ *   Flags for this operation.
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued copy job.
+ *   - <0: Error code returned by the driver copy function.
+ */
+__rte_experimental
+static inline int
+rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vchan, const struct rte_dma_sg *sg,
+		   uint32_t sg_len, uint64_t flags)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(sg, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->copy_sg, -ENOTSUP);
+	if (vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
+		return -EINVAL;
+	}
+#endif
+	return (*dev->copy_sg)(dev, vchan, sg, sg_len, flags);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a fill operation onto the virtual DMA channel.
+ *
+ * This queues up a fill operation to be performed by hardware, but does not
+ * trigger hardware to begin that operation.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param pattern
+ *   The pattern to populate the destination buffer with.
+ * @param dst
+ *   The address of the destination buffer.
+ * @param length
+ *   The length of the destination buffer.
+ * @param flags
+ *   Flags for this operation.
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued fill job.
+ *   - <0: Error code returned by the driver fill function.
+ */
+__rte_experimental
+static inline int
+rte_dmadev_fill(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
+		rte_iova_t dst, uint32_t length, uint64_t flags)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->fill, -ENOTSUP);
+	if (vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
+		return -EINVAL;
+	}
+#endif
+	return (*dev->fill)(dev, vchan, pattern, dst, length, flags);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a scatter list fill operation onto the virtual DMA channel.
+ *
+ * This queues up a scatter list fill operation to be performed by hardware,
+ * but does not trigger hardware to begin that operation.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param pattern
+ *   The pattern to populate the destination buffer with.
+ * @param sg
+ *   Pointer to the scatterlist.
+ * @param sg_len
+ *   The number of scatterlist elements.
+ * @param flags
+ *   Flags for this operation.
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued fill job.
+ *   - <0: Error code returned by the driver fill function.
+ */
+__rte_experimental
+static inline int
+rte_dmadev_fill_sg(uint16_t dev_id, uint16_t vchan, uint64_t pattern,
+		   const struct rte_dma_sg *sg, uint32_t sg_len,
+		   uint64_t flags)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(sg, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->fill_sg, -ENOTSUP);
+	if (vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
+		return -EINVAL;
+	}
+#endif
+	return (*dev->fill_sg)(dev, vchan, pattern, sg, sg_len, flags);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger hardware to begin performing enqueued operations.
+ *
+ * This API is used to write the "doorbell" to the hardware to trigger it
+ * to begin the operations previously enqueued by
+ * rte_dmadev_copy/copy_sg/fill/fill_sg().
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ *
+ * @return
+ *   - =0: Successfully trigger hardware.
+ *   - <0: Failure to trigger hardware.
+ */
+__rte_experimental
+static inline int
+rte_dmadev_submit(uint16_t dev_id, uint16_t vchan)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->submit, -ENOTSUP);
+	if (vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
+		return -EINVAL;
+	}
+#endif
+	return (*dev->submit)(dev, vchan);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Returns the number of operations that have been successfully completed.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param nb_cpls
+ *   The maximum number of completed operations that can be processed.
+ * @param[out] last_idx
+ *   The last completed operation's index.
+ *   If not required, NULL can be passed in.
+ * @param[out] has_error
+ *   Indicates if there was a transfer error.
+ *   If not required, NULL can be passed in.
+ *
+ * @return
+ *   The number of operations that successfully completed.
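+ *
+ * A minimal polling sketch (illustrative only; processes up to 32
+ * completions per call):
+ * @code
+ *    uint16_t last_idx;
+ *    bool has_error;
+ *    uint16_t nb = rte_dmadev_completed(dev_id, vchan, 32, &last_idx,
+ *                                       &has_error);
+ *
+ *    // nb jobs up to and including last_idx finished successfully
+ *    if (has_error)
+ *        ; // drain errors via rte_dmadev_completed_fails()
+ * @endcode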
+ */
+__rte_experimental
+static inline uint16_t
+rte_dmadev_completed(uint16_t dev_id, uint16_t vchan, const uint16_t nb_cpls,
+		     uint16_t *last_idx, bool *has_error)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+	uint16_t idx;
+	bool err;
+
+#ifdef RTE_DMADEV_DEBUG
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->completed, -ENOTSUP);
+	if (vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
+		return -EINVAL;
+	}
+	if (nb_cpls == 0) {
+		RTE_DMADEV_LOG(ERR, "Invalid nb_cpls\n");
+		return -EINVAL;
+	}
+#endif
+
+	/* Ensure the pointer values are non-null to simplify drivers.
+	 * In most cases these should be compile time evaluated, since this is
+	 * an inline function.
+	 * - If NULL is explicitly passed as parameter, then compiler knows the
+	 *   value is NULL
+	 * - If address of local variable is passed as parameter, then compiler
+	 *   can know it's non-NULL.
+	 */
+	if (last_idx == NULL)
+		last_idx = &idx;
+	if (has_error == NULL)
+		has_error = &err;
+
+	*has_error = false;
+	return (*dev->completed)(dev, vchan, nb_cpls, last_idx, has_error);
+}
+
+/**
+ * DMA transfer status code defines
+ */
+enum rte_dma_status_code {
+	/** The operation completed successfully */
+	RTE_DMA_STATUS_SUCCESSFUL = 0,
+	/** The operation failed to complete due to an active drop.
+	 * This is mainly used when processing dev_stop, allowing outstanding
+	 * requests to be completed as much as possible.
+	 */
+	RTE_DMA_STATUS_ACTIVE_DROP,
+	/** The operation failed to complete due to an invalid source
+	 * address
+	 */
+	RTE_DMA_STATUS_INVALID_SRC_ADDR,
+	/** The operation failed to complete due to an invalid destination
+	 * address
+	 */
+	RTE_DMA_STATUS_INVALID_DST_ADDR,
+	/** The operation failed to complete due to an invalid length */
+	RTE_DMA_STATUS_INVALID_LENGTH,
+	/** The operation failed to complete due to an invalid opcode.
+	 * The DMA descriptor can have multiple formats, which are
+	 * distinguished by the opcode field.
+	 */
+	RTE_DMA_STATUS_INVALID_OPCODE,
+	/** The operation failed to complete due to a bus error */
+	RTE_DMA_STATUS_BUS_ERROR,
+	/** The operation failed to complete due to data poisoning */
+	RTE_DMA_STATUS_DATA_POISON,
+	/** The operation failed to complete due to a descriptor read error */
+	RTE_DMA_STATUS_DESCRIPTOR_READ_ERROR,
+	/** The operation failed to complete due to a device link error.
+	 * Used to indicate a link error in the mem-to-dev/dev-to-mem/
+	 * dev-to-dev transfer scenarios.
+	 */
+	RTE_DMA_STATUS_DEV_LINK_ERROR,
+	/** Driver-specific status code offset.
+	 * Start of the status codes which a driver can define on its own.
+	 */
+	RTE_DMA_STATUS_DRV_SPECIFIC_OFFSET = 0x10000,
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Returns the number of operations that failed to complete.
+ * NOTE: This API should be called when rte_dmadev_completed() sets has_error.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of virtual DMA channel.
+ * @param nb_status
+ *   Indicates the size of the status array.
+ * @param[out] status
+ *   The error code of operations that failed to complete.
+ *   Some standard error code are described in 'enum rte_dma_status_code'
+ *   @see rte_dma_status_code
+ * @param[out] last_idx
+ *   The last failed completed operation's index.
+ *
+ * @return
+ *   The number of operations that failed to complete.
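+ *
+ * A minimal error-draining sketch (illustrative only; called after
+ * rte_dmadev_completed() has reported has_error):
+ * @code
+ *    uint32_t status[8];
+ *    uint16_t last_idx, i;
+ *    uint16_t nb = rte_dmadev_completed_fails(dev_id, vchan, 8, status,
+ *                                             &last_idx);
+ *
+ *    for (i = 0; i < nb; i++)
+ *        printf("job failed with status code %u\n", status[i]);
+ * @endcode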
+ */
+__rte_experimental
+static inline uint16_t
+rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vchan,
+			   const uint16_t nb_status, uint32_t *status,
+			   uint16_t *last_idx)
+{
+	struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+#ifdef RTE_DMADEV_DEBUG
+	RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(status, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(last_idx, -EINVAL);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->completed_fails, -ENOTSUP);
+	if (vchan >= dev->data->dev_conf.max_vchans) {
+		RTE_DMADEV_LOG(ERR, "Invalid vchan %d\n", vchan);
+		return -EINVAL;
+	}
+	if (nb_status == 0) {
+		RTE_DMADEV_LOG(ERR, "Invalid nb_status\n");
+		return -EINVAL;
+	}
+#endif
+	return (*dev->completed_fails)(dev, vchan, nb_status, status, last_idx);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_DMADEV_H_ */
diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
new file mode 100644
index 0000000..410faf0
--- /dev/null
+++ b/lib/dmadev/rte_dmadev_core.h
@@ -0,0 +1,159 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited.
+ * Copyright(c) 2021 Intel Corporation.
+ */
+
+#ifndef _RTE_DMADEV_CORE_H_
+#define _RTE_DMADEV_CORE_H_
+
+/**
+ * @file
+ *
+ * RTE DMA Device internal header.
+ *
+ * This header contains internal data types that are used by the DMA devices
+ * in order to expose their ops to the class.
+ *
+ * Applications should not use these APIs directly.
+ *
+ */
+
+struct rte_dmadev;
+
+/** @internal Used to get device information of a device. */
+typedef int (*dmadev_info_get_t)(struct rte_dmadev *dev,
+				 struct rte_dmadev_info *dev_info);
+
+/** @internal Used to configure a device. */
+typedef int (*dmadev_configure_t)(struct rte_dmadev *dev,
+				  const struct rte_dmadev_conf *dev_conf);
+
+/** @internal Used to start a configured device. */
+typedef int (*dmadev_start_t)(struct rte_dmadev *dev);
+
+/** @internal Used to stop a configured device. */
+typedef int (*dmadev_stop_t)(struct rte_dmadev *dev);
+
+/** @internal Used to close a configured device. */
+typedef int (*dmadev_close_t)(struct rte_dmadev *dev);
+
+/** @internal Used to reset a configured device. */
+typedef int (*dmadev_reset_t)(struct rte_dmadev *dev);
+
+/** @internal Used to allocate and set up a virtual DMA channel. */
+typedef int (*dmadev_vchan_setup_t)(struct rte_dmadev *dev,
+				    const struct rte_dmadev_vchan_conf *conf);
+
+/** @internal Used to release a virtual DMA channel. */
+typedef int (*dmadev_vchan_release_t)(struct rte_dmadev *dev, uint16_t vchan);
+
+/** @internal Used to retrieve basic statistics. */
+typedef int (*dmadev_stats_get_t)(struct rte_dmadev *dev, int vchan,
+				  struct rte_dmadev_stats *stats);
+
+/** @internal Used to reset basic statistics. */
+typedef int (*dmadev_stats_reset_t)(struct rte_dmadev *dev, int vchan);
+
+/** @internal Used to dump internal information. */
+typedef int (*dmadev_dump_t)(struct rte_dmadev *dev, FILE *f);
+
+/** @internal Used to start dmadev selftest. */
+typedef int (*dmadev_selftest_t)(uint16_t dev_id);
+
+/** @internal Used to enqueue a copy operation. */
+typedef int (*dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vchan,
+			     rte_iova_t src, rte_iova_t dst,
+			     uint32_t length, uint64_t flags);
+
+/** @internal Used to enqueue a scatter list copy operation. */
+typedef int (*dmadev_copy_sg_t)(struct rte_dmadev *dev, uint16_t vchan,
+				const struct rte_dma_sg *sg,
+				uint32_t sg_len, uint64_t flags);
+
+/** @internal Used to enqueue a fill operation. */
+typedef int (*dmadev_fill_t)(struct rte_dmadev *dev, uint16_t vchan,
+			     uint64_t pattern, rte_iova_t dst,
+			     uint32_t length, uint64_t flags);
+
+/** @internal Used to enqueue a scatter list fill operation. */
+typedef int (*dmadev_fill_sg_t)(struct rte_dmadev *dev, uint16_t vchan,
+			uint64_t pattern, const struct rte_dma_sg *sg,
+			uint32_t sg_len, uint64_t flags);
+
+/** @internal Used to trigger hardware to begin working. */
+typedef int (*dmadev_submit_t)(struct rte_dmadev *dev, uint16_t vchan);
+
+/** @internal Used to return number of successful completed operations. */
+typedef uint16_t (*dmadev_completed_t)(struct rte_dmadev *dev, uint16_t vchan,
+				       const uint16_t nb_cpls,
+				       uint16_t *last_idx, bool *has_error);
+
+/** @internal Used to return number of failed completed operations. */
+typedef uint16_t (*dmadev_completed_fails_t)(struct rte_dmadev *dev,
+			uint16_t vchan, const uint16_t nb_status,
+			uint32_t *status, uint16_t *last_idx);
+
+/**
+ * DMA device operations function pointer table
+ */
+struct rte_dmadev_ops {
+	dmadev_info_get_t dev_info_get;
+	dmadev_configure_t dev_configure;
+	dmadev_start_t dev_start;
+	dmadev_stop_t dev_stop;
+	dmadev_close_t dev_close;
+	dmadev_reset_t dev_reset;
+	dmadev_vchan_setup_t vchan_setup;
+	dmadev_vchan_release_t vchan_release;
+	dmadev_stats_get_t stats_get;
+	dmadev_stats_reset_t stats_reset;
+	dmadev_dump_t dev_dump;
+	dmadev_selftest_t dev_selftest;
+};
+
+/**
+ * @internal
+ * The data part, with no function pointers, associated with each DMA device.
+ *
+ * This structure is safe to place in shared memory to be common among different
+ * processes in a multi-process configuration.
+ */
+struct rte_dmadev_data {
+	uint16_t dev_id; /**< Device [external] identifier. */
+	char dev_name[RTE_DMADEV_NAME_MAX_LEN]; /**< Unique identifier name */
+	void *dev_private; /**< PMD-specific private data. */
+	struct rte_dmadev_conf dev_conf; /**< DMA device configuration. */
+	uint8_t dev_started : 1; /**< Device state: STARTED(1)/STOPPED(0). */
+	uint64_t reserved[4]; /**< Reserved for future fields */
+} __rte_cache_aligned;
+
+/**
+ * @internal
+ * The generic data structure associated with each DMA device.
+ *
+ * The dataplane APIs are located at the beginning of the structure, along
+ * with the pointer to where all the data elements for the particular device
+ * are stored in shared memory. This split scheme allows the function pointer
+ * and driver data to be per-process, while the actual configuration data for
+ * the device is shared.
+ */
+struct rte_dmadev {
+	dmadev_copy_t copy;
+	dmadev_copy_sg_t copy_sg;
+	dmadev_fill_t fill;
+	dmadev_fill_sg_t fill_sg;
+	dmadev_submit_t submit;
+	dmadev_completed_t completed;
+	dmadev_completed_fails_t completed_fails;
+	const struct rte_dmadev_ops *dev_ops; /**< Functions exported by PMD. */
+	/** Flag indicating the device is attached: ATTACHED(1)/DETACHED(0). */
+	uint8_t attached : 1;
+	/** Device info which supplied during device initialization. */
+	struct rte_device *device;
+	struct rte_dmadev_data *data; /**< Pointer to device data. */
+	uint64_t reserved[4]; /**< Reserved for future fields */
+} __rte_cache_aligned;
+
+extern struct rte_dmadev rte_dmadevices[];
+
+#endif /* _RTE_DMADEV_CORE_H_ */
diff --git a/lib/dmadev/rte_dmadev_pmd.h b/lib/dmadev/rte_dmadev_pmd.h
new file mode 100644
index 0000000..45141f9
--- /dev/null
+++ b/lib/dmadev/rte_dmadev_pmd.h
@@ -0,0 +1,72 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited.
+ */
+
+#ifndef _RTE_DMADEV_PMD_H_
+#define _RTE_DMADEV_PMD_H_
+
+/**
+ * @file
+ *
+ * RTE DMA Device PMD APIs
+ *
+ * Driver facing APIs for a DMA device. These are not to be called directly by
+ * any application.
+ */
+
+#include "rte_dmadev.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @internal
+ * Allocates a new dmadev slot for a DMA device and returns the pointer
+ * to that slot for the driver to use.
+ *
+ * @param name
+ *   DMA device name.
+ *
+ * @return
+ *   A pointer to the DMA device slot in case of success,
+ *   NULL otherwise.
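+ *
+ * A minimal probe-time sketch (illustrative only; my_dmadev_ops and priv
+ * are driver-side assumptions):
+ * @code
+ *    struct rte_dmadev *dev = rte_dmadev_pmd_allocate("dma_example0");
+ *
+ *    if (dev == NULL)
+ *        return -ENOMEM;
+ *    dev->dev_ops = &my_dmadev_ops;
+ *    dev->data->dev_private = priv;
+ * @endcode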
+ */
+__rte_internal
+struct rte_dmadev *
+rte_dmadev_pmd_allocate(const char *name);
+
+/**
+ * @internal
+ * Release the specified dmadev.
+ *
+ * @param dev
+ *   Device to be released.
+ *
+ * @return
+ *   - 0 on success, negative on error
+ */
+__rte_internal
+int
+rte_dmadev_pmd_release(struct rte_dmadev *dev);
+
+/**
+ * @internal
+ * Return the DMA device based on the device name.
+ *
+ * @param name
+ *   DMA device name.
+ *
+ * @return
+ *   A pointer to the DMA device slot in case of success,
+ *   NULL otherwise.
+ */
+__rte_internal
+struct rte_dmadev *
+rte_dmadev_get_device_by_name(const char *name);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_DMADEV_PMD_H_ */
diff --git a/lib/dmadev/version.map b/lib/dmadev/version.map
new file mode 100644
index 0000000..0f099e7
--- /dev/null
+++ b/lib/dmadev/version.map
@@ -0,0 +1,40 @@ 
+EXPERIMENTAL {
+	global:
+
+	rte_dmadev_count;
+	rte_dmadev_info_get;
+	rte_dmadev_configure;
+	rte_dmadev_start;
+	rte_dmadev_stop;
+	rte_dmadev_close;
+	rte_dmadev_reset;
+	rte_dmadev_vchan_setup;
+	rte_dmadev_vchan_release;
+	rte_dmadev_stats_get;
+	rte_dmadev_stats_reset;
+	rte_dmadev_dump;
+	rte_dmadev_selftest;
+	rte_dmadev_copy;
+	rte_dmadev_copy_sg;
+	rte_dmadev_fill;
+	rte_dmadev_fill_sg;
+	rte_dmadev_submit;
+	rte_dmadev_completed;
+	rte_dmadev_completed_fails;
+
+	local: *;
+};
+
+INTERNAL {
+	global:
+
+	rte_dmadevices;
+	rte_dmadev_pmd_allocate;
+	rte_dmadev_pmd_release;
+	rte_dmadev_get_device_by_name;
+	rte_dmadev_is_valid_dev;
+};
+
diff --git a/lib/meson.build b/lib/meson.build
index 1673ca4..68d239f 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -60,6 +60,7 @@  libraries = [
         'bpf',
         'graph',
         'node',
+        'dmadev',
 ]
 
 if is_windows