[dpdk-dev] [PATCH v7 2/5] vfio: add multi container support
Wang, Xiao W
xiao.w.wang at intel.com
Mon Apr 16 14:44:13 CEST 2018
Hi Anatoly,
> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Monday, April 16, 2018 6:03 PM
> To: Wang, Xiao W <xiao.w.wang at intel.com>; Yigit, Ferruh
> <ferruh.yigit at intel.com>
> Cc: dev at dpdk.org; maxime.coquelin at redhat.com; Wang, Zhihong
> <zhihong.wang at intel.com>; Bie, Tiwei <tiwei.bie at intel.com>; Tan, Jianfeng
> <jianfeng.tan at intel.com>; Liang, Cunming <cunming.liang at intel.com>; Daly,
> Dan <dan.daly at intel.com>; thomas at monjalon.net; Chen, Junjie J
> <junjie.j.chen at intel.com>
> Subject: Re: [PATCH v7 2/5] vfio: add multi container support
>
> On 15-Apr-18 4:33 PM, Xiao Wang wrote:
> > This patch adds APIs to support container create/destroy and device
> > bind/unbind with a container. It also provides API for IOMMU programing
> > on a specified container.
> >
> > A driver could use "rte_vfio_create_container" helper to create a
>
> ^^ wrong API name in commit message :)
Thanks for the catch. Will fix it.
>
> > new container from eal, use "rte_vfio_bind_group" to bind a device
> > to the newly created container. During rte_vfio_setup_device the
> > container bound with the device will be used for IOMMU setup.
> >
> > Signed-off-by: Junjie Chen <junjie.j.chen at intel.com>
> > Signed-off-by: Xiao Wang <xiao.w.wang at intel.com>
> > Reviewed-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> > Reviewed-by: Ferruh Yigit <ferruh.yigit at intel.com>
> > ---
> > lib/librte_eal/bsdapp/eal/eal.c | 52 +++++
> > lib/librte_eal/common/include/rte_vfio.h | 119 ++++++++++++
> > lib/librte_eal/linuxapp/eal/eal_vfio.c | 316
> +++++++++++++++++++++++++++++++
> > lib/librte_eal/rte_eal_version.map | 6 +
> > 4 files changed, 493 insertions(+)
> >
> > diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> > index 727adc5d2..c5106d0d6 100644
> > --- a/lib/librte_eal/bsdapp/eal/eal.c
> > +++ b/lib/librte_eal/bsdapp/eal/eal.c
> > @@ -769,6 +769,14 @@ int rte_vfio_noiommu_is_enabled(void);
> > int rte_vfio_clear_group(int vfio_group_fd);
> > int rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
> > int rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
> > +int rte_vfio_container_create(void);
> > +int rte_vfio_container_destroy(int container_fd);
> > +int rte_vfio_bind_group(int container_fd, int iommu_group_no);
> > +int rte_vfio_unbind_group(int container_fd, int iommu_group_no);
>
> Maybe have these under "container" too? e.g.
> rte_vfio_container_group_bind/unbind? Seems like it would be more
> consistent that way - anything to do with custom containers would be
> under rte_vfio_container_* namespace.
Agree.
>
> > +int rte_vfio_container_dma_map(int container_fd, uint64_t vaddr,
> > + uint64_t iova, uint64_t len);
> > +int rte_vfio_container_dma_unmap(int container_fd, uint64_t vaddr,
> > + uint64_t iova, uint64_t len);
> >
> > int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
> > __rte_unused const char *dev_addr,
> > @@ -818,3 +826,47 @@ rte_vfio_dma_unmap(uint64_t __rte_unused vaddr,
> uint64_t __rte_unused iova,
> > {
> > return -1;
> > }
> > +
>
> <...>
>
> > diff --git a/lib/librte_eal/common/include/rte_vfio.h
> b/lib/librte_eal/common/include/rte_vfio.h
> > index d26ab01cb..0c1509b29 100644
> > --- a/lib/librte_eal/common/include/rte_vfio.h
> > +++ b/lib/librte_eal/common/include/rte_vfio.h
> > @@ -168,6 +168,125 @@ rte_vfio_dma_map(uint64_t vaddr, uint64_t iova,
> uint64_t len);
> > int __rte_experimental
> > rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> > + *
> > + * Create a new container for device binding.
>
> I would add a note that any newly allocated DPDK memory will not be
> mapped into these containers by default.
Will add it.
>
> > + *
> > + * @return
> > + * the container fd if successful
> > + * <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_container_create(void);
> > +
>
> <...>
>
> > + * 0 if successful
> > + * <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_unbind_group(int container_fd, int iommu_group_no);
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> > + *
> > + * Perform dma mapping for devices in a conainer.
>
> Here and in other places: "dma" should be DMA, and typo: "conainer" :)
>
> I think you should also add a note to the original API (not this one,
> but the old one) that DMA maps done via that API will only apply to
> default container and will not apply to any of the containers created
> via container_create(). IOW, documentation should make it clear that if
> you use this functionality, you're on your own and you have to manage
> your own DMA mappings for any containers you create.
OK, will add note to clearly describe it.
>
> > + *
> > + * @param container_fd
> > + * the specified container fd
> > + *
> > + * @param vaddr
> > + * Starting virtual address of memory to be mapped.
> > + *
>
> <...>
>
> > +
> > +int __rte_experimental
> > +rte_vfio_container_dma_map(int container_fd, uint64_t vaddr, uint64_t
> iova,
> > + uint64_t len)
> > +{
> > + struct user_mem_map *new_map;
> > + struct vfio_config *vfio_cfg;
> > + struct user_mem_maps *user_mem_maps;
> > + int ret = 0;
> > +
> > + if (len == 0) {
> > + rte_errno = EINVAL;
> > + return -1;
> > + }
> > +
> > + vfio_cfg = get_vfio_cfg_by_container_fd(container_fd);
> > + if (vfio_cfg == NULL) {
> > + RTE_LOG(ERR, EAL, "Invalid container fd\n");
> > + return -1;
> > + }
> > +
> > + user_mem_maps = &vfio_cfg->mem_maps;
> > + rte_spinlock_recursive_lock(&user_mem_maps->lock);
> > + if (user_mem_maps->n_maps == VFIO_MAX_USER_MEM_MAPS) {
> > + RTE_LOG(ERR, EAL, "No more space for user mem maps\n");
> > + rte_errno = ENOMEM;
> > + ret = -1;
> > + goto out;
> > + }
> > + /* map the entry */
> > + if (vfio_dma_mem_map(vfio_cfg, vaddr, iova, len, 1)) {
> > + /* technically, this will fail if there are currently no devices
> > + * plugged in, even if a device were added later, this mapping
> > + * might have succeeded. however, since we cannot verify if
> this
> > + * is a valid mapping without having a device attached,
> consider
> > + * this to be unsupported, because we can't just store any old
> > + * mapping and pollute list of active mappings willy-nilly.
> > + */
> > + RTE_LOG(ERR, EAL, "Couldn't map new region for DMA\n");
> > + ret = -1;
> > + goto out;
> > + }
> > + /* create new user mem map entry */
> > + new_map = &user_mem_maps->maps[user_mem_maps->n_maps++];
> > + new_map->addr = vaddr;
> > + new_map->iova = iova;
> > + new_map->len = len;
> > +
> > + compact_user_maps(user_mem_maps);
> > +out:
> > + rte_spinlock_recursive_unlock(&user_mem_maps->lock);
> > + return ret;
>
> Please correct me if i'm wrong, but it looks like you've just duplicated
> the code for rte_vfio_dma_map() here and made a few small changes. It
> would be better if you moved most of this into a static function (e.g.
> static int container_dma_map(vfio_cfg, vaddr, iova, len)) and called it
> with either default vfio_cfg from rte_vfio_dma_map, or found vfio_cfg
> from rte_vfio_container_dma_map. Same applies to function below.
Agree, will do it in v8.
BRs,
Xiao
>
> > +}
> > +
> > +int __rte_experimental
> > +rte_vfio_container_dma_unmap(int container_fd, uint64_t vaddr, uint64_t
> iova,
> > + uint64_t len)
> > +{
> > + struct user_mem_map *map, *new_map = NULL;
> > + struct vfio_config *vfio_cfg;
> > + struct user_mem_maps *user_mem_maps;
> > + int ret = 0;
> > +
> > + if (len == 0) {
> > + rte_errno = EINVAL;
> > + return -1;
> > + }
> > +
>
> <...>
>
> --
> Thanks,
> Anatoly
More information about the dev
mailing list