[dpdk-dev] VFIO no-iommu

Yigit, Ferruh ferruh.yigit at intel.com
Fri Dec 18 11:43:10 CET 2015


On Thu, Dec 17, 2015 at 09:43:59AM -0700, Alex Williamson wrote:
<...>
> > > > > > 
> > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in
> > > > > > vfio
> > > > > > module,
> > > > > > container->noiommu is not set before doing a
> > > > > > vfio_group_set_container()
> > > > > > and vfio_for_each_iommu_driver selects wrong driver.
> > > > > 
> > > > > Running CHECK_EXTENSION on a container without the group
> > > > > attached is
> > > > > only going to tell you what extensions vfio is capable of, not
> > > > > necessarily what extensions are available to you with that
> > > > > group.
> > > > > Is this just a general dpdk- vfio ordering bug?
> > > > 
> > > > Yes, that is how VFIO was implemented in DPDK. I was under the
> > > > impression that checking extension before assigning devices was
> > > > the
> > > > correct way to do things, so as to not to try anything we know
> > > > would
> > > > fail anyway. Does this imply that CHECK_EXTENSION needs to be
> > > > called
> > > > on both container and groups (or just on groups)?
> > > 
> > > Hmm, in Documentation/vfio.txt we do give the following algorithm:
> > > 
> > >         if (ioctl(container, VFIO_GET_API_VERSION) !=
> > > VFIO_API_VERSION)
> > >                 /* Unknown API version */
> > > 
> > >         if (!ioctl(container, VFIO_CHECK_EXTENSION,
> > > VFIO_TYPE1_IOMMU))
> > >                 /* Doesn't support the IOMMU driver we want. */
> > >         ...
> > > 
> > > That's just going to query each iommu driver and we can't yet say
> > > whether
> > > the group the user attaches to the container later will actually
> > > support that
> > > extension until we try to do it, that would come at VFIO_SET_IOMMU.
> > >  So is
> > > it perhaps a vfio bug that we're not advertising no-iommu until the
> > > group is
> > > attached?  After all, we are capable of it with just an empty
> > > container, just
> > > like we are with type1, but we're going to fail SET_IOMMU for the
> > > wrong
> > > combination.
> > >  This is exactly the sort of thing that makes me glad we reverted
> > > it without
> > > feedback from a working user driver.  Thanks,
> > 
> > Whether it should be considered a "bug" in VFIO or "by design" is up
> > to you, of course, but at least according to the VFIO documentation,
> > we are meant to check for type 1 extension and then attach devices,
> > so it would be expected to get VFIO_NOIOMMU_IOMMU marked as supported
> > even without any devices attached to the container (just like we get
> > type 1 as supported without any devices attached). Having said that,
> > if it was meant to attach devices first and then check the
> > extensions, then perhaps the documentation should also point out that
> > fact (or perhaps I missed that detail in my readings of the docs, in
> > which case my apologies).
> 
> Hi Anatoly,
> 
> Does the below patch make it behave more like you'd expect.  This
> applies to v4.4-rc4, I'd fold this into the base patch if we
> reincorporate it to a future kernel.  Thanks,
> 
> Alex
> 
> commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94
> Author: Alex Williamson <alex.williamson at redhat.com>
> Date:   Wed Dec 16 19:02:01 2015 -0700
> 
>     vfio: Fix no-iommu CHECK_EXTENSION
>     
>     Previously the no-iommu iommu driver was only visible when the
>     container had an attached no-iommu group.  This means that
>     CHECK_EXTENSION on and empty container couldn't report the possibility
>     of using VFIO_NOIOMMU_IOMMU.  We report TYPE1 whether or not the user
>     can make use of it with the group, so this is inconsistent.  Add the
>     no-iommu iommu to the list of iommu drivers when enabled via module
>     option, but skip all the others if the container is attached to a
>     no-iommu groups.  Note that tainting is now done with the "unsafe"
>     module callback rather than explictly within vfio.
>     
>     Also fixes module option and module description name inconsistency.
>     
>     Also make vfio_noiommu_ops const.
>     
>     Signed-off-by: Alex Williamson <alex.williamson at redhat.com>

Hi Alex,

I got following crash with this update:

[  +0.046973] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  +0.000031] IP: [<ffffffff813b92cf>] __list_add+V0x1f/0xc0
[  +0.000024] PGD 0 
[  +0.000010] Oops: 0000 [#1] SMP 
...
[  +0.000028] Call Trace:
[  +0.000014]  [<ffffffff81777301>] __mutex_lock_slowpath+0x91/0x110
[  +0.000022]  [<ffffffff817773a3>] mutex_lock+0x23/0x40
[  +0.000020]  [<ffffffffa0370c00>] vfio_register_iommu_driver+0x40/0xc0 [vfio]
[  +0.000025]  [<ffffffffa0370cd1>] noiommu_param_set+0x51/0x60 [vfio]
[  +0.000022]  [<ffffffff810bbe3e>] parse_args+0x1be/0x4a0
[  +0.000021]  [<ffffffff81121ea0>] load_module+0xe30/0x2730
[  +0.000942]  [<ffffffff8111f500>] ? __symbol_put+0x60/0x60
[  +0.000921]  [<ffffffff81223f60>] ? kernel_read+0x50/0x80
[  +0.000917]  [<ffffffff811239f9>] SyS_finit_module+0xb9/0xf0
[  +0.000925]  [<ffffffff817796ae>] entry_SYSCALL_64_fastpath+0x12/0x71

> -static struct vfio_iommu_driver vfio_noiommu_driver = {
> -	.ops = &vfio_noiommu_ops,
> +static int noiommu_param_set(const char *val, const struct kernel_param *kp)
> +{
> +	int ret;
> +
> +	if (!val)
> +		val = "1";
> +
> +	ret = strtobool(val, kp->arg);
> +	if (ret)
> +		return ret;
> +
> +	if (noiommu)
> +		ret = vfio_register_iommu_driver(&vfio_noiommu_ops);

Is it possible that kernel_param_ops->set() called before module_init() that initializes the mutex.
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/module.c?id=refs/tags/v4.4-rc5#n3517

> +	else
> +		vfio_unregister_iommu_driver(&vfio_noiommu_ops);
> +
> +	return ret;
> +}
> +
> +static const struct kernel_param_ops noiommu_param_ops = {
> +	.flags = KERNEL_PARAM_OPS_FL_NOARG,
> +	.set = noiommu_param_set,
> +	.get = param_get_bool,
>  };
>  

Thanks,
ferruh


More information about the dev mailing list