[dpdk-dev] VFIO no-iommu

Message ID 1450475417.2674.167.camel@redhat.com (mailing list archive)
State Superseded, archived

Commit Message

Alex Williamson Dec. 18, 2015, 9:50 p.m. UTC
  On Fri, 2015-12-18 at 07:38 -0700, Alex Williamson wrote:
> On Fri, 2015-12-18 at 10:43 +0000, Yigit, Ferruh wrote:
> > On Thu, Dec 17, 2015 at 09:43:59AM -0700, Alex Williamson wrote:
> > <...>
> > > > > > > > 
> > > > > > > > Also I need to disable VFIO_CHECK_EXTENSION ioctl, because in
> > > > > > > > vfio module, container->noiommu is not set before doing a
> > > > > > > > vfio_group_set_container() and vfio_for_each_iommu_driver
> > > > > > > > selects wrong driver.
> > > > > > > 
> > > > > > > Running CHECK_EXTENSION on a container without the group
> > > > > > > attached is only going to tell you what extensions vfio is
> > > > > > > capable of, not necessarily what extensions are available to
> > > > > > > you with that group. Is this just a general dpdk-vfio ordering
> > > > > > > bug?
> > > > > > 
> > > > > > Yes, that is how VFIO was implemented in DPDK. I was under the
> > > > > > impression that checking extension before assigning devices was
> > > > > > the correct way to do things, so as not to try anything we know
> > > > > > would fail anyway. Does this imply that CHECK_EXTENSION needs to
> > > > > > be called on both container and groups (or just on groups)?
> > > > > 
> > > > > Hmm, in Documentation/vfio.txt we do give the following
> > > > > algorithm:
> > > > > 
> > > > >         if (ioctl(container, VFIO_GET_API_VERSION) != VFIO_API_VERSION)
> > > > >                 /* Unknown API version */
> > > > > 
> > > > >         if (!ioctl(container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU))
> > > > >                 /* Doesn't support the IOMMU driver we want. */
> > > > >         ...
> > > > > 
> > > > > That's just going to query each iommu driver and we can't yet say
> > > > > whether the group the user attaches to the container later will
> > > > > actually support that extension until we try to do it; that would
> > > > > come at VFIO_SET_IOMMU.  So is it perhaps a vfio bug that we're not
> > > > > advertising no-iommu until the group is attached?  After all, we
> > > > > are capable of it with just an empty container, just like we are
> > > > > with type1, but we're going to fail SET_IOMMU for the wrong
> > > > > combination.  This is exactly the sort of thing that makes me glad
> > > > > we reverted it without feedback from a working user driver.
> > > > > Thanks,
> > > > 
> > > > Whether it should be considered a "bug" in VFIO or "by design" is up
> > > > to you, of course, but at least according to the VFIO documentation,
> > > > we are meant to check for type 1 extension and then attach devices,
> > > > so it would be expected to get VFIO_NOIOMMU_IOMMU marked as
> > > > supported even without any devices attached to the container (just
> > > > like we get type 1 as supported without any devices attached).
> > > > Having said that, if it was meant to attach devices first and then
> > > > check the extensions, then perhaps the documentation should also
> > > > point out that fact (or perhaps I missed that detail in my readings
> > > > of the docs, in which case my apologies).
> > > 
> > > Hi Anatoly,
> > > 
> > > Does the below patch make it behave more like you'd expect?  This
> > > applies to v4.4-rc4; I'd fold this into the base patch if we
> > > reincorporate it to a future kernel.  Thanks,
> > > 
> > > Alex
> > > 
> > > commit 88d4dcb6b77624965f0b45b5cd305a2b4a105c94
> > > Author: Alex Williamson <alex.williamson@redhat.com>
> > > Date:   Wed Dec 16 19:02:01 2015 -0700
> > > 
> > >     vfio: Fix no-iommu CHECK_EXTENSION
> > >     
> > >     Previously the no-iommu iommu driver was only visible when the
> > >     container had an attached no-iommu group.  This means that
> > >     CHECK_EXTENSION on an empty container couldn't report the
> > >     possibility of using VFIO_NOIOMMU_IOMMU.  We report TYPE1 whether
> > >     or not the user can make use of it with the group, so this is
> > >     inconsistent.  Add the no-iommu iommu to the list of iommu drivers
> > >     when enabled via module option, but skip all the others if the
> > >     container is attached to a no-iommu group.  Note that tainting is
> > >     now done with the "unsafe" module callback rather than explicitly
> > >     within vfio.
> > >     
> > >     Also fixes module option and module description name inconsistency.
> > >     
> > >     Also make vfio_noiommu_ops const.
> > >     
> > >     Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > 
> > Hi Alex,
> > 
> > I got the following crash with this update:

Let's try this one:

commit 8ff839c6ffe9f3b50b50f1cc87e7afbf23171f05
Author: Alex Williamson <alex.williamson@redhat.com>
Date:   Fri Dec 18 14:45:55 2015 -0700

    v2 vfio fix no-iommu CHECK_EXTENSION
    
    Register and unregister the no-iommu iommu backend at module
    initialization and exit, but disable it unless enabled via module
    option.  Rather than modify the iommu driver walk, selectively skip
    combinations that aren't supported.  CHECK_EXTENSION on a container
    without any groups attached exposes all possible extensions.  Once a
    group is attached, the no-iommu backend is skipped for regular groups
    and regular iommu backends are skipped for no-iommu groups.
    
    This would be folded into a single patch to re-propose vfio no-iommu
    mode upstream.
    
    Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
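
The skip rules described in this commit message can be modelled in a few lines of user-space C. The sketch below is only an illustration of the two predicates that appear in the patch; the struct and function names (model_container, check_extension_skips, set_iommu_skips, model_self_check) are hypothetical and do not exist in the kernel, and the additional gating on the module option in vfio_noiommu_ioctl() is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the container state the patch inspects. */
struct model_container {
	bool has_groups; /* models !list_empty(&container->group_list) */
	bool noiommu;    /* models container->noiommu */
};

/*
 * CHECK_EXTENSION: an empty container consults every backend; once a
 * group is attached, only the backend matching the group type is asked.
 */
static bool check_extension_skips(const struct model_container *c,
				  bool driver_is_noiommu)
{
	return c->has_groups && (c->noiommu != driver_is_noiommu);
}

/* SET_IOMMU: backend type and container type must always match. */
static bool set_iommu_skips(const struct model_container *c,
			    bool driver_is_noiommu)
{
	return c->noiommu != driver_is_noiommu;
}

/* Walk the cases described in the commit message. */
static void model_self_check(void)
{
	const struct model_container empty = { false, false };
	const struct model_container regular = { true, false };
	const struct model_container unsafe = { true, true };

	/* Empty container: nothing is skipped, all extensions visible. */
	assert(!check_extension_skips(&empty, true));
	assert(!check_extension_skips(&empty, false));

	/* Regular group attached: the no-iommu backend is skipped. */
	assert(check_extension_skips(&regular, true));
	assert(!check_extension_skips(&regular, false));

	/* no-iommu group attached: regular backends are skipped. */
	assert(check_extension_skips(&unsafe, false));
	assert(!check_extension_skips(&unsafe, true));

	/* SET_IOMMU never lets a regular container pick vfio-noiommu. */
	assert(set_iommu_skips(&regular, true));
	assert(!set_iommu_skips(&unsafe, true));
}
```

Calling model_self_check() exercises each combination once. The point of the design is that a mismatch can still surface at SET_IOMMU time even when CHECK_EXTENSION on an empty container reported the extension as possible.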
  

Comments

Alex Williamson Dec. 21, 2015, 7:22 p.m. UTC | #1
On Mon, 2015-12-21 at 11:46 +0000, Yigit, Ferruh wrote:
> On Fri, Dec 18, 2015 at 02:50:17PM -0700, Alex Williamson wrote:
> > <...>
> Hi Alex,
> 
> Thank you for the update. I have tested this in both no-iommu and
> iommu environments and it worked successfully. I believe this approach
> is better because it is simpler.
> 
> From the DPDK point of view, the only updates needed to support vfio
> no-iommu are to use the new group names and to disable DMA mapping.
> 
> If the VFIO module is compiled with "CONFIG_VFIO_NOIOMMU=y" by default,
> that makes things easier for DPDK: in a no-iommu environment, inserting
> the vfio module with the proper parameter makes it available for DPDK.

Thanks for the update Ferruh.  Also note that the module option is
dynamically settable so that it can support running with statically
compiled modules where you may not know whether or not to enable no-
iommu until after boot, or where unloading and re-loading a module
might not be an option.  It will be up to each distro to decide whether
to enable the config option, but I think we at least highlight the
existing support issue for non-iommu protected userspace drivers, which
is something that was not at all clear with the uio driver approach.
 Thanks,

Alex
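
Because the parameter is declared with S_IRUGO | S_IWUSR in the patch, it should be readable and root-writable through sysfs after the module is loaded. A minimal sketch of toggling it from C follows. The path macro is an assumption derived from the module name (vfio) and the parameter name in the patch, and the helper names set_bool_param() and get_bool_param() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed sysfs location, derived from module and parameter names. */
#define VFIO_NOIOMMU_PARAM \
	"/sys/module/vfio/parameters/enable_unsafe_noiommu_mode"

/* Write "Y" or "N" to a boolean parameter file; 0 on success, -1 on error. */
static int set_bool_param(const char *path, int enable)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(enable ? "Y\n" : "N\n", f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read a boolean parameter file; 1 for enabled, 0 for disabled, -1 on error. */
static int get_bool_param(const char *path)
{
	FILE *f;
	int c;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	c = fgetc(f);
	fclose(f);
	if (c == 'Y' || c == 'y' || c == '1')
		return 1;
	if (c == 'N' || c == 'n' || c == '0')
		return 0;
	return -1;
}
```

A privileged process would call set_bool_param(VFIO_NOIOMMU_PARAM, 1) before binding devices, and could use get_bool_param() to confirm the current setting. Taking the path as an argument keeps the helpers usable if the parameter lives elsewhere on a given system.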
  
Alex Williamson Dec. 22, 2015, 8:20 p.m. UTC | #2
On Mon, 2015-12-21 at 12:22 -0700, Alex Williamson wrote:
> On Mon, 2015-12-21 at 11:46 +0000, Yigit, Ferruh wrote:
> > <...>
> 
> Thanks for the update Ferruh.  Also note that the module option is
> dynamically settable so that it can support running with statically
> compiled modules where you may not know whether or not to enable no-
> iommu until after boot, or where unloading and re-loading a module
> might not be an option.  It will be up to each distro to decide
> whether
> to enable the config option, but I think we at least highlight the
> existing support issue for non-iommu protected userspace drivers,
> which
> is something that was not at all clear with the uio driver approach.


Hi,

I've re-posted the unified patch upstream and it should start showing
up in the next linux-next build.  I expect the dpdk code won't be
merged until after this gets back into a proper kernel, but could we
get the dpdk modifications posted as rfc for others looking to try it?
 Thanks,

Alex
  
Anatoly Burakov Dec. 23, 2015, 11:19 a.m. UTC | #3
Hi Alex,

> I've re-posted the unified patch upstream and it should start showing up in
> the next linux-next build.  I expect the dpdk code won't be merged until
> after this gets back into a proper kernel, but could we get the dpdk
> modifications posted as rfc for others looking to try it?


I have already posted a patch that should work with No-IOMMU.

http://dpdk.org/dev/patchwork/patch/9619/

Apologies for not CC-ing you. I too would be interested to know if other people are having any issues with the patch.

Thanks,
Anatoly
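
For readers who want a feel for the user-space side discussed in this thread, here is a rough sketch of probing a container for no-iommu support and building a group path. It is not taken from the DPDK patch linked above. The "noiommu-" group name prefix is an assumption about the kernel patch and is not spelled out in this thread, the fallback value 8 for VFIO_NOIOMMU_IOMMU is an assumption for headers that predate the define, and the function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

#ifndef VFIO_NOIOMMU_IOMMU
#define VFIO_NOIOMMU_IOMMU 8 /* assumed value for older uapi headers */
#endif

/* Build the group device path; 0 on success, -1 if buf is too small. */
static int vfio_group_path(char *buf, size_t len, int group, int noiommu)
{
	int n;

	assert(buf != NULL);
	if (noiommu)
		n = snprintf(buf, len, "/dev/vfio/noiommu-%d", group);
	else
		n = snprintf(buf, len, "/dev/vfio/%d", group);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Ask a container (normally /dev/vfio/vfio) whether the no-iommu
 * backend is advertised. Returns 1 if yes, 0 if no, -1 if the
 * container cannot be opened or speaks an unknown API version.
 */
static int probe_noiommu(const char *container_path)
{
	int fd, ret;

	fd = open(container_path, O_RDWR);
	if (fd < 0)
		return -1;
	if (ioctl(fd, VFIO_GET_API_VERSION) != VFIO_API_VERSION) {
		close(fd);
		return -1;
	}
	ret = ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_NOIOMMU_IOMMU);
	close(fd);
	return ret > 0 ? 1 : 0;
}
```

With the v2 patch in this thread, a positive answer on an empty container only means the backend is possible; the definitive check is still whether VFIO_SET_IOMMU succeeds once the group is attached. In no-iommu mode the driver would also skip the DMA mapping calls, as noted earlier in the thread.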
  
Santosh Shukla Dec. 31, 2015, 2:30 p.m. UTC | #4
On Wed, Dec 23, 2015 at 4:49 PM, Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
> Hi Alex,
>
>> I've re-posted the unified patch upstream and it should start showing up in
>> the next linux-next build.  I expect the dpdk code won't be merged until
>> after this gets back into a proper kernel, but could we get the dpdk
>> modifications posted as rfc for others looking to try it?
>
> I have already posted a patch that should work with No-IOMMU.
>
> http://dpdk.org/dev/patchwork/patch/9619/
>
> Apologies for not CC-ing you. I too would be interested to know if other people are having any issues with the patch.
>

I tried this patch with the virtio-net PMD driver on arm64 and it worked
for me.  I haven't reviewed the patch, but functionally nothing broke in
my test environment; we'll review it as time permits.  For now:

Tested-by: Santosh Shukla <sshukla@mvista.com>


> Thanks,
> Anatoly
  

Patch

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index de632da..85a5793 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -99,7 +99,7 @@  struct vfio_device {
 
 #ifdef CONFIG_VFIO_NOIOMMU
 static bool noiommu __read_mostly;
-module_param_named(enable_unsafe_noiommu_support,
+module_param_named(enable_unsafe_noiommu_mode,
 		   noiommu, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(enable_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU mode.  This mode provides no device isolation, no DMA translation, no host kernel protection, cannot be used for device assignment to virtual machines, requires RAWIO permissions, and will taint the kernel.  If you do not know what this is for, step away. (default: false)");
 #endif
@@ -185,7 +185,7 @@  static long vfio_noiommu_ioctl(void *iommu_data,
 			       unsigned int cmd, unsigned long arg)
 {
 	if (cmd == VFIO_CHECK_EXTENSION)
-		return arg == VFIO_NOIOMMU_IOMMU ? 1 : 0;
+		return noiommu && (arg == VFIO_NOIOMMU_IOMMU) ? 1 : 0;
 
 	return -ENOTTY;
 }
@@ -207,7 +207,7 @@  static void vfio_noiommu_detach_group(void *iommu_data,
 {
 }
 
-static struct vfio_iommu_driver_ops vfio_noiommu_ops = {
+static const struct vfio_iommu_driver_ops vfio_noiommu_ops = {
 	.name = "vfio-noiommu",
 	.owner = THIS_MODULE,
 	.open = vfio_noiommu_open,
@@ -216,25 +216,6 @@  static struct vfio_iommu_driver_ops vfio_noiommu_ops = {
 	.attach_group = vfio_noiommu_attach_group,
 	.detach_group = vfio_noiommu_detach_group,
 };
-
-static struct vfio_iommu_driver vfio_noiommu_driver = {
-	.ops = &vfio_noiommu_ops,
-};
-
-/*
- * Wrap IOMMU drivers, the noiommu driver is the one and only driver for
- * noiommu groups (and thus containers) and not available for normal groups.
- */
-#define vfio_for_each_iommu_driver(con, pos)				\
-	for (pos = con->noiommu ? &vfio_noiommu_driver :		\
-	     list_first_entry(&vfio.iommu_drivers_list,			\
-			      struct vfio_iommu_driver, vfio_next);	\
-	     (con->noiommu ? pos != NULL :				\
-			&pos->vfio_next != &vfio.iommu_drivers_list);	\
-	      pos = con->noiommu ? NULL : list_next_entry(pos, vfio_next))
-#else
-#define vfio_for_each_iommu_driver(con, pos)				\
-	list_for_each_entry(pos, &vfio.iommu_drivers_list, vfio_next)
 #endif
 
 
@@ -999,7 +980,14 @@  static long vfio_ioctl_check_extension(struct vfio_container *container,
 		 */
 		if (!driver) {
 			mutex_lock(&vfio.iommu_drivers_lock);
-			vfio_for_each_iommu_driver(container, driver) {
+			list_for_each_entry(driver, &vfio.iommu_drivers_list,
+					    vfio_next) {
+
+				if (!list_empty(&container->group_list) &&
+				    (container->noiommu !=
+				     (driver->ops == &vfio_noiommu_ops)))
+					continue;
+
 				if (!try_module_get(driver->ops->owner))
 					continue;
 
@@ -1068,9 +1056,16 @@  static long vfio_ioctl_set_iommu(struct vfio_container *container,
 	}
 
 	mutex_lock(&vfio.iommu_drivers_lock);
-	vfio_for_each_iommu_driver(container, driver) {
+	list_for_each_entry(driver, &vfio.iommu_drivers_list, vfio_next) {
 		void *data;
 
+		/*
+		 * Only noiommu containers can use vfio-noiommu and noiommu
+		 * containers can only use vfio-noiommu.
+		 */
+		if (container->noiommu != (driver->ops == &vfio_noiommu_ops))
+			continue;
+
 		if (!try_module_get(driver->ops->owner))
 			continue;
 
@@ -1799,6 +1794,9 @@  static int __init vfio_init(void)
 	request_module_nowait("vfio_iommu_type1");
 	request_module_nowait("vfio_iommu_spapr_tce");
 
+#ifdef CONFIG_VFIO_NOIOMMU
+	vfio_register_iommu_driver(&vfio_noiommu_ops);
+#endif
 	return 0;
 
 err_cdev_add:
@@ -1815,6 +1813,9 @@  static void __exit vfio_cleanup(void)
 {
 	WARN_ON(!list_empty(&vfio.group_list));
 
+#ifdef CONFIG_VFIO_NOIOMMU
+	vfio_unregister_iommu_driver(&vfio_noiommu_ops);
+#endif
 	idr_destroy(&vfio.group_idr);
 	cdev_del(&vfio.group_cdev);
 	unregister_chrdev_region(vfio.group_devt, MINORMASK);