[dpdk-dev] [PATCH v6 08/11] eal: pci: introduce RTE_KDRV_VFIO_NOIOMMU driver mode

Santosh Shukla sshukla at mvista.com
Wed Jan 27 16:32:45 CET 2016


On Wed, Jan 27, 2016 at 4:11 PM, Santosh Shukla <sshukla at mvista.com> wrote:
> On Tue, Jan 26, 2016 at 9:51 PM, Santosh Shukla <sshukla at mvista.com> wrote:
>> On Tue, Jan 26, 2016 at 7:58 PM, Thomas Monjalon
>> <thomas.monjalon at 6wind.com> wrote:
>>> 2016-01-26 19:35, Santosh Shukla:
>>>> On Tue, Jan 26, 2016 at 6:30 PM, Thomas Monjalon
>>>> <thomas.monjalon at 6wind.com> wrote:
>>>> > 2016-01-26 15:56, Santosh Shukla:
>>>> >> In my observation, virtio currently works with vfio-noiommu; that's
>>>> >> why I said drv->kdrv needs to know the vfio mode.
>>>> >
>>>> > It is your observation. It may change in near future.
>>>>
>>>> So does that mean virtio support for non-x86 archs has to wait till then?
>>>
>>> No, absolutely not. virtio for non-x86 is welcome.
>>>
>>>> We have a working model with vfio-noiommu. Don't you think it makes
>>>> sense to let the vfio_noiommu implementation exist and later, in case
>>>> virtio+iommu gets mainlined, switch to a vfio __mode__ agnostic
>>>> approach? All that takes is replacing the __noiommu suffix with the
>>>> default.
>>>
>>> I'm just saying you should not touch the enum rte_kernel_driver.
>>> RTE_KDRV_VFIO is a driver.
>>> RTE_KDRV_VFIO_NOIOMMU is a mode.
>>> As the VFIO API is the same in both modes, there is no reason to
>>> distinguish them at this level.
>>> Your patch adds the NOIOMMU case everywhere:
>>>         case RTE_KDRV_VFIO:
>>> +       case RTE_KDRV_VFIO_NOIOMMU:
>>>
>>> I'll stop commenting here to let others give their opinion.
>>>
>>> [...]
>>>> >> With vfio+iommu, binding the virtio pci device to the vfio-pci driver
>>>> >> fails with the error below:
>>>> >> [   53.053464] VFIO - User Level meta-driver version: 0.3
>>>> >> [   73.077805] vfio-pci: probe of 0000:00:03.0 failed with error -22
>>>> >> [   73.077852] vfio-pci: probe of 0000:00:03.0 failed with error -22
>>>> >>
>>>> >> vfio_pci_probe() --> vfio_iommu_group_get() --> iommu_group_get()
>>>> >> fails: iommu doesn't have group for virtio pci device.
>>>> >
>>>> > Yes it fails when binding.
>>>> > So the later check in the virtio PMD is useless.
>>>>
>>>> Which check?
>>>
>>> The check for VFIO noiommu only:
>>> -       if (dev->kdrv == RTE_KDRV_VFIO)
>>> +       if (dev->kdrv == RTE_KDRV_VFIO_NOIOMMU)
>>>
>>> [...]
>>>> > Furthermore restricting virtio to no-iommu mode doesn't bring
>>>> > any improvement.
>>>>
>>>> We're not __restricting__; as soon as virtio+iommu reaches a working
>>>> state, we'll simply replace __noiommu with the default. Then it's up to
>>>> the user to try out virtio with vfio default or vfio_noiommu.
>>>
>>> Yes it's up to user.
>>> So your code should be
>>>         if (dev->kdrv == RTE_KDRV_VFIO)
>>>
>>
>> Right,
>>
>>>> > That's why I suggest to keep the initial semantic of kdrv and
>>>> > not pollute it with VFIO modes.
>>>>
>>>> I am okay to live with default and forget suffix __noiommu but there
>>>> are implementation problem which was discussed in other thread
>>>> - Virtio pmd driver should avoid interface parsing i.e.
>>>> virtio_resource_init_uio/vfio() etc.. For vfio case - We could easily
>>>> get rid of by moving /sys parsing to pci_eal layer, Right? If so then
>>>> virtio currently works with vfio-noiommu, it make sense to me that
>>>> pci_eal layer does parsing for pmd driver before that pmd driver get
>>>> initialized.
>>>
>>> Please reword. What is the problem?
>>>
>>>> - Another case could be: iommu-less-pmd-driver. eal layer to do
>>>> parsing before updating drv->kdrv.
>>>
>>> [...]
>>>> >> >> > If a check is needed, I would prefer using your function
>>>> >> >> > pci_vfio_is_noiommu() and remove driver modes from enum rte_kernel_driver.
>>>> >> >>
>>>> >> >> I don't think calling pci_vfio_is_noiommu() inside
>>>> >> >> virtio_reg_rd/wr_1/2/4() would be a good idea.
>>>> >> >
>>>> >> > Why? The value may be cached in the priv properties.
>>>> >> >
>>>> >> pci_vfio_is_noiommu() parses /sys for
>>>> >> - the enable_noiommu param
>>>> >> - whether the attached driver name is vfio-noiommu.
>>>> >>
>>>> >> It does file operations for that; I meant to say that calling this
>>>> >> API within the register_rd/wr functions is not correct. It would be
>>>> >> better if those low-level register_rd/wr APIs only check the driver type.
>>>> >
>>>> > Yes, that's why I said the return of pci_vfio_is_noiommu() may be cached
>>>> > to keep efficiency.
>>>>
>>>> I am not convinced though. I still find the pmd driver checking the
>>>> driver type using drv->kdrv a better approach than introducing a new
>>>> global variable, which may look something like:
>>>
>>> Not a global variable. A function in EAL layer. A variable in PMD priv.
>>>
>>
>> If we agree to use the condition (drv->kdrv == RTE_KDRV_VFIO),
>> then resource parsing for vfio (covering both the vfio and vfio_noiommu
>> cases) is forced into the virtio pmd driver layer, and that contradicts
>> what we agreed earlier in this[1] thread. Also, we wouldn't need a
>> function in the EAL layer or a variable in the PMD priv; perhaps just a
>> private function in the virtio pmd which does the parsing for the vfio
>> interface.
>>
>> Thoughts?
>>
>> [1] http://dpdk.org/dev/patchwork/patch/9862/
>>
>
> Any comment/feedback on above approach?
>

Since the approach in this patch (i.e. the _noiommu suffix) is blocking
acceptance of the patch series, I revisited it with the concerns raised
by Thomas/David in mind. To summarize the thread discussion:

1. virtio currently works with vfio+noiommu and will likely work with
vfio+iommu in the near future.
2. So remove the __noiommu suffix and always use the default
(RTE_KDRV_VFIO).
3. Introduce a global vfio resource parsing function. That function is
supposed to do the parsing for both the default vfio case and the
vfio-noiommu case. PMD drivers such as virtio will use it for resource
parsing.

I guess Yuan won't be happy with 3), because he wanted to get rid of
interface parsing in the pmd driver.
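
A minimal sketch of how point 3 might look, assuming a single helper
serves both the default vfio and the vfio-noiommu case and records the
mode once at init, so that register accessors never have to touch /sys.
The names vfio_res_info, read_bool_param and vfio_res_parse are
hypothetical and are not existing DPDK symbols. The parameter file path
is supplied by the caller, and the "Y"/"N" file format is an assumption
about how the kernel exposes the noiommu switch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-device state; not an existing DPDK structure. */
struct vfio_res_info {
	bool noiommu;	/* filled once at init, read-only afterwards */
};

/*
 * Read a boolean parameter file whose first character is assumed to be
 * 'Y' or 'N'. Returns 1 if enabled, 0 if disabled, -1 if unreadable.
 */
static int
read_bool_param(const char *path)
{
	FILE *f = fopen(path, "r");
	int c;

	if (f == NULL)
		return -1;
	c = fgetc(f);
	fclose(f);
	if (c == EOF)
		return -1;
	return (c == 'Y' || c == 'y' || c == '1');
}

/*
 * Hypothetical common entry point used for both vfio modes.
 * param_path is an assumption; the caller decides which file to read.
 */
static int
vfio_res_parse(struct vfio_res_info *info, const char *param_path)
{
	int ret = read_bool_param(param_path);

	/* A missing or unreadable file is treated as default vfio mode. */
	info->noiommu = (ret == 1);

	/* Resource parsing shared by both modes would follow here. */
	return 0;
}
```

With something along these lines the pmd driver would keep checking
only drv->kdrv == RTE_KDRV_VFIO, and the noiommu detail would stay
inside the parsing helper.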

Thomas, if 1/2/3 address your concerns then I'll spin the series.

Thanks.



>>>> At pci_eal layer ----
>>>> bool vfio_mode;
>>>> vfio_mode = pci_vfio_is_noiommu();
>>>>
>>>> At virtio pmd driver layer ----
>>>> Check the value of the vfio_mode variable before doing virtio_rd/wr
>>>> for the vfio interface.
>>>>
>>>> Instead, the virtio pmd driver doing
>>>>
>>>> virtio_reg_rd/wr_1/2/4()
>>>> {
>>>> if (drv->kdrv == VFIO)
>>>>       do pread()/pwrite()
>>>> else
>>>>       in()/out()
>>>> }
>>>>
>>>> is a better approach.
>>>>
>>>> Let me know if you still think the former is better than the latter;
>>>> then I'll send a patch revision right away.
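
The dispatch described in the pseudocode above could be sketched roughly
as follows for the 1-byte read case. This is only an illustration under
assumptions: kdrv_sketch stands in for the relevant values of enum
rte_kernel_driver, dev_sketch and reg_rd_1 are hypothetical names, and
the legacy in() path is represented by a caller-supplied callback so the
sketch stays independent of any architecture's port I/O.

```c
#define _POSIX_C_SOURCE 200809L	/* for pread() */
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for the relevant values of enum rte_kernel_driver. */
enum kdrv_sketch { KDRV_SKETCH_UIO, KDRV_SKETCH_VFIO };

/* Hypothetical device handle, not a DPDK structure. */
struct dev_sketch {
	enum kdrv_sketch kdrv;
	int fd;				/* device fd used on the vfio path */
	off_t bar_offset;		/* offset of the I/O region within fd */
	uint16_t io_base;		/* base port for the legacy path */
	uint8_t (*port_in8)(uint16_t port);	/* legacy in() callback */
};

/* Read one byte from register 'reg'; returns 0 on success, -1 on error. */
static int
reg_rd_1(const struct dev_sketch *dev, uint16_t reg, uint8_t *val)
{
	if (dev->kdrv == KDRV_SKETCH_VFIO) {
		/* vfio: register access goes through the device fd */
		ssize_t n = pread(dev->fd, val, sizeof(*val),
				  dev->bar_offset + reg);

		return n == (ssize_t)sizeof(*val) ? 0 : -1;
	}

	/* otherwise: legacy port I/O through the supplied callback */
	*val = dev->port_in8((uint16_t)(dev->io_base + reg));
	return 0;
}
```

The accessor only compares the driver type; it performs no /sys parsing
on the register path, which is the property argued for in the quoted
text.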
>>>
>>>

