[dpdk-dev] [PATCH v3 05/11] bus: get iommu class

Hemant Agrawal hemant.agrawal at nxp.com
Fri Jul 14 12:51:02 CEST 2017


On 7/14/2017 3:59 PM, santosh wrote:
> On Friday 14 July 2017 03:52 PM, santosh wrote:
>
>> On Friday 14 July 2017 03:09 PM, Hemant Agrawal wrote:
>>
>>> On 7/14/2017 2:00 PM, santosh wrote:
>>>> On Friday 14 July 2017 01:37 PM, Hemant Agrawal wrote:
>>>>
>>>>> On 7/11/2017 11:46 AM, Santosh Shukla wrote:
>>>>>> API(rte_bus_get_iommu_class) helps to automatically detect and select
>>>>>> appropriate iova mapping scheme for iommu capable device on that bus.
>>>>>>
>>>>>> Algorithm for iova scheme selection for bus:
>>>>>> 0. Iterate through bus_list.
>>>>>> 1. Collect each bus iova mode value and update into 'mode' var.
>>>>>> 2. Here value '1' is _pa and value '2' is _va mode.
>>>>>> So mode selection scheme is like:
>>>>>> if mode == 2 then iova mode is _va.
>>>>>> if mode == 1 then iova mode is _pa
>>>>>> if mode == 3 then iova mode is _pa.
>>>>>>
>>>>>> So mode !=2  will be default iova mode.
>>>>>>
>>>>>> Signed-off-by: Santosh Shukla <santosh.shukla at caviumnetworks.com>
>>>>>> Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
>>>>>> ---
>>>>>>  lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  1 +
>>>>>>  lib/librte_eal/common/eal_common_bus.c          | 23 +++++++++++++++++++++++
>>>>>>  lib/librte_eal/common/eal_common_pci.c          |  1 +
>>>>>>  lib/librte_eal/common/include/rte_bus.h         | 22 ++++++++++++++++++++++
>>>>>>  lib/librte_eal/linuxapp/eal/rte_eal_version.map |  1 +
>>>>>>  5 files changed, 48 insertions(+)
>>>>>>
>>>>>> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>> index 33c2c32c0..a2dd65a33 100644
>>>>>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>>>>>> @@ -202,6 +202,7 @@ DPDK_17.08 {
>>>>>>      rte_bus_find_by_name;
>>>>>>      rte_pci_match;
>>>>>>      rte_pci_get_iommu_class;
>>>>>> +    rte_bus_get_iommu_class;
>>>>>>
>>>>>>  } DPDK_17.05;
>>>>>>
>>>>>> diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
>>>>>> index 08bec2d93..5d5753ac9 100644
>>>>>> --- a/lib/librte_eal/common/eal_common_bus.c
>>>>>> +++ b/lib/librte_eal/common/eal_common_bus.c
>>>>>> @@ -222,3 +222,26 @@ rte_bus_find_by_device_name(const char *str)
>>>>>>          c[0] = '\0';
>>>>>>      return rte_bus_find(NULL, bus_can_parse, name);
>>>>>>  }
>>>>>> +
>>>>>> +
>>>>>> +/*
>>>>>> + * Get iommu class of devices on the bus.
>>>>>> + */
>>>>>> +enum rte_iova_mode
>>>>>> +rte_bus_get_iommu_class(void)
>>>>>> +{
>>>>>> +    int mode = 0;
>>>>>> +    struct rte_bus *bus;
>>>>>> +
>>>>>> +    TAILQ_FOREACH(bus, &rte_bus_list, next) {
>>>>>> +
>>>>>> +        if (bus->get_iommu_class)
>>>>>> +            mode |= bus->get_iommu_class();
>>>>>> +    }
>>>>>> +
>>>>> If you change the default return to '0' for buses, this code will work.
>>>>> e.g. PCI will return '0' when no device is probed and FSL MC will return VA, so the default mode will be 'VA'.
>>>>>
>>>> I'm confused why it won't work for fslmc case?
>>>>
>>>> Let me walk through the code:
>>>>
>>>> If no PCI device or (in future) no platform device is probed, then the bus
>>>> opts to use the default mapping scheme, which is iova_pa.
>>>>
>>>> Lets take PCI_bus example:
>>>> bus->get_iommu_class()
>>>>     ---> bus->_pci_get_iommu_class()
>>>>         * Now consider that no interface bound to any of PCI device, then
>>>>           it will return RTE_IOVA_PA mode to rte_bus layer (aka bus->get_iommu_class).
>>>>           So the iova mapping result from iommu_class scan is RTE_IOVA_PA (default).
>>>>           It works for PCI_bus case, tested for both iova_va and iova_pa case, no-pci device case.
>>>>
>>>> Now in fslmc bus case:
>>>> bus->get_iommu_class()
>>>>     ---> bus->_fslmc_get_iommu_class()
>>>>
>>>>         * IIUC your comment - You want fslmc bus to return RTE_IOVA_VA if no device
>>>>           detected, Right?
>>> why?
>>>
>> As I didn't understand your previous reply:
>> `e.g. PCI will return '0' - when no device is probed. FSL MC will return VA. the default mode will be 'VA'`
>>
>> So, I'm asking you that in fslmc bus case - if no device found then are you opting _va scheme or not?
>> Seems like _not_ per your below comment.
>>
>>
>>> If bus is just present but no device is in use for dpdk, then bus should return 0 and it *should not* participate in the IOMMU class decision.
>>>
>> I think I understand your point. For example, if no PCI device is found on the
>> first (PCI) bus but a device is found on the 2nd (platform) bus, then you don't
>> want to fall back to the default (_pa) mode; instead you want to use the 2nd
>> bus's mode for mapping, which is _va. Right?
>>
>> If so, then in my first version we did introduce a case called _DC.
>> _DC:0 --> stands for the no-device-found case.
>>
>>> Right now there are only two buses. There can be more buses. (e.g. PCI, platform, fslmc in case of dpaa2 as well).
>>>
>>> If the bus is not being used at all, why should it influence the decision of other buses?
>>>
>> If you're referring to the above case then I agree. We'll re-introduce the _DC state from v1 in the next revision.
>> That will look like
>> rte_pci_get_iommu_class() {
>> 	int mode = RTE_IOVA_DC; /* '0' */
>>
>> 	return _DC; /* if no device found */
>> }
>>
>> Right?

Yes! Thanks!

As I explained in the other thread, PCI devices can be present even
though none of them is used by DPDK:
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL:   probe driver: 8086:10d3 net_e1000_em
EAL:   Not managed by a supported kernel driver, skipped
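
To make the _DC idea concrete, here is a minimal sketch of what a
bus-level callback could look like. All names (ex_bus_mode, ex_dev,
ex_bus_iommu_class) are hypothetical and are not DPDK API. The numeric
values follow this thread (0 = _DC, 1 = _pa, 2 = _va), and the PA/VA rule
for bound devices is an assumption for illustration only:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the modes discussed in this thread. */
enum ex_bus_mode {
    EX_BUS_DC = 0, /* don't care: no device on this bus is used by DPDK */
    EX_BUS_PA = 1,
    EX_BUS_VA = 2
};

/* Hypothetical per-device summary, not a DPDK structure. */
struct ex_dev {
    int managed;     /* bound to a kernel driver that DPDK supports */
    int iommu_bound; /* bound to an IOMMU-capable driver */
};

/* Report the IOVA mode one bus asks for, or EX_BUS_DC if it has no say. */
enum ex_bus_mode
ex_bus_iommu_class(const struct ex_dev *devs, size_t n)
{
    enum ex_bus_mode mode = EX_BUS_DC;
    size_t i;

    assert(devs != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (!devs[i].managed)
            continue; /* skipped device, as in the EAL log above */
        if (!devs[i].iommu_bound)
            return EX_BUS_PA; /* one PA-only device forces PA */
        mode = EX_BUS_VA;
    }
    return mode;
}
```

With the device from the log above (present but skipped), this returns
EX_BUS_DC, so the bus stays out of the decision.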


>>
>>> if no bus has any device, the System default is anyway PA.
>>>
>> Right. If no bus is present then it's also the responsibility of `rte_bus_get_iommu_class`
>> to use the default mapping scheme, which is _pa, and it does so.
>>
>>>>           if so then your fslmc bus handle should do something like below
>>>>             -- If no device on fslmc bus : return RTE_IOVA_VA.
>>>>             -- If device detected on fslmc bus and bound to iommu driver : return RTE_IOVA_VA
>>>>             -- If device detected fslmc but not bound to iommu drv : return RTE_IOVA_PA..
>>>>
>>>> make sense? If not then can you describe fslmc mapping scheme?
>>>>
>>>>> If fslmc is not present, the default mode will be PA.
>>>>>
>>>>>> +    if (mode != RTE_IOVA_VA) {
>>>>>> +        /* Use default IOVA mode */
>>>>>> +        mode = RTE_IOVA_PA;
>>>>>> +    }
>>> The system default is anyway PA.
>>>
>> No, that check is needed for a case where the 1st bus returns _PA and the 2nd bus returns _VA.
>> Then mode = 3 (mix mode), which we don't support, so (as I mentioned before) it's the responsibility
>> of rte_bus_get_iommu_class() to return the default mode (_pa). That's why!
>>
>>
> Does your platform support `mix mode`? I asked the same question in thread [04/11] too.
> Let's say that dpaa2 supports mix mode: is it then OK if the bus opts for the default mapping
> in the mix mode case? Do you see any issue if the bus uses the default scheme for mix mode?
>
>

Yes! We can support mix mode. However, with your suggested changes in
the mempool etc. APIs, DPDK will no longer work for us in mix mode (when
both PCI and DPAA2 devices are available) with VA support only for DPAA2 :)

In the mix mode case, your logic to default to PA is already there. That
is fine.

But when PCI devices are not hooked to DPDK, we should be able to use
VA for dpaa2.
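
A minimal sketch of the selection rule as I understand it from this
thread, with a _DC (0) state added so that an unused bus does not
influence the result. The names (ex_iova_mode, ex_select_iova_mode) are
hypothetical and not the patch's API; the values 0/1/2 follow the
discussion above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; 0 = _DC, 1 = _pa, 2 = _va as in this thread. */
enum ex_iova_mode {
    EX_IOVA_DC = 0,
    EX_IOVA_PA = 1,
    EX_IOVA_VA = 2
};

/* Combine the per-bus answers into one system-wide IOVA mode. */
enum ex_iova_mode
ex_select_iova_mode(const enum ex_iova_mode *bus_modes, size_t n)
{
    int mode = EX_IOVA_DC;
    size_t i;

    assert(bus_modes != NULL || n == 0);

    /* A bus reporting _DC contributes no bits to the OR. */
    for (i = 0; i < n; i++)
        mode |= bus_modes[i];

    /* 0 (no bus cares), 1 (_pa) and 3 (mix mode) all fall back to _pa. */
    if (mode != EX_IOVA_VA)
        mode = EX_IOVA_PA;

    return (enum ex_iova_mode)mode;
}
```

With PCI reporting _DC and dpaa2 reporting _va, the result is _va. With
PCI reporting _pa and dpaa2 reporting _va, the OR is 3 and the result
falls back to _pa.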


