[dpdk-dev,v3,07/11] linuxapp/eal: auto detect iova mode

Message ID 20170711061631.5018-8-santosh.shukla@caviumnetworks.com (mailing list archive)
State Superseded, archived
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation fail Compilation issues

Commit Message

Santosh Shukla July 11, 2017, 6:16 a.m. UTC
  - Moving late bus scanning to up..just after eal_parsing.
- Auto detect iova mapping mode, based on the result of
  rte_bus_scan_iommu_class.

Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)
  

Comments

Hemant Agrawal July 13, 2017, 11:29 a.m. UTC | #1
On 7/11/2017 11:46 AM, Santosh Shukla wrote:
> - Moving late bus scanning to up..just after eal_parsing.
> - Auto detect iova mapping mode, based on the result of
>   rte_bus_scan_iommu_class.
>
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index 2546b55e4..7b4dd70de 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -799,6 +799,16 @@ rte_eal_init(int argc, char **argv)
>  		return -1;
>  	}
>
> +	if (rte_bus_scan()) {
> +		rte_eal_init_alert("Cannot scan the buses for devices\n");
> +		rte_errno = ENODEV;
> +		return -1;
> +	}
> +

The original place of the bus scan was with the following factors:
1. The bus scan requires the VFIO to be enabled atleast in dpaa2 case.
(VFIO code still need cleanup to be support non-pci cleanly). I tried 
moving it before bus_scan, this helped in bus scanning.

2. During SCAN, the bus may allocate memory to devices or for it's own 
usages.  rte_malloc or mempool is required in cases to support 
multi-process environment. (e.g. dpaa2 create dpbp or dpio device memory 
using the rte_malloc call).

Since none of the other rte library (mempool, memzone, tailq) is 
available at this point, it will create significant restriction on the 
bus scan.

We will prefer if you can re-introduce the "iova_mode" and allow the 
application choose, which mode it want to run.

This auto-detect logic may not work for many buses and it is going
to create serious restrictions on the bus_scan code.

> +	/* autodetect the iova mapping mode (default is iova_pa) */
> +	if (rte_bus_get_iommu_class() == RTE_IOVA_VA)
> +		rte_eal_get_configuration()->iova_mode = RTE_IOVA_VA;
> +
>  	if (internal_config.no_hugetlbfs == 0 &&
>  			internal_config.process_type != RTE_PROC_SECONDARY &&
>  			internal_config.xen_dom0_support == 0 &&
> @@ -896,12 +906,6 @@ rte_eal_init(int argc, char **argv)
>  		return -1;
>  	}
>
> -	if (rte_bus_scan()) {
> -		rte_eal_init_alert("Cannot scan the buses for devices\n");
> -		rte_errno = ENODEV;
> -		return -1;
> -	}
> -
>  	RTE_LCORE_FOREACH_SLAVE(i) {
>
>  		/*
>
  
Hemant Agrawal July 13, 2017, 11:45 a.m. UTC | #2
On 7/13/2017 4:59 PM, Hemant Agrawal wrote:
> On 7/11/2017 11:46 AM, Santosh Shukla wrote:
>> - Moving late bus scanning to up..just after eal_parsing.
>> - Auto detect iova mapping mode, based on the result of
>>   rte_bus_scan_iommu_class.
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>> ---
>>  lib/librte_eal/linuxapp/eal/eal.c | 16 ++++++++++------
>>  1 file changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c
>> b/lib/librte_eal/linuxapp/eal/eal.c
>> index 2546b55e4..7b4dd70de 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -799,6 +799,16 @@ rte_eal_init(int argc, char **argv)
>>          return -1;
>>      }
>>
>> +    if (rte_bus_scan()) {
>> +        rte_eal_init_alert("Cannot scan the buses for devices\n");
>> +        rte_errno = ENODEV;
>> +        return -1;
>> +    }
>> +
>
> The original place of the bus scan was with the following factors:
> 1. The bus scan requires the VFIO to be enabled atleast in dpaa2 case.
> (VFIO code still need cleanup to be support non-pci cleanly). I tried
> moving it before bus_scan, this helped in bus scanning.
>
> 2. During SCAN, the bus may allocate memory to devices or for it's own
> usages.  rte_malloc or mempool is required in cases to support
> multi-process environment. (e.g. dpaa2 create dpbp or dpio device memory
> using the rte_malloc call).
>
> Since none of the other rte library (mempool, memzone, tailq) is
> available at this point, it will create significant restriction on the
> bus scan.
>
> We will prefer if you can re-introduce the "iova_mode" and allow the
> application choose, which mode it want to run.
>
> This auto-detect logic may not work for many buses and it is going
> to create serious restrictions on the bus_scan code.
>

Is it possible that you offer a *rte_bus_pre_scan* kind of infra to 
detect the bus iommu class only. This way it will address all the concerns.

>> +    /* autodetect the iova mapping mode (default is iova_pa) */
>> +    if (rte_bus_get_iommu_class() == RTE_IOVA_VA)
>> +        rte_eal_get_configuration()->iova_mode = RTE_IOVA_VA;
>> +
>>      if (internal_config.no_hugetlbfs == 0 &&
>>              internal_config.process_type != RTE_PROC_SECONDARY &&
>>              internal_config.xen_dom0_support == 0 &&
>> @@ -896,12 +906,6 @@ rte_eal_init(int argc, char **argv)
>>          return -1;
>>      }
>>
>> -    if (rte_bus_scan()) {
>> -        rte_eal_init_alert("Cannot scan the buses for devices\n");
>> -        rte_errno = ENODEV;
>> -        return -1;
>> -    }
>> -
>>      RTE_LCORE_FOREACH_SLAVE(i) {
>>
>>          /*
>>
>
>
>
  
Santosh Shukla July 13, 2017, 6:25 p.m. UTC | #3
On Thursday 13 July 2017 04:59 PM, Hemant Agrawal wrote:

> On 7/11/2017 11:46 AM, Santosh Shukla wrote:
>> - Moving late bus scanning to up..just after eal_parsing.
>> - Auto detect iova mapping mode, based on the result of
>>   rte_bus_scan_iommu_class.
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>> ---
>>  lib/librte_eal/linuxapp/eal/eal.c | 16 ++++++++++------
>>  1 file changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
>> index 2546b55e4..7b4dd70de 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -799,6 +799,16 @@ rte_eal_init(int argc, char **argv)
>>          return -1;
>>      }
>>
>> +    if (rte_bus_scan()) {
>> +        rte_eal_init_alert("Cannot scan the buses for devices\n");
>> +        rte_errno = ENODEV;
>> +        return -1;
>> +    }
>> +
>
> The original place of the bus scan was with the following factors:
> 1. The bus scan requires the VFIO to be enabled atleast in dpaa2 case.
> (VFIO code still need cleanup to be support non-pci cleanly). I tried moving it before bus_scan, this helped in bus scanning.
>
bus_scan should do scanning, device enumeration, detecting devices and 
interface that device bound to, that interface could be VFIO, UIO, UIO_GENERIC etc..

PCI bus scanning (in eal/) strictly comply to what I mentioned above, thus
aut-detection works gracefully.

However fslmc_bus 'scan' doesn't do device scanning, instead It call vfio dependent
code which ideally should fall in 'resource mapping' category,. ideally should
happen at bus probe time.

Example: 
rte_fslmc_bus_scan()
	--> fslmc_vfio_setup_group
	--> fslmc_vfio_process_group

So it is doing _setup_ inside scan ops, which in PCI(/vfio-pci) case happens 
at probe time (`vfio_setup_device`).

In order to benefit iova auto-detection infrastructure: fslmc bus should
do to two things:

0) fslmc bus scan should look at /sys/bus/platform/drivers/vfio-platform/*
and find out that devices bind to vfio-platform or not, if yes then update kdrv
entry mentioning interface type example VFIO. That-way flsmc bus gets capability to
inform rte_bus about IOMMU capable interface. Right now, existing implementation
don't have means to inform rte_bus about his devices like pci_bus has!.

1) defer the vfio_seup from scan to bus->probe().

 

> 2. During SCAN, the bus may allocate memory to devices or for it's own usages.  rte_malloc or mempool is required in cases to support multi-process environment. (e.g. dpaa2 create dpbp or dpio device memory using the rte_malloc call).
>
If bus scanning adheres to device detection or enumeration then rte_malloc/mempool

not required, Example eal/pci bus scanning.


And in fslmc bus case: if vfio_setup deferred at bus->probe time then
bus->scan won't have memory dependency.

> Since none of the other rte library (mempool, memzone, tailq) is available at this point, it will create significant restriction on the bus scan.
>
> We will prefer if you can re-introduce the "iova_mode" and allow the application choose, which mode it want to run.
>
> This auto-detect logic may not work for many buses and it is going
> to create serious restrictions on the bus_scan code.
>
fslmc is only bus besides PCI. Auto-detection works gracefully for PCI-bus.
Can you give a try to said proposal?

Ideally vfio-platform code should sit into eal/vfio like eal/vfio-pci is.
Otherwise it will keep creating problems for new generic framework like we're
discussing one.

if said proposal doesn't work for you then I will re-introduce iova-mode as
eal arg, that will override iova mapping mode. But IMHO, eal arg should be
intermediate solution. Once vfio-platform code properly re-factored and merged,
We should remove those eal iova-mode args.

>> +    /* autodetect the iova mapping mode (default is iova_pa) */
>> +    if (rte_bus_get_iommu_class() == RTE_IOVA_VA)
>> +        rte_eal_get_configuration()->iova_mode = RTE_IOVA_VA;
>> +
>>      if (internal_config.no_hugetlbfs == 0 &&
>>              internal_config.process_type != RTE_PROC_SECONDARY &&
>>              internal_config.xen_dom0_support == 0 &&
>> @@ -896,12 +906,6 @@ rte_eal_init(int argc, char **argv)
>>          return -1;
>>      }
>>
>> -    if (rte_bus_scan()) {
>> -        rte_eal_init_alert("Cannot scan the buses for devices\n");
>> -        rte_errno = ENODEV;
>> -        return -1;
>> -    }
>> -
>>      RTE_LCORE_FOREACH_SLAVE(i) {
>>
>>          /*
>>
>
>
  
Hemant Agrawal July 14, 2017, 8:49 a.m. UTC | #4
On 7/13/2017 11:55 PM, santosh wrote:
> On Thursday 13 July 2017 04:59 PM, Hemant Agrawal wrote:
>
>> On 7/11/2017 11:46 AM, Santosh Shukla wrote:
>>> - Moving late bus scanning to up..just after eal_parsing.
>>> - Auto detect iova mapping mode, based on the result of
>>>   rte_bus_scan_iommu_class.
>>>
>>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>>> ---
>>>  lib/librte_eal/linuxapp/eal/eal.c | 16 ++++++++++------
>>>  1 file changed, 10 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
>>> index 2546b55e4..7b4dd70de 100644
>>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>>> @@ -799,6 +799,16 @@ rte_eal_init(int argc, char **argv)
>>>          return -1;
>>>      }
>>>
>>> +    if (rte_bus_scan()) {
>>> +        rte_eal_init_alert("Cannot scan the buses for devices\n");
>>> +        rte_errno = ENODEV;
>>> +        return -1;
>>> +    }
>>> +
>>
>> The original place of the bus scan was with the following factors:
>> 1. The bus scan requires the VFIO to be enabled atleast in dpaa2 case.
>> (VFIO code still need cleanup to be support non-pci cleanly). I tried moving it before bus_scan, this helped in bus scanning.
>>
> bus_scan should do scanning, device enumeration, detecting devices and
> interface that device bound to, that interface could be VFIO, UIO, UIO_GENERIC etc..
>
> PCI bus scanning (in eal/) strictly comply to what I mentioned above, thus
> aut-detection works gracefully.
>
> However fslmc_bus 'scan' doesn't do device scanning, instead It call vfio dependent
> code which ideally should fall in 'resource mapping' category,. ideally should
> happen at bus probe time.
>
> Example:
> rte_fslmc_bus_scan()
> 	--> fslmc_vfio_setup_group
> 	--> fslmc_vfio_process_group
>
> So it is doing _setup_ inside scan ops, which in PCI(/vfio-pci) case happens
> at probe time (`vfio_setup_device`).
>
> In order to benefit iova auto-detection infrastructure: fslmc bus should
> do to two things:
>
> 0) fslmc bus scan should look at /sys/bus/platform/drivers/vfio-platform/*
> and find out that devices bind to vfio-platform or not, if yes then update kdrv
> entry mentioning interface type example VFIO. That-way flsmc bus gets capability to
> inform rte_bus about IOMMU capable interface. Right now, existing implementation
> don't have means to inform rte_bus about his devices like pci_bus has!.
>
vfio_fsl_mc is bit different from pci, we first get the resource 
container and then look for resources as children.

In any case, the reworking of the bus is pending since the support for 
many other features are being extended  for non-pci buses as well in 
dpdk e.g. devargs.
It is in my priority list to clean it up for next release.


> 1) defer the vfio_seup from scan to bus->probe().

This is a good suggestion. This can solve the initialization issue.

>
>
>
>> 2. During SCAN, the bus may allocate memory to devices or for it's own usages.  rte_malloc or mempool is required in cases to support multi-process environment. (e.g. dpaa2 create dpbp or dpio device memory using the rte_malloc call).
>>
> If bus scanning adheres to device detection or enumeration then rte_malloc/mempool
>
> not required, Example eal/pci bus scanning.
>
>
> And in fslmc bus case: if vfio_setup deferred at bus->probe time then
> bus->scan won't have memory dependency.
>
>> Since none of the other rte library (mempool, memzone, tailq) is available at this point, it will create significant restriction on the bus scan.
>>
>> We will prefer if you can re-introduce the "iova_mode" and allow the application choose, which mode it want to run.
>>
>> This auto-detect logic may not work for many buses and it is going
>> to create serious restrictions on the bus_scan code.
>>
> fslmc is only bus besides PCI. Auto-detection works gracefully for PCI-bus.
> Can you give a try to said proposal?
>
> Ideally vfio-platform code should sit into eal/vfio like eal/vfio-pci is.
> Otherwise it will keep creating problems for new generic framework like we're
> discussing one.
>
> if said proposal doesn't work for you then I will re-introduce iova-mode as
> eal arg, that will override iova mapping mode. But IMHO, eal arg should be
> intermediate solution. Once vfio-platform code properly re-factored and merged,
> We should remove those eal iova-mode args.

Thanks for digging into the fslmc code.  As I said, this is now my 
priority item to get the fslmc bus code refactored. We will target 
immediately after 17.08 validation.

However till then, we can only support the iova-mode.

>
>>> +    /* autodetect the iova mapping mode (default is iova_pa) */
>>> +    if (rte_bus_get_iommu_class() == RTE_IOVA_VA)
>>> +        rte_eal_get_configuration()->iova_mode = RTE_IOVA_VA;
>>> +
>>>      if (internal_config.no_hugetlbfs == 0 &&
>>>              internal_config.process_type != RTE_PROC_SECONDARY &&
>>>              internal_config.xen_dom0_support == 0 &&
>>> @@ -896,12 +906,6 @@ rte_eal_init(int argc, char **argv)
>>>          return -1;
>>>      }
>>>
>>> -    if (rte_bus_scan()) {
>>> -        rte_eal_init_alert("Cannot scan the buses for devices\n");
>>> -        rte_errno = ENODEV;
>>> -        return -1;
>>> -    }
>>> -
>>>      RTE_LCORE_FOREACH_SLAVE(i) {
>>>
>>>          /*
>>>
>>
>>
>
>
  
Santosh Shukla July 14, 2017, 9:21 a.m. UTC | #5
On Friday 14 July 2017 02:19 PM, Hemant Agrawal wrote:

> On 7/13/2017 11:55 PM, santosh wrote:
>> On Thursday 13 July 2017 04:59 PM, Hemant Agrawal wrote:
>>
>>> On 7/11/2017 11:46 AM, Santosh Shukla wrote:
>>>> - Moving late bus scanning to up..just after eal_parsing.
>>>> - Auto detect iova mapping mode, based on the result of
>>>>   rte_bus_scan_iommu_class.
>>>>
>>>> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>>>> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>>>> ---
>>>>  lib/librte_eal/linuxapp/eal/eal.c | 16 ++++++++++------
>>>>  1 file changed, 10 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
>>>> index 2546b55e4..7b4dd70de 100644
>>>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>>>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>>>> @@ -799,6 +799,16 @@ rte_eal_init(int argc, char **argv)
>>>>          return -1;
>>>>      }
>>>>
>>>> +    if (rte_bus_scan()) {
>>>> +        rte_eal_init_alert("Cannot scan the buses for devices\n");
>>>> +        rte_errno = ENODEV;
>>>> +        return -1;
>>>> +    }
>>>> +
>>>
>>> The original place of the bus scan was with the following factors:
>>> 1. The bus scan requires the VFIO to be enabled atleast in dpaa2 case.
>>> (VFIO code still need cleanup to be support non-pci cleanly). I tried moving it before bus_scan, this helped in bus scanning.
>>>
>> bus_scan should do scanning, device enumeration, detecting devices and
>> interface that device bound to, that interface could be VFIO, UIO, UIO_GENERIC etc..
>>
>> PCI bus scanning (in eal/) strictly comply to what I mentioned above, thus
>> aut-detection works gracefully.
>>
>> However fslmc_bus 'scan' doesn't do device scanning, instead It call vfio dependent
>> code which ideally should fall in 'resource mapping' category,. ideally should
>> happen at bus probe time.
>>
>> Example:
>> rte_fslmc_bus_scan()
>>     --> fslmc_vfio_setup_group
>>     --> fslmc_vfio_process_group
>>
>> So it is doing _setup_ inside scan ops, which in PCI(/vfio-pci) case happens
>> at probe time (`vfio_setup_device`).
>>
>> In order to benefit iova auto-detection infrastructure: fslmc bus should
>> do to two things:
>>
>> 0) fslmc bus scan should look at /sys/bus/platform/drivers/vfio-platform/*
>> and find out that devices bind to vfio-platform or not, if yes then update kdrv
>> entry mentioning interface type example VFIO. That-way flsmc bus gets capability to
>> inform rte_bus about IOMMU capable interface. Right now, existing implementation
>> don't have means to inform rte_bus about his devices like pci_bus has!.
>>
> vfio_fsl_mc is bit different from pci, we first get the resource container and then look for resources as children.
>
> In any case, the reworking of the bus is pending since the support for many other features are being extended  for non-pci buses as well in dpdk e.g. devargs.
> It is in my priority list to clean it up for next release.
>
>
>> 1) defer the vfio_seup from scan to bus->probe().
>
> This is a good suggestion. This can solve the initialization issue.
>
>>
>>
>>
>>> 2. During SCAN, the bus may allocate memory to devices or for it's own usages.  rte_malloc or mempool is required in cases to support multi-process environment. (e.g. dpaa2 create dpbp or dpio device memory using the rte_malloc call).
>>>
>> If bus scanning adheres to device detection or enumeration then rte_malloc/mempool
>>
>> not required, Example eal/pci bus scanning.
>>
>>
>> And in fslmc bus case: if vfio_setup deferred at bus->probe time then
>> bus->scan won't have memory dependency.
>>
>>> Since none of the other rte library (mempool, memzone, tailq) is available at this point, it will create significant restriction on the bus scan.
>>>
>>> We will prefer if you can re-introduce the "iova_mode" and allow the application choose, which mode it want to run.
>>>
>>> This auto-detect logic may not work for many buses and it is going
>>> to create serious restrictions on the bus_scan code.
>>>
>> fslmc is only bus besides PCI. Auto-detection works gracefully for PCI-bus.
>> Can you give a try to said proposal?
>>
>> Ideally vfio-platform code should sit into eal/vfio like eal/vfio-pci is.
>> Otherwise it will keep creating problems for new generic framework like we're
>> discussing one.
>>
>> if said proposal doesn't work for you then I will re-introduce iova-mode as
>> eal arg, that will override iova mapping mode. But IMHO, eal arg should be
>> intermediate solution. Once vfio-platform code properly re-factored and merged,
>> We should remove those eal iova-mode args.
>
> Thanks for digging into the fslmc code.  As I said, this is now my priority item to get the fslmc bus code refactored. We will target immediately after 17.08 validation.
>
> However till then, we can only support the iova-mode.
>
Can you please try out said changes in your fslmc bus? and If it works for you then
we don't need to re-introduce iova-mode eal arg in future revision..
  

Patch

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2546b55e4..7b4dd70de 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -799,6 +799,16 @@  rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	if (rte_bus_scan()) {
+		rte_eal_init_alert("Cannot scan the buses for devices\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
+
+	/* autodetect the iova mapping mode (default is iova_pa) */
+	if (rte_bus_get_iommu_class() == RTE_IOVA_VA)
+		rte_eal_get_configuration()->iova_mode = RTE_IOVA_VA;
+
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
 			internal_config.xen_dom0_support == 0 &&
@@ -896,12 +906,6 @@  rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_bus_scan()) {
-		rte_eal_init_alert("Cannot scan the buses for devices\n");
-		rte_errno = ENODEV;
-		return -1;
-	}
-
 	RTE_LCORE_FOREACH_SLAVE(i) {
 
 		/*