[dpdk-dev,v9,2/9] eal/pci: get iommu class

Message ID 20170920112356.17629-3-santosh.shukla@caviumnetworks.com (mailing list archive)
State Superseded, archived
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Santosh Shukla Sept. 20, 2017, 11:23 a.m. UTC
  Introducing rte_pci_get_iommu_class API which helps to get iommu class
of PCI device on the bus and returns preferred iova mapping mode for
PCI bus.

Patch also adds rte_pci_get_iommu_class definition for:
- bsdapp: api returns default iova mode.
- linuxapp: Has stub implementation, Followup patch has complete
  implementation.

Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
v8 --> v9:
- Added linuxapp iova stub definition (Suugested by Anatoly)

v6 --> v7:
- squashed v6 series patch [02/12] & [03/12] (Aaron comment).

 lib/librte_eal/bsdapp/eal/eal_pci.c             | 10 ++++++++++
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  1 +
 lib/librte_eal/common/include/rte_bus.h         | 10 ++++++++++
 lib/librte_eal/common/include/rte_pci.h         | 11 +++++++++++
 lib/librte_eal/linuxapp/eal/eal_pci.c           |  9 +++++++++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  1 +
 6 files changed, 42 insertions(+)
  

Comments

Burakov, Anatoly Sept. 20, 2017, 11:39 a.m. UTC | #1
On 20-Sep-17 12:23 PM, Santosh Shukla wrote:
> Introducing rte_pci_get_iommu_class API which helps to get iommu class
> of PCI device on the bus and returns preferred iova mapping mode for
> PCI bus.
> 
> Patch also adds rte_pci_get_iommu_class definition for:
> - bsdapp: api returns default iova mode.
> - linuxapp: Has stub implementation, Followup patch has complete
>    implementation.
> 
> Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---

Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
  
Thomas Monjalon Oct. 5, 2017, 11:58 p.m. UTC | #2
This patch is introducing a new abstraction.
It is important to explain it for future readers of this code.

20/09/2017 13:23, Santosh Shukla:
> +/**
> + * IOVA mapping mode.
> + */

Please explain what IOVA means and what is the purpose of
distinguish the different modes.

> +enum rte_iova_mode {
> +	RTE_IOVA_DC = 0,	/* Don't care mode */
> +	RTE_IOVA_PA = (1 << 0),
> +	RTE_IOVA_VA = (1 << 1)
> +};

You should explain each value of the enum.
  
Santosh Shukla Oct. 6, 2017, 3:04 a.m. UTC | #3
Thomas,

You comment is annoying and infuriating both.
Patch is their for more than 4month, had enough time for you to comment
and understand the topic. Thorough review and testing has happened both.

NOTE: You have already delayed this series by one release and
I'm guessing that you intent to push by one more, if you had such
mundane question then why not ask before? Make me think that you are
wasting my time and effort both.

On Friday 06 October 2017 05:28 AM, Thomas Monjalon wrote:

> This patch is introducing a new abstraction.
> It is important to explain it for future readers of this code.

If you don't know - What is iova? How to program iova?
purpose of iova then should read and educate your know - how first.

Yes, its is introducing new abstraction, because dpdk from
ancient days does only one programming mode aka iova=pa.

note:You were still using iova mode as _pa (and didn't care to ask yourself about IOVA!)
which is one of iova mode too!.

However, IOMMU can also generate _va address too called iova=_va mode..
which is also correct/viable/applicable/Okiesh programming mode
for iommu capable HW like dma for example(Note again,.. AGNOSTIC behavior of iommu).

Now Why dpdk needs to understand IOVA programming philosophy:

Though DPDK was _silenty_ using iova as pa mode but then there
is a need arise to make mapping mode explicit and for that we need
abstraction since there wasn't one existed.

Reason:
Because From last few years,.ONA participants like Cavium, nxp
added ARM arch support in dpdk and included drivers for their HW.. 
and their hw has use-case (example external mempool), such a way that
programming those HW in iova as va mode would save cycle in fast path
(this part, we explained so many-1000 time in series and same understood by reviewer)
thus its vital to introduce iova infra in dpdk.

Same applicable for intel HW blocks too. Its works for intel too!

> 20/09/2017 13:23, Santosh Shukla:
>> +/**
>> + * IOVA mapping mode.
>> + */
> Please explain what IOVA means and what is the purpose of
> distinguish the different modes.
>
IOVA mapping mode is device aka iommu programming mode by which
HW(iommu) will generate _pa or _va address accordingly. 

>> +enum rte_iova_mode {
>> +	RTE_IOVA_DC = 0,	/* Don't care mode */
>> +	RTE_IOVA_PA = (1 << 0),
>> +	RTE_IOVA_VA = (1 << 1)
>> +};
> You should explain each value of the enum.

Aren't naming choice for each member of enum is self-explanatory?
I don't find logic anymore in your question? are you asking about side commenting?
if not then IFAIU, you question is basically about what is _pa and _va? if so then
reader should have little know-how before they intent to do fast-path programming.
Author can't write whole IOMMU spec for reader sake. Those are minute and mundate info
incase any user want to program device in _pa or _va. I'm at loss with you question,
I don;t see logic and it is frustrating to me. You had enough time for all this
in case you had really cared,, we have series for external PMD and drivers waiting
for iova infra, I see it a your move nothing bu blocking ONA series progress
Don;t you trust Reviewer in case you have hard time understaing topic and that
makese me to ask - Are you willing to accept this feature or not? if not then
I'm wasting my energy on it.

Thanks.
  
Thomas Monjalon Oct. 6, 2017, 7:24 a.m. UTC | #4
06/10/2017 05:04, santosh:
> Thomas,
> 
> You comment is annoying and infuriating both.
> Patch is their for more than 4month, had enough time for you to comment
> and understand the topic. Thorough review and testing has happened both.
> 
> NOTE: You have already delayed this series by one release and
> I'm guessing that you intent to push by one more, if you had such
> mundane question then why not ask before? Make me think that you are
> wasting my time and effort both.

You misunderstand me.
My intent is to push this patch.
A lot of people have reviewed it during this cycle.
I was just looking for wording details in order to ease people
when they will see this abstraction in the code base.

> On Friday 06 October 2017 05:28 AM, Thomas Monjalon wrote:
> 
> > This patch is introducing a new abstraction.
> > It is important to explain it for future readers of this code.
> 
> If you don't know - What is iova? How to program iova?
> purpose of iova then should read and educate your know - how first.
> 
> Yes, its is introducing new abstraction, because dpdk from
> ancient days does only one programming mode aka iova=pa.
> 
> note:You were still using iova mode as _pa (and didn't care to ask yourself about IOVA!)
> which is one of iova mode too!.
> 
> However, IOMMU can also generate _va address too called iova=_va mode..
> which is also correct/viable/applicable/Okiesh programming mode
> for iommu capable HW like dma for example(Note again,.. AGNOSTIC behavior of iommu).
> 
> Now Why dpdk needs to understand IOVA programming philosophy:
> 
> Though DPDK was _silenty_ using iova as pa mode but then there
> is a need arise to make mapping mode explicit and for that we need
> abstraction since there wasn't one existed.
> 
> Reason:
> Because From last few years,.ONA participants like Cavium, nxp
> added ARM arch support in dpdk and included drivers for their HW.. 
> and their hw has use-case (example external mempool), such a way that
> programming those HW in iova as va mode would save cycle in fast path
> (this part, we explained so many-1000 time in series and same understood by reviewer)
> thus its vital to introduce iova infra in dpdk.
> 
> Same applicable for intel HW blocks too. Its works for intel too!

I know all of that!
I was just thinking that you could add more explanations somewhere
in the code or the doc.

> > 20/09/2017 13:23, Santosh Shukla:
> >> +/**
> >> + * IOVA mapping mode.
> >> + */
> > Please explain what IOVA means and what is the purpose of
> > distinguish the different modes.
> >
> IOVA mapping mode is device aka iommu programming mode by which
> HW(iommu) will generate _pa or _va address accordingly.

In this doxygen block, it would be the right place to explain how the
IOVA mode will impact the rest of DPDK.

> >> +enum rte_iova_mode {
> >> +	RTE_IOVA_DC = 0,	/* Don't care mode */
> >> +	RTE_IOVA_PA = (1 << 0),
> >> +	RTE_IOVA_VA = (1 << 1)
> >> +};
> > You should explain each value of the enum.
> 
> Aren't naming choice for each member of enum is self-explanatory?
> I don't find logic anymore in your question? are you asking about side commenting?
> if not then IFAIU, you question is basically about what is _pa and _va? if so then
> reader should have little know-how before they intent to do fast-path programming.
> Author can't write whole IOMMU spec for reader sake. Those are minute and mundate info
> incase any user want to program device in _pa or _va. I'm at loss with you question,
> I don;t see logic and it is frustrating to me. You had enough time for all this
> in case you had really cared,, we have series for external PMD and drivers waiting
> for iova infra, I see it a your move nothing bu blocking ONA series progress
> Don;t you trust Reviewer in case you have hard time understaing topic and that
> makese me to ask - Are you willing to accept this feature or not? if not then
> I'm wasting my energy on it.

Santosh, I'm sorry if you don't understand that I was just asking for
a bit more doc.
You could just add something like
	/* DMA using physical address */
	/* DMA using virtual address */

Anyway, if you don't want to add any explanation, it won't prevent
pushing this patch.
  
Santosh Shukla Oct. 6, 2017, 9:13 a.m. UTC | #5
On Friday 06 October 2017 12:54 PM, Thomas Monjalon wrote:
> 06/10/2017 05:04, santosh:
>> Thomas,
>>
>> You comment is annoying and infuriating both.
>> Patch is their for more than 4month, had enough time for you to comment
>> and understand the topic. Thorough review and testing has happened both.
>>
>> NOTE: You have already delayed this series by one release and
>> I'm guessing that you intent to push by one more, if you had such
>> mundane question then why not ask before? Make me think that you are
>> wasting my time and effort both.
> You misunderstand me.
> My intent is to push this patch.
> A lot of people have reviewed it during this cycle.
> I was just looking for wording details in order to ease people
> when they will see this abstraction in the code base.
>
>> On Friday 06 October 2017 05:28 AM, Thomas Monjalon wrote:
>>
>>> This patch is introducing a new abstraction.
>>> It is important to explain it for future readers of this code.
>> If you don't know - What is iova? How to program iova?
>> purpose of iova then should read and educate your know - how first.
>>
>> Yes, its is introducing new abstraction, because dpdk from
>> ancient days does only one programming mode aka iova=pa.
>>
>> note:You were still using iova mode as _pa (and didn't care to ask yourself about IOVA!)
>> which is one of iova mode too!.
>>
>> However, IOMMU can also generate _va address too called iova=_va mode..
>> which is also correct/viable/applicable/Okiesh programming mode
>> for iommu capable HW like dma for example(Note again,.. AGNOSTIC behavior of iommu).
>>
>> Now Why dpdk needs to understand IOVA programming philosophy:
>>
>> Though DPDK was _silenty_ using iova as pa mode but then there
>> is a need arise to make mapping mode explicit and for that we need
>> abstraction since there wasn't one existed.
>>
>> Reason:
>> Because From last few years,.ONA participants like Cavium, nxp
>> added ARM arch support in dpdk and included drivers for their HW.. 
>> and their hw has use-case (example external mempool), such a way that
>> programming those HW in iova as va mode would save cycle in fast path
>> (this part, we explained so many-1000 time in series and same understood by reviewer)
>> thus its vital to introduce iova infra in dpdk.
>>
>> Same applicable for intel HW blocks too. Its works for intel too!
> I know all of that!
> I was just thinking that you could add more explanations somewhere
> in the code or the doc.
>
>>> 20/09/2017 13:23, Santosh Shukla:
>>>> +/**
>>>> + * IOVA mapping mode.
>>>> + */
>>> Please explain what IOVA means and what is the purpose of
>>> distinguish the different modes.
>>>
>> IOVA mapping mode is device aka iommu programming mode by which
>> HW(iommu) will generate _pa or _va address accordingly.

sending v10 with doc changes.

> In this doxygen block, it would be the right place to explain how the
> IOVA mode will impact the rest of DPDK.
>
>>>> +enum rte_iova_mode {
>>>> +	RTE_IOVA_DC = 0,	/* Don't care mode */
>>>> +	RTE_IOVA_PA = (1 << 0),
>>>> +	RTE_IOVA_VA = (1 << 1)
>>>> +};
>>> You should explain each value of the enum.
>> Aren't naming choice for each member of enum is self-explanatory?
>> I don't find logic anymore in your question? are you asking about side commenting?
>> if not then IFAIU, you question is basically about what is _pa and _va? if so then
>> reader should have little know-how before they intent to do fast-path programming.
>> Author can't write whole IOMMU spec for reader sake. Those are minute and mundate info
>> incase any user want to program device in _pa or _va. I'm at loss with you question,
>> I don;t see logic and it is frustrating to me. You had enough time for all this
>> in case you had really cared,, we have series for external PMD and drivers waiting
>> for iova infra, I see it a your move nothing bu blocking ONA series progress
>> Don;t you trust Reviewer in case you have hard time understaing topic and that
>> makese me to ask - Are you willing to accept this feature or not? if not then
>> I'm wasting my energy on it.
> Santosh, I'm sorry if you don't understand that I was just asking for
> a bit more doc.
> You could just add something like
> 	/* DMA using physical address */
> 	/* DMA using virtual address */

in v10.

Thanks.

> Anyway, if you don't want to add any explanation, it won't prevent
> pushing this patch.
  

Patch

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 04eacdcc7..e2c252320 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -403,6 +403,16 @@  rte_pci_scan(void)
 	return -1;
 }
 
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+	/* Supports only RTE_KDRV_NIC_UIO */
+	return RTE_IOVA_PA;
+}
+
 int
 pci_update_device(const struct rte_pci_addr *addr)
 {
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index cfbf8fbd0..c6ffd9399 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -243,5 +243,6 @@  DPDK_17.11 {
 	global:
 
 	rte_pci_match;
+	rte_pci_get_iommu_class;
 
 } DPDK_17.08;
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index c79368d3c..9e40687e5 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -55,6 +55,16 @@  extern "C" {
 /** Double linked list of buses */
 TAILQ_HEAD(rte_bus_list, rte_bus);
 
+
+/**
+ * IOVA mapping mode.
+ */
+enum rte_iova_mode {
+	RTE_IOVA_DC = 0,	/* Don't care mode */
+	RTE_IOVA_PA = (1 << 0),
+	RTE_IOVA_VA = (1 << 1)
+};
+
 /**
  * Bus specific scan for devices attached on the bus.
  * For each bus object, the scan would be responsible for finding devices and
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index eab84c7a4..0e36de093 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -381,6 +381,17 @@  int
 rte_pci_match(const struct rte_pci_driver *pci_drv,
 	      const struct rte_pci_device *pci_dev);
 
+
+/**
+ * Get iommu class of PCI devices on the bus.
+ * And return their preferred iova mapping mode.
+ *
+ * @return
+ *   - enum rte_iova_mode.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void);
+
 /**
  * Map the PCI device resources in user space virtual memory address
  *
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 8951ce742..26f2be822 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -487,6 +487,15 @@  rte_pci_scan(void)
 	return -1;
 }
 
+/*
+ * Get iommu class of pci devices on the bus.
+ */
+enum rte_iova_mode
+rte_pci_get_iommu_class(void)
+{
+	return RTE_IOVA_PA;
+}
+
 /* Read PCI config space. */
 int rte_pci_read_config(const struct rte_pci_device *device,
 		void *buf, size_t len, off_t offset)
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 287cc75cd..a8c8ea4f4 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -248,5 +248,6 @@  DPDK_17.11 {
 	global:
 
 	rte_pci_match;
+	rte_pci_get_iommu_class;
 
 } DPDK_17.08;