[v2,1/4] igb_uio: add wc option
Commit Message
Write combining (WC) increases NIC performance by making better use of
the PCI bus, but it cannot be used by all PMDs.
To obtain internal_addr, the memory needs to be mapped. But since memory
cannot be mapped both with and without WC at the same time, this mapping
should be skipped when WC is used. [1]
To avoid breaking other drivers that could potentially use internal_addr,
the wc_activate parameter makes it possible to skip the mapping for those
PMDs that do not use it.
[1] https://www.kernel.org/doc/ols/2008/ols2008v2-pages-135-144.pdf
section 5.3 and 5.4
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
kernel/linux/igb_uio/igb_uio.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
Comments
On 6/28/2018 2:15 PM, Rafal Kozik wrote:
> Write combining (WC) increases NIC performance by making better use of
> the PCI bus, but it cannot be used by all PMDs.
>
> To obtain internal_addr, the memory needs to be mapped. But since memory
> cannot be mapped both with and without WC at the same time, this mapping
> should be skipped when WC is used. [1]
>
> To avoid breaking other drivers that could potentially use internal_addr,
> the wc_activate parameter makes it possible to skip the mapping for those
> PMDs that do not use it.
>
> [1] https://www.kernel.org/doc/ols/2008/ols2008v2-pages-135-144.pdf
> section 5.3 and 5.4
Hi Rafal,
Thank you for the additional information, but I have a few more questions:
- What do you mean by "But as memory could not be mapped twice: with and without WC"?
ioremap() maps the physical address for kernel usage, and via uio we are mapping
it to userspace. Do you mean these two mappings?
- "internal_addr" should be for kernel usage, not for DPDK drivers, which are in
userspace. Why is it a concern for us?
- What happens if you don't update this code at all? Won't you be able to map
the device address into userspace?
I tested adding RTE_PCI_DRV_WC_ACTIVATE to i40e, on top of your patch, and was
able to map without the igb_uio update.
I do not understand the need for the modification.
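
For context on the userspace side of this question: Linux exposes each PCI BAR in sysfs as a file named resourceN and, for prefetchable BARs on architectures that support it, as resourceN_wc, which maps the BAR write-combined. The following is only a minimal sketch of choosing between the two paths. It is not DPDK's actual code, and the function name bar_sysfs_path, its parameters, and the PCI address used with it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the sysfs path of a PCI BAR resource file (hypothetical helper).
 * bdf: PCI address string, e.g. "0000:03:00.0" (illustrative value only)
 * bar: BAR index, 0..5
 * wc:  non-zero selects the write-combining variant "resourceN_wc"
 * Returns 0 on success, -1 on invalid arguments or a too-small buffer.
 */
static int bar_sysfs_path(char *buf, size_t len, const char *bdf,
			  int bar, int wc)
{
	int n;

	assert(buf != NULL && bdf != NULL);
	if (strlen(bdf) == 0 || bar < 0 || bar > 5)
		return -1;
	n = snprintf(buf, len, "/sys/bus/pci/devices/%s/resource%d%s",
		     bdf, bar, wc ? "_wc" : "");
	/* snprintf reports the length it wanted; treat truncation as failure */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would open() the returned path and mmap() it. Whether that mapping actually ends up write-combined is the point under discussion in this thread.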
>
> Signed-off-by: Rafal Kozik <rk@semihalf.com>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
> kernel/linux/igb_uio/igb_uio.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/linux/igb_uio/igb_uio.c b/kernel/linux/igb_uio/igb_uio.c
> index b3233f1..3382fb1 100644
> --- a/kernel/linux/igb_uio/igb_uio.c
> +++ b/kernel/linux/igb_uio/igb_uio.c
> @@ -30,6 +30,7 @@ struct rte_uio_pci_dev {
> int refcnt;
> };
>
> +static int wc_activate;
> static char *intr_mode;
> static enum rte_intr_mode igbuio_intr_mode_preferred = RTE_INTR_MODE_MSIX;
> /* sriov sysfs */
> @@ -375,9 +376,13 @@ igbuio_pci_setup_iomem(struct pci_dev *dev, struct uio_info *info,
> len = pci_resource_len(dev, pci_bar);
> if (addr == 0 || len == 0)
> return -1;
> - internal_addr = ioremap(addr, len);
> - if (internal_addr == NULL)
> - return -1;
> + if (wc_activate == 0) {
> + internal_addr = ioremap(addr, len);
> + if (internal_addr == NULL)
> + return -1;
> + } else {
> + internal_addr = NULL;
> + }
> info->mem[n].name = name;
> info->mem[n].addr = addr;
> info->mem[n].internal_addr = internal_addr;
> @@ -650,6 +655,12 @@ MODULE_PARM_DESC(intr_mode,
> " " RTE_INTR_MODE_LEGACY_NAME " Use Legacy interrupt\n"
> "\n");
>
> +module_param(wc_activate, int, 0);
> +MODULE_PARM_DESC(wc_activate,
> +"Activate support for write combining (WC) (default=0)\n"
> +" 0 - disable\n"
> +" other - enable\n");
> +
> MODULE_DESCRIPTION("UIO driver for Intel IGB PCI cards");
> MODULE_LICENSE("GPL");
> MODULE_AUTHOR("Intel Corporation");
>
2018-06-28 16:32 GMT+02:00 Ferruh Yigit <ferruh.yigit@intel.com>:
> On 6/28/2018 2:15 PM, Rafal Kozik wrote:
>> Write combining (WC) increases NIC performance by making better use of
>> the PCI bus, but it cannot be used by all PMDs.
>>
>> To obtain internal_addr, the memory needs to be mapped. But since memory
>> cannot be mapped both with and without WC at the same time, this mapping
>> should be skipped when WC is used. [1]
>>
>> To avoid breaking other drivers that could potentially use internal_addr,
>> the wc_activate parameter makes it possible to skip the mapping for those
>> PMDs that do not use it.
>>
>> [1] https://www.kernel.org/doc/ols/2008/ols2008v2-pages-135-144.pdf
>> section 5.3 and 5.4
>
> Hi Rafal,
>
> Thank you for the additional information, but I have a few more questions:
>
> - What do you mean by "But as memory could not be mapped twice: with and without WC"?
>
> ioremap() maps the physical address for kernel usage, and via uio we are mapping
> it to userspace. Do you mean these two mappings?
>
> - "internal_addr" should be for kernel usage, not for DPDK drivers, which are in
> userspace. Why is it a concern for us?
>
> - What happens if you don't update this code at all? Won't you be able to map
> the device address into userspace?
> I tested adding RTE_PCI_DRV_WC_ACTIVATE to i40e, on top of your patch, and was
> able to map without the igb_uio update.
> I do not understand the need for the modification.
>
Hello Ferruh,
I was not precise. Memory can be mapped multiple times,
but it cannot be mapped with and without WC support simultaneously.
When wc_activate is not set, the memory mapping works, but it silently
falls back to non-prefetchable mode.
I measured the write speed.
With the wc_activate parameter set, I got 4.81 GB/s.
Without this parameter, the result was 0.07 GB/s.
Code used for testing is located here:
gist.github.com/semihalf-kozik-rafal/327208cd52a2fac2d12250028becf9b3
Best regards,
Rafal
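
To illustrate the kind of measurement described above, here is a minimal sketch of timing sequential 64-bit stores and reporting GB/s. It is not the code from the gist linked above. The function name write_gbps is hypothetical, and the destination is assumed to be any writable buffer standing in for an mmap'ed BAR, so numbers from ordinary RAM say nothing about WC behavior.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Time `iters` passes of 64-bit stores over `dst` and return the
 * throughput in GB/s (1 GB = 1e9 bytes). Returns 0.0 if the clock is
 * unavailable or the elapsed time is too short to measure.
 */
static double write_gbps(volatile uint64_t *dst, size_t words, int iters)
{
	struct timespec t0, t1;
	double secs;

	assert(dst != NULL && words > 0 && iters > 0);
	if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
		return 0.0;
	/* volatile keeps the compiler from eliding the stores */
	for (int it = 0; it < iters; it++)
		for (size_t i = 0; i < words; i++)
			dst[i] = (uint64_t)i;
	if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
		return 0.0;
	secs = (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	if (secs <= 0.0)
		return 0.0;
	return (double)words * sizeof(uint64_t) * iters / secs / 1e9;
}
```

Running the same loop against a BAR mapped with and without WC would give the two figures to compare. A real test should also account for store ordering and flushing of the write-combining buffers, which this sketch leaves out.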
>>
>> Signed-off-by: Rafal Kozik <rk@semihalf.com>
>> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
>> ---
>> kernel/linux/igb_uio/igb_uio.c | 17 ++++++++++++++---
>> 1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/linux/igb_uio/igb_uio.c b/kernel/linux/igb_uio/igb_uio.c
>> index b3233f1..3382fb1 100644
>> --- a/kernel/linux/igb_uio/igb_uio.c
>> +++ b/kernel/linux/igb_uio/igb_uio.c
>> @@ -30,6 +30,7 @@ struct rte_uio_pci_dev {
>> int refcnt;
>> };
>>
>> +static int wc_activate;
>> static char *intr_mode;
>> static enum rte_intr_mode igbuio_intr_mode_preferred = RTE_INTR_MODE_MSIX;
>> /* sriov sysfs */
>> @@ -375,9 +376,13 @@ igbuio_pci_setup_iomem(struct pci_dev *dev, struct uio_info *info,
>> len = pci_resource_len(dev, pci_bar);
>> if (addr == 0 || len == 0)
>> return -1;
>> - internal_addr = ioremap(addr, len);
>> - if (internal_addr == NULL)
>> - return -1;
>> + if (wc_activate == 0) {
>> + internal_addr = ioremap(addr, len);
>> + if (internal_addr == NULL)
>> + return -1;
>> + } else {
>> + internal_addr = NULL;
>> + }
>> info->mem[n].name = name;
>> info->mem[n].addr = addr;
>> info->mem[n].internal_addr = internal_addr;
>> @@ -650,6 +655,12 @@ MODULE_PARM_DESC(intr_mode,
>> " " RTE_INTR_MODE_LEGACY_NAME " Use Legacy interrupt\n"
>> "\n");
>>
>> +module_param(wc_activate, int, 0);
>> +MODULE_PARM_DESC(wc_activate,
>> +"Activate support for write combining (WC) (default=0)\n"
>> +" 0 - disable\n"
>> +" other - enable\n");
>> +
>> MODULE_DESCRIPTION("UIO driver for Intel IGB PCI cards");
>> MODULE_LICENSE("GPL");
>> MODULE_AUTHOR("Intel Corporation");
>>
>
On 6/29/2018 9:35 AM, Rafał Kozik wrote:
> 2018-06-28 16:32 GMT+02:00 Ferruh Yigit <ferruh.yigit@intel.com>:
>> On 6/28/2018 2:15 PM, Rafal Kozik wrote:
>>> Write combining (WC) increases NIC performance by making better use of
>>> the PCI bus, but it cannot be used by all PMDs.
>>>
>>> To obtain internal_addr, the memory needs to be mapped. But since memory
>>> cannot be mapped both with and without WC at the same time, this mapping
>>> should be skipped when WC is used. [1]
>>>
>>> To avoid breaking other drivers that could potentially use internal_addr,
>>> the wc_activate parameter makes it possible to skip the mapping for those
>>> PMDs that do not use it.
>>>
>>> [1] https://www.kernel.org/doc/ols/2008/ols2008v2-pages-135-144.pdf
>>> section 5.3 and 5.4
>>
>> Hi Rafal,
>>
>> Thank you for the additional information, but I have a few more questions:
>>
>> - What do you mean by "But as memory could not be mapped twice: with and without WC"?
>>
>> ioremap() maps the physical address for kernel usage, and via uio we are mapping
>> it to userspace. Do you mean these two mappings?
>>
>> - "internal_addr" should be for kernel usage, not for DPDK drivers, which are in
>> userspace. Why is it a concern for us?
>>
>> - What happens if you don't update this code at all? Won't you be able to map
>> the device address into userspace?
>> I tested adding RTE_PCI_DRV_WC_ACTIVATE to i40e, on top of your patch, and was
>> able to map without the igb_uio update.
>> I do not understand the need for the modification.
>>
>
> Hello Ferruh,
>
> I was not precise. Memory can be mapped multiple times,
> but it cannot be mapped with and without WC support simultaneously.
> When wc_activate is not set, the memory mapping works, but it silently
> falls back to non-prefetchable mode.
How can I confirm this silent fallback behavior? Is there any log I can turn
on in the kernel, or anything in proc/sysfs?
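
One possible way to check, offered as an assumption rather than something confirmed in this thread: on x86 kernels with PAT and debugfs mounted, /sys/kernel/debug/x86/pat_memtype_list lists the memory type in use for each tracked physical range. The sketch below parses one line of that file, assuming the "<type> @ 0x<start>-0x<end>" layout shown in the kernel's PAT documentation of this period; the layout may differ between kernel versions. The function names are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse "<type> @ 0x<start>-0x<end>"; return 1 on success, 0 otherwise. */
static int parse_memtype_line(const char *line, char type[32],
			      uint64_t *start, uint64_t *end)
{
	assert(line != NULL);
	return sscanf(line, "%31s @ 0x%" SCNx64 "-0x%" SCNx64,
		      type, start, end) == 3;
}

/* Return 1 if `line` marks a write-combining range that contains `addr`. */
static int line_marks_wc(const char *line, uint64_t addr)
{
	char type[32];
	uint64_t start, end;

	if (!parse_memtype_line(line, type, &start, &end))
		return 0;
	/* the end address is treated as exclusive */
	return strcmp(type, "write-combining") == 0 &&
	       addr >= start && addr < end;
}
```

Feeding each line of the file to line_marks_wc() together with the BAR's physical address (for example, as reported by lspci -v) would show whether the range is tracked as write-combining or as something else, such as uncached-minus.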
>
> I measured the write speed.
> With the wc_activate parameter set, I got 4.81 GB/s.
> Without this parameter, the result was 0.07 GB/s.
> Code used for testing is located here:
> gist.github.com/semihalf-kozik-rafal/327208cd52a2fac2d12250028becf9b3
>
> Best regards,
> Rafal
>
>>>
>>> Signed-off-by: Rafal Kozik <rk@semihalf.com>
>>> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
>>> ---
>>> kernel/linux/igb_uio/igb_uio.c | 17 ++++++++++++++---
>>> 1 file changed, 14 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/kernel/linux/igb_uio/igb_uio.c b/kernel/linux/igb_uio/igb_uio.c
>>> index b3233f1..3382fb1 100644
>>> --- a/kernel/linux/igb_uio/igb_uio.c
>>> +++ b/kernel/linux/igb_uio/igb_uio.c
>>> @@ -30,6 +30,7 @@ struct rte_uio_pci_dev {
>>> int refcnt;
>>> };
>>>
>>> +static int wc_activate;
>>> static char *intr_mode;
>>> static enum rte_intr_mode igbuio_intr_mode_preferred = RTE_INTR_MODE_MSIX;
>>> /* sriov sysfs */
>>> @@ -375,9 +376,13 @@ igbuio_pci_setup_iomem(struct pci_dev *dev, struct uio_info *info,
>>> len = pci_resource_len(dev, pci_bar);
>>> if (addr == 0 || len == 0)
>>> return -1;
>>> - internal_addr = ioremap(addr, len);
>>> - if (internal_addr == NULL)
>>> - return -1;
>>> + if (wc_activate == 0) {
>>> + internal_addr = ioremap(addr, len);
>>> + if (internal_addr == NULL)
>>> + return -1;
>>> + } else {
>>> + internal_addr = NULL;
>>> + }
>>> info->mem[n].name = name;
>>> info->mem[n].addr = addr;
>>> info->mem[n].internal_addr = internal_addr;
>>> @@ -650,6 +655,12 @@ MODULE_PARM_DESC(intr_mode,
>>> " " RTE_INTR_MODE_LEGACY_NAME " Use Legacy interrupt\n"
>>> "\n");
>>>
>>> +module_param(wc_activate, int, 0);
>>> +MODULE_PARM_DESC(wc_activate,
>>> +"Activate support for write combining (WC) (default=0)\n"
>>> +" 0 - disable\n"
>>> +" other - enable\n");
>>> +
>>> MODULE_DESCRIPTION("UIO driver for Intel IGB PCI cards");
>>> MODULE_LICENSE("GPL");
>>> MODULE_AUTHOR("Intel Corporation");
>>>
>>
@@ -30,6 +30,7 @@ struct rte_uio_pci_dev {
 	int refcnt;
 };
 
+static int wc_activate;
 static char *intr_mode;
 static enum rte_intr_mode igbuio_intr_mode_preferred = RTE_INTR_MODE_MSIX;
 /* sriov sysfs */
@@ -375,9 +376,13 @@ igbuio_pci_setup_iomem(struct pci_dev *dev, struct uio_info *info,
 	len = pci_resource_len(dev, pci_bar);
 	if (addr == 0 || len == 0)
 		return -1;
-	internal_addr = ioremap(addr, len);
-	if (internal_addr == NULL)
-		return -1;
+	if (wc_activate == 0) {
+		internal_addr = ioremap(addr, len);
+		if (internal_addr == NULL)
+			return -1;
+	} else {
+		internal_addr = NULL;
+	}
 	info->mem[n].name = name;
 	info->mem[n].addr = addr;
 	info->mem[n].internal_addr = internal_addr;
@@ -650,6 +655,12 @@ MODULE_PARM_DESC(intr_mode,
 " " RTE_INTR_MODE_LEGACY_NAME " Use Legacy interrupt\n"
 "\n");
 
+module_param(wc_activate, int, 0);
+MODULE_PARM_DESC(wc_activate,
+"Activate support for write combining (WC) (default=0)\n"
+" 0 - disable\n"
+" other - enable\n");
+
 MODULE_DESCRIPTION("UIO driver for Intel IGB PCI cards");
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Intel Corporation");