[PATCH] net/mlx5: enable PCI related counters

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Wed Feb 14 01:52:48 CET 2024



> On Feb 13, 2024, at 10:13 AM, Dariusz Sosnowski <dsosnowski at nvidia.com> wrote:
> 
>> -----Original Message-----
>> From: Stephen Hemminger <stephen at networkplumber.org>
>> Sent: Saturday, February 10, 2024 02:33
>> To: Wathsala Vithanage <wathsala.vithanage at arm.com>
>> Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas at monjalon.net>;
>> Dariusz Sosnowski <dsosnowski at nvidia.com>; Slava Ovsiienko
>> <viacheslavo at nvidia.com>; Ori Kam <orika at nvidia.com>; Suanming Mou
>> <suanmingm at nvidia.com>; Matan Azrad <matan at nvidia.com>;
>> dev at dpdk.org; nd at arm.com; Honnappa Nagarahalli
>> <honnappa.nagarahalli at arm.com>
>> Subject: Re: [PATCH] net/mlx5: enable PCI related counters
>> 
>> On Fri,  9 Feb 2024 20:41:42 +0000
>> Wathsala Vithanage <wathsala.vithanage at arm.com> wrote:
>> 
>>> Versions of Mellanox NICs starting from CX5 have device counters
>>> related to PCI. These counters are helpful in debugging IO
>>> bottlenecks. For instance, the outbound_pci_stalled_rd and
>>> outbound_pci_stalled_wr counters can help with identifying NIC stalls
>>> due to insufficient PCI credits, which otherwise would have required a
>>> PCI analyzer or a sophisticated PCI root port with a PMU.
>>> Currently none of these are available in the MLX5 PMD even though
>>> ethtool is capable of reading some of them.
>>> Since PMD uses the same ioctl used by ethtool (SIOCETHTOOL) and reads
>>> via the kernel driver it is possible to add support with ease.
>>> There is one more PCI related counter and a device counter that aren't
>>> implemented in the Linux driver at the moment. These two are named
>>> outbound_pci_buffer_overflow and dev_out_of_buffer respectively. As
>>> per Nvidia's documentation these two counters can tell the number of
>>> packets dropped due to pci buffer overflow and the number of times the
>>> device owned queue had not enough buffers allocated.
>>> 
>>> Signed-off-by: Wathsala Vithanage <wathsala.vithanage at arm.com>
>>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
>> 
>> Would it be possible to do this at PCI bus layer so all PCI devices have that
>> feature?
> The PCIe performance counters mentioned here are exposed by the NIC itself, and the mlx5 kernel driver just passes them to userspace.
> If such a feature were added at the PCI bus layer, we would need to use (or add) some additional infrastructure.
> I'm not familiar with what Linux kernel exposes in terms of PCI counters. It's worth looking into.
> I'd assume such data can probably be extracted through PMU.
In our investigation, we did not find anything in Linux that exposes PCIe PMU counters for the PCIe root port. The best we found were these PCIe counters as seen by the NIC.

It would be good to see other NICs provide similar counters, and additional ones if they have any.



> 
> Best regards,
> Dariusz Sosnowski


