[PATCH v2] net/mlx5: enable PCI related counters

Wathsala Vithanage wathsala.vithanage at arm.com
Wed Feb 14 21:14:55 CET 2024


Mellanox NICs from CX5 onward have device counters related to
PCI. These counters are helpful in debugging IO bottlenecks. For
instance, the outbound_pci_stalled_rd and
outbound_pci_stalled_wr counters can help identify NIC stalls
due to insufficient PCI credits, which would otherwise require
a PCI analyzer or a sophisticated PCI root port with a PMU.
Currently, none of these are available in the MLX5 PMD, even
though ethtool is capable of reading some of them.
Since the PMD uses the same ioctl as ethtool (SIOCETHTOOL) and
reads the counters via the kernel driver, support is easy to
add.
There are two more counters, one PCI related and one device
counter, that aren't implemented in the Linux driver at the
moment. These two are named outbound_pci_buffer_overflow and
dev_out_of_buffer respectively. As per Nvidia's documentation,
they report the number of packets dropped due to PCI buffer
overflow and the number of times the device owned queue did
not have enough buffers allocated.

Signed-off-by: Wathsala Vithanage <wathsala.vithanage at arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski at nvidia.com>
---
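Note for readers: the sketch below illustrates the kind of name
matching that a table of dpdk_name/ctr_name pairs makes possible
when counter strings come from the kernel driver. It is a
simplified, self-contained illustration and not the PMD's actual
code: struct counter_ctrl, map_counters() and read_counters() are
hypothetical names, and the kernel string list is passed in as a
plain array instead of being fetched with SIOCETHTOOL. Reporting 0
for a counter the kernel does not expose is a choice made for this
sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the PMD's counter table entry. */
struct counter_ctrl {
	const char *dpdk_name; /* name exposed to the application */
	const char *ctr_name;  /* name reported by the kernel driver */
};

/* A few of the counters added by this patch. */
static const struct counter_ctrl counters[] = {
	{ "outbound_pci_stalled_rd", "outbound_pci_stalled_rd" },
	{ "outbound_pci_stalled_wr", "outbound_pci_stalled_wr" },
	{ "outbound_pci_buffer_overflow", "outbound_pci_buffer_overflow" },
	{ "dev_out_of_buffer", "dev_out_of_buffer" },
};

#define N_COUNTERS (sizeof(counters) / sizeof(counters[0]))

/*
 * For each table entry, store the index of the matching kernel string
 * in map[], or -1 when the kernel does not report that counter.
 * Returns the number of table entries that were matched.
 */
static size_t
map_counters(const char *const *kernel_names, size_t n_names,
	     int map[N_COUNTERS])
{
	size_t matched = 0;

	assert(kernel_names != NULL || n_names == 0);
	for (size_t i = 0; i < N_COUNTERS; i++) {
		map[i] = -1;
		for (size_t j = 0; j < n_names; j++) {
			if (strcmp(counters[i].ctr_name, kernel_names[j]) == 0) {
				map[i] = (int)j;
				matched++;
				break;
			}
		}
	}
	return matched;
}

/* Copy matched values into out[]; unmatched counters read as 0 here. */
static void
read_counters(const uint64_t *kernel_values, const int map[N_COUNTERS],
	      uint64_t out[N_COUNTERS])
{
	for (size_t i = 0; i < N_COUNTERS; i++)
		out[i] = map[i] >= 0 ? kernel_values[map[i]] : 0;
}
```

With a kernel string list that contains only the two stalled
counters, map_counters() would match two entries and leave
outbound_pci_buffer_overflow and dev_out_of_buffer at -1, which
mirrors the situation described above for counters the Linux
driver does not implement yet.
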
 .mailmap                                |  1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c | 41 +++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/.mailmap b/.mailmap
index aa569ff456..f57415f7a1 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1510,6 +1510,7 @@ Walter Heymans <walter.heymans at corigine.com>
 Wang Sheng-Hui <shhuiw at gmail.com>
 Wangyu (Eric) <seven.wangyu at huawei.com>
 Waterman Cao <waterman.cao at intel.com>
+Wathsala Vithanage <wathsala.vithanage at arm.com>
 Weichun Chen <weichunx.chen at intel.com>
 Wei Dai <wei.dai at intel.com>
 Weifeng Li <liweifeng96 at 126.com>
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index dd5a0c546d..c837c862a8 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1574,6 +1574,47 @@ static const struct mlx5_counter_ctrl mlx5_counters_init[] = {
 		.dpdk_name = "tx_vport_bytes",
 		.ctr_name = "vport_tx_bytes",
 	},
+	/**
+	 * Device counters: These counters are for the
+	 * entire PCI device (NIC). They do not count
+	 * on a per port/queue basis.
+	 * Values reported by these counters may not be
+	 * useful if the device is bifurcated and queues
+	 * are shared with the kernel or other DPDK
+	 * applications.
+	 */
+	{
+		.dpdk_name = "rx_pci_signal_integrity",
+		.ctr_name = "rx_pci_signal_integrity",
+	},
+	{
+		.dpdk_name = "tx_pci_signal_integrity",
+		.ctr_name = "tx_pci_signal_integrity",
+	},
+	{
+		.dpdk_name = "outbound_pci_buffer_overflow",
+		.ctr_name = "outbound_pci_buffer_overflow",
+	},
+	{
+		.dpdk_name = "outbound_pci_stalled_rd",
+		.ctr_name = "outbound_pci_stalled_rd",
+	},
+	{
+		.dpdk_name = "outbound_pci_stalled_wr",
+		.ctr_name = "outbound_pci_stalled_wr",
+	},
+	{
+		.dpdk_name = "outbound_pci_stalled_rd_events",
+		.ctr_name = "outbound_pci_stalled_rd_events",
+	},
+	{
+		.dpdk_name = "outbound_pci_stalled_wr_events",
+		.ctr_name = "outbound_pci_stalled_wr_events",
+	},
+	{
+		.dpdk_name = "dev_out_of_buffer",
+		.ctr_name = "dev_out_of_buffer",
+	},
 };
 
 static const unsigned int xstats_n = RTE_DIM(mlx5_counters_init);
-- 
2.25.1


