[PATCH v2 0/5] net/mlx5: introduce Tx datapath tracing

Raslan Darawsheh rasland at nvidia.com
Tue Jun 20 14:00:05 CEST 2023


Hi,

> -----Original Message-----
> From: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
> Sent: Tuesday, June 13, 2023 7:59 PM
> To: dev at dpdk.org
> Subject: [PATCH v2 0/5] net/mlx5: introduce Tx datapath tracing
> 
> The mlx5 PMD supports scheduling packet sending at a specific moment of
> time, and for applications relying on this feature it would be extremely
> useful to have extra debug information: when and how packets were
> scheduled and when the actual sending was completed by the NIC hardware
> (this helps applications track internal delay issues).
> 
> Because the DPDK Tx datapath API does not provide for any feedback
> from the driver, and the feature appears to be mlx5 specific, it seems
> reasonable to engage the existing DPDK datapath tracing capability.
> 
> The work cycle is supposed to be:
>   - compile the application with tracing enabled
>   - run the application with EAL parameters configuring the tracing in
>     the mlx5 Tx datapath
>   - store the dump file with the gathered tracing information
>   - run the analyzing script (in Python) to combine related events (packet
>     firing and completion) and see the data in human-readable view
> 
> Below are detailed instructions on how to gather all the debug data,
> including the full timing information, with an mlx5 NIC.
> 
> 
> 1. Build DPDK application with enabled datapath tracing
> 
> The meson option should be specified:
>    -Denable_trace_fp=true
> 
> The c_args should be specified:
>    -DALLOW_EXPERIMENTAL_API
> 
> The DPDK configuration examples:
> 
>   meson configure --buildtype=debug -Denable_trace_fp=true
>         -Dc_args='-DRTE_LIBRTE_MLX5_DEBUG -DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build
> 
>   meson configure --buildtype=debug -Denable_trace_fp=true
>         -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build
> 
>   meson configure --buildtype=release -Denable_trace_fp=true
>         -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build
> 
>   meson configure --buildtype=release -Denable_trace_fp=true
>         -Dc_args='-DALLOW_EXPERIMENTAL_API' build
> 
> 
> 2. Configuring the NIC
> 
> If the send completion timings are important, the NIC should be configured
> to provide real-time timestamps: the REAL_TIME_CLOCK_ENABLE NV settings
> parameter should be set to TRUE, for example with the following command
> (followed by a FW/driver reset):
> 
>   sudo mlxconfig -d /dev/mst/mt4125_pciconf0 s REAL_TIME_CLOCK_ENABLE=1
> 
> 
> 3. Run DPDK application to gather the traces
> 
> EAL parameters controlling the trace capability at runtime:
> 
>   --trace=pmd.net.mlx5.tx - the regular expression selecting the tracepoints
>                             to enable. At least the tracepoints with names
>                             matching "pmd.net.mlx5.tx" must be enabled to
>                             gather all events needed to analyze the mlx5 Tx
>                             datapath and its timings. By default all
>                             tracepoints are disabled.
> 
>   --trace-dir=/var/log - trace storing directory
> 
>   --trace-bufsz=<val>B|<val>K|<val>M - optional, trace data buffer size
>                                        per thread. The default is 1MB.
> 
>   --trace-mode=overwrite|discard  - optional, selects trace data buffer mode.
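
[Illustrative note] The EAL options listed above can be assembled
programmatically, for example when launching the application from a test
harness. The following is only a minimal sketch: the helper name
build_eal_trace_args and its defaults are hypothetical and are not part of
this series; only the option names and values come from the text above.

```python
from typing import List, Optional


def build_eal_trace_args(pattern: str = "pmd.net.mlx5.tx",
                         trace_dir: str = "/var/log",
                         bufsz: Optional[str] = None,
                         mode: Optional[str] = None) -> List[str]:
    """Return the EAL tracing options as an argv-style list.

    Hypothetical helper: it only formats the options described in the
    cover letter and does not validate them against DPDK itself.
    """
    args = [f"--trace={pattern}", f"--trace-dir={trace_dir}"]
    if bufsz is not None:
        # Expected form is <val>B, <val>K or <val>M, e.g. "4M".
        if not (bufsz[:-1].isdigit() and bufsz[-1] in "BKM"):
            raise ValueError(f"unexpected buffer size: {bufsz!r}")
        args.append(f"--trace-bufsz={bufsz}")
    if mode is not None:
        if mode not in ("overwrite", "discard"):
            raise ValueError(f"unexpected trace mode: {mode!r}")
        args.append(f"--trace-mode={mode}")
    return args


if __name__ == "__main__":
    # Print the options so they can be appended to an application command.
    print(" ".join(build_eal_trace_args(bufsz="4M", mode="discard")))
```

The optional parameters are omitted from the result unless given, so the
EAL defaults described above stay in effect.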
> 
> 
> 4. Installing or Building Babeltrace2 Package
> 
> The gathered trace data can be analyzed with the provided Python script.
> To parse the trace data, the script uses the Babeltrace2 library.
> The package should be either installed or built from source code as
> shown below:
> 
>   git clone https://github.com/efficios/babeltrace.git
>   cd babeltrace
>   ./bootstrap
>   ./configure --help
>   ./configure --disable-api-doc --disable-man-pages
>               --disable-python-bindings-doc --enable-python-plugins
>               --enable-python-bindings
> 
> 5. Running the Analyzing Script
> 
> The analyzing script is located in the folder: ./drivers/net/mlx5/tools
> It requires Python 3.6 and the Babeltrace2 package, and takes the trace
> data file as its only parameter. For example:
> 
>    ./mlx5_trace.py /var/log/rte-2023-01-23-AM-11-52-39
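
[Illustrative note] The idea of combining related events (packet firing
and completion) can be sketched as below. This is not the code of
mlx5_trace.py: the Event tuple, the event names "fire" and "done", and the
correlation key are assumptions made for illustration, and the sketch
works on plain tuples instead of reading a trace through Babeltrace2.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Event(NamedTuple):
    name: str   # tracepoint name (hypothetical values in this sketch)
    ts_ns: int  # event timestamp in nanoseconds
    key: int    # hypothetical key correlating a send with its completion


def pair_events(events: Iterable[Event], fire_name: str,
                done_name: str) -> List[Tuple[int, int, int]]:
    """Match each firing event with the later completion sharing its key.

    Returns (key, fire timestamp, delay in ns) tuples in completion order.
    Completions without a recorded firing event are ignored.
    """
    pending: Dict[int, int] = {}
    paired: List[Tuple[int, int, int]] = []
    for ev in events:
        if ev.name == fire_name:
            pending[ev.key] = ev.ts_ns
        elif ev.name == done_name and ev.key in pending:
            start = pending.pop(ev.key)
            paired.append((ev.key, start, ev.ts_ns - start))
    return paired


if __name__ == "__main__":
    sample = [Event("fire", 100, 1), Event("fire", 150, 2),
              Event("done", 400, 1)]
    for key, start, delay in pair_events(sample, "fire", "done"):
        print(f"key={key} fired at {start} ns, completed after {delay} ns")
```

Firing events that never complete simply remain unmatched, which is what
one would expect when the trace buffer is truncated.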
> 
> 
> 6. Interpreting the Script Output Data
> 
> All the timings are given in nanoseconds.
> The list of Tx (and, in the future, Rx) bursts per port/queue is presented
> in the output. Each list element contains the list of built WQEs with
> specific opcodes, and each WQE contains the list of the encompassed
> packets to send.
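
[Illustrative note] The nesting described above (bursts per port/queue,
each holding WQEs, each WQE holding packets) could be modelled as in the
sketch below. All class and field names are hypothetical, and the opcode
string is only a placeholder; the actual script may organize its data
differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Packet:
    index: int   # position of the packet within the burst (assumed field)
    length: int  # packet length in bytes (assumed field)


@dataclass
class Wqe:
    opcode: str  # WQE opcode, kept as a plain label in this sketch
    packets: List[Packet] = field(default_factory=list)


@dataclass
class Burst:
    port: int
    queue: int
    start_ns: int  # burst start time in nanoseconds
    wqes: List[Wqe] = field(default_factory=list)

    def packet_count(self) -> int:
        """Total number of packets carried by all WQEs of this burst."""
        return sum(len(wqe.packets) for wqe in self.wqes)


if __name__ == "__main__":
    burst = Burst(port=0, queue=1, start_ns=1000)
    burst.wqes.append(Wqe("SEND", [Packet(0, 64), Packet(1, 128)]))
    burst.wqes.append(Wqe("SEND", [Packet(2, 256)]))
    print(f"port {burst.port} queue {burst.queue}: "
          f"{len(burst.wqes)} WQEs, {burst.packet_count()} packets")
```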
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
> 
> --
> v2: - comment addressed: "dump_trace" command is replaced with "save_trace"
>     - Windows build failure addressed, Windows does not support tracing
> 
> Viacheslav Ovsiienko (5):
>   app/testpmd: add trace save command
>   common/mlx5: introduce tracepoints for mlx5 drivers
>   net/mlx5: add Tx datapath tracing
>   net/mlx5: add comprehensive send completion trace
>   net/mlx5: add Tx datapath trace analyzing script
> 
>  app/test-pmd/cmdline.c               |  38 ++++
>  drivers/common/mlx5/meson.build      |   1 +
>  drivers/common/mlx5/mlx5_trace.c     |  25 +++
>  drivers/common/mlx5/mlx5_trace.h     |  72 +++++++
>  drivers/common/mlx5/version.map      |   8 +
>  drivers/net/mlx5/linux/mlx5_verbs.c  |   8 +-
>  drivers/net/mlx5/mlx5_devx.c         |   8 +-
>  drivers/net/mlx5/mlx5_rx.h           |  19 --
>  drivers/net/mlx5/mlx5_rxtx.h         |  19 ++
>  drivers/net/mlx5/mlx5_tx.c           |   9 +
>  drivers/net/mlx5/mlx5_tx.h           |  88 ++++++++-
>  drivers/net/mlx5/tools/mlx5_trace.py | 271 +++++++++++++++++++++++++++
>  12 files changed, 537 insertions(+), 29 deletions(-)
>  create mode 100644 drivers/common/mlx5/mlx5_trace.c
>  create mode 100644 drivers/common/mlx5/mlx5_trace.h
>  create mode 100755 drivers/net/mlx5/tools/mlx5_trace.py
> 
> --
> 2.18.1

Series applied to next-net-mlx,

Kindest regards
Raslan Darawsheh

