Bug 1367
Summary: net/mlx5 Tx stuck if mbuf has too many segments
Product: DPDK
Component: ethdev
Version: 23.11
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Target Milestone: ---
Reporter: Andrew Rybchenko (andrew.rybchenko)
Assignee: dev
CC: viacheslavo
Description
Andrew Rybchenko
2024-01-18 08:28:37 CET
The mlx5 PMD neither uses nor implements the tx_pkt_prepare() API callback. On a packet whose segment count exceeds the HW capabilities (as with any problematic packet), the PMD stops burst processing and increments the Tx queue "oerrors" counter.

Normally the "oerrors" counter is supposed to stay at zero; otherwise the application is violating some PMD sending rule (for example, exceeding the number of segments). This is not a runtime error but a design error, so adding handling code to tx_burst would impact performance without adding value. The application should behave as follows (see the sketch at the end of this report):
- do tx_burst
- if not all packets were sent, check oerrors
- if oerrors is non-zero / has changed, report a critical error

Then debug, find the root cause, and update the design.

Thanks for the quick feedback. Is it documented somewhere else? Does testpmd behave this way? IMHO it is a bit strange that oerrors is incremented for a packet not accepted for Tx. It is one more gray area in DPDK.

>> Is it documented somewhere else?

I did a quick search in the docs. It is not specified explicitly, just some of the logic is there: https://doc.dpdk.org/guides/howto/debug_troubleshoot.html?highlight=oerrors

>> Does testpmd behave this way?

No, testpmd does not. Maybe we should extend its code. One more reason not to simply skip the bad packet: we want to report to the application which packet caused the error (the first unsent packet); this has been found useful for debugging user cases.

What is not good: getting oerrors is done via an API call and it takes some time, so it does not seem efficient to check oerrors on every iteration. So, you're sure that it is OK. Definitely up to you. When I have some time I'll fill in the expectations for the driver. Is 40 a driver or HW limitation? If HW, which NICs?

>> Is 40 a driver or HW limitation? If HW, which NICs?
The absolute limit on segments per WQE (the hardware descriptor of the ConnectX NIC series) is 63 (the size field is 6 bits wide); 61 of them can be data segments, so 61 is the theoretical limit for the number of mbufs in a chain.
But there is also some data inlining into the WQE (a feature to save PCIe bandwidth), so the PMD reports a reduced limit according to the actual inline settings.
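
For reference, here is a minimal sketch of the tx_burst/oerrors flow recommended above, assuming a generic DPDK application. The helper name send_burst_checked and the prev_oerrors delta-tracking parameter are hypothetical, not part of any DPDK API:

```c
#include <inttypes.h>
#include <rte_ethdev.h>
#include <rte_log.h>
#include <rte_mbuf.h>

/*
 * Hypothetical helper: send a burst and, if not every packet was
 * accepted, read the port statistics and treat a grown "oerrors"
 * counter as a fatal application error rather than a retry condition.
 *
 * Returns the number of packets actually sent, or -1 if the PMD
 * rejected a packet (pkts[nb_sent] is then the first unsent packet,
 * which per the discussion above is the one that caused the error).
 */
static int
send_burst_checked(uint16_t port_id, uint16_t queue_id,
                   struct rte_mbuf **pkts, uint16_t nb_pkts,
                   uint64_t *prev_oerrors)
{
    struct rte_eth_stats stats;
    uint16_t nb_sent;

    nb_sent = rte_eth_tx_burst(port_id, queue_id, pkts, nb_pkts);
    if (nb_sent == nb_pkts)
        return nb_sent;

    /* Short send: either the queue is simply full (retry later) or
     * the PMD hit a malformed packet and stopped the burst. */
    if (rte_eth_stats_get(port_id, &stats) != 0)
        return nb_sent;

    if (stats.oerrors != *prev_oerrors) {
        *prev_oerrors = stats.oerrors;
        RTE_LOG(ERR, USER1,
                "port %u: oerrors=%" PRIu64 ", first unsent mbuf has %u segment(s)\n",
                port_id, stats.oerrors,
                (unsigned int)pkts[nb_sent]->nb_segs);
        return -1;
    }

    return nb_sent; /* queue full, remaining packets can be retried */
}
```

The statistics read is kept on the slow path (only when the burst comes back short) because, as noted above, querying oerrors on every iteration is comparatively expensive.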
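
And a sketch of a pre-flight check against the advertised segment limit, assuming the PMD reports it via dev_info.tx_desc_lim.nb_seg_max (for mlx5 this value depends on the actual inline settings, as explained above); the helper name check_tx_seg_limit is hypothetical:

```c
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

/*
 * Hypothetical pre-flight check: verify that no mbuf chain in the
 * burst exceeds the per-packet segment limit advertised by the PMD.
 * Returns the index of the first offending packet, nb_pkts if all
 * packets fit, or -1 if the device info could not be read.
 */
static int
check_tx_seg_limit(uint16_t port_id, struct rte_mbuf **pkts, uint16_t nb_pkts)
{
    struct rte_eth_dev_info dev_info;
    uint16_t seg_max;
    uint16_t i;

    if (rte_eth_dev_info_get(port_id, &dev_info) != 0)
        return -1;

    /* Some PMDs may leave nb_seg_max at a "no limit" default value;
     * in that case there is nothing to check here. */
    seg_max = dev_info.tx_desc_lim.nb_seg_max;
    if (seg_max == 0 || seg_max == UINT16_MAX)
        return nb_pkts;

    for (i = 0; i < nb_pkts; i++) {
        if (pkts[i]->nb_segs > seg_max)
            return i; /* first packet with too many segments */
    }
    return nb_pkts;
}
```

Packets that exceed the limit would have to be linearized (e.g. with rte_pktmbuf_linearize()) or dropped by the application before calling rte_eth_tx_burst().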