Bug 946 - async flow rules affect pmd power management
Summary: async flow rules affect pmd power management
Status: UNCONFIRMED
Alias: None
Product: DPDK
Classification: Unclassified
Component: ethdev
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assignee: dev
Reported: 2022-03-02 13:27 CET by David Hunt
Modified: 2022-03-03 13:46 CET

Description David Hunt 2022-03-02 13:27:15 CET
PMD power management never enters power-saving mode when using RTM transactions.

While testing the power_pmd functionality with l3fwd-power on DPDK 22.03-rc2, I noticed that the power saved in this DPDK version was significantly reduced.
I tried multiple BIOS/OS/kernel combinations, and then tried DPDK 21.11, where the problem went away.

I then did a git bisect, which narrowed down the problem to commit 197e820c6685993ad75387de79707c81b5e1fc10 - "ethdev: bring in async queue-based flow rules operations"

The patch causes the RTM transaction in rte_power_monitor_multi() (lib/eal/x86/rte_power_intrinsics.c) to always fail at rte_xbegin(), so we never reach rte_power_pause() and therefore never save any power.
If we revert just this patch, rte_xbegin() succeeds, rte_power_pause() is called, and we save power.

So this patch is now causing rte_power_monitor_multi() to fail. 
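
To make the failure mode concrete, here is a minimal, self-contained sketch of the control flow described above. It is a simulation, not the DPDK source: `sim_cpu`, `sim_xbegin()`, `sim_power_pause()` and `sim_monitor_multi()` are hypothetical stand-ins, and the real function additionally ends the transaction and handles timeouts. The point is only that when the transaction never starts, the function returns before the power-saving pause.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; NOT the DPDK API. */
#define SIM_XBEGIN_STARTED (~0u)   /* status meaning "transaction started" */

struct sim_cpu {
    bool xbegin_aborts;   /* true models the build where the transaction always fails */
    unsigned pauses;      /* how many times the power-saving pause was reached */
};

/* Models starting a transaction: either it starts, or it aborts at once. */
static unsigned sim_xbegin(const struct sim_cpu *cpu)
{
    return cpu->xbegin_aborts ? 0u : SIM_XBEGIN_STARTED;
}

/* Models the power-saving pause; here it only counts invocations. */
static void sim_power_pause(struct sim_cpu *cpu)
{
    cpu->pauses++;
}

/*
 * Simplified multi-address wait: returns 1 if the pause was reached,
 * 0 if the function returned without sleeping.
 */
static int sim_monitor_multi(struct sim_cpu *cpu,
                             const unsigned long *watched, unsigned num)
{
    assert(watched != NULL && num > 0);

    /* Abort path: the caller gets control back immediately, no power saved. */
    if (sim_xbegin(cpu) != SIM_XBEGIN_STARTED)
        return 0;

    /* Wake condition in this sketch: any watched value is non-zero. */
    for (unsigned i = 0; i < num; i++)
        if (watched[i] != 0)
            return 0;   /* work is already pending, do not sleep */

    sim_power_pause(cpu);   /* only reachable when the transaction started */
    return 1;
}
```

With `xbegin_aborts` set, every call returns 0 and `pauses` stays at zero, which matches the observed symptom: the polling loop keeps spinning and no power is saved.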

I'm not sure what the solution is, but it may be possible to re-work the async flow rules patch, or it may be possible to re-work the pmd_power_management code to avoid the impact of the flow rules patch. Investigation is ongoing.
Comment 1 Thomas Monjalon 2022-03-02 14:03:50 CET
This patch is mostly adding new functions to rte_flow.
Could you please elaborate on which change in this patch
causes rte_xbegin to fail?
Comment 2 David Hunt 2022-03-02 17:40:15 CET
It's looking like it's a compiler issue. We can reproduce the issue using gcc9 with static libraries, but not with shared libraries. Also, gcc11 does not exhibit the issue.
Also, if we build with a subset of NIC drivers, the issue disappears.
So it's probably something to do with code alignment or offsets around RTM transactions in gcc9 static builds.

IMO the patches are fine, just a problem with gcc9. 

Building with a different set of drivers may fix the issue, but a better way is just to use gcc11 (we're on Ubuntu where both are supported).
Comment 3 Thomas Monjalon 2022-03-03 09:41:55 CET
Is there any known bug regarding RTM and GCC 9?
Comment 4 David Hunt 2022-03-03 13:46:54 CET
I cannot find any known bug in this area. We are going to contact our gcc colleagues for more information. 

I think the best path forward for this release is to proceed with the code as is, and add an erratum for the RTM portion of the power library, along with mitigation suggestions.

If RTM is disabled or not available, the power library will use the regular pause instruction, which still saves power, but not quite as much as RTM with TPAUSE.
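
As a rough illustration of that fallback, the sketch below probes CPUID leaf 7 for the RTM bit (EBX bit 11) and the WAITPKG bit (ECX bit 5, which covers TPAUSE) and picks a wait strategy. This is an assumption-laden sketch rather than how the power library decides: `wait_caps`, `probe_wait_caps()` and `pick_wait_strategy()` are hypothetical names, the two-way choice is a simplification, and a CPUID bit alone does not guarantee that transactions succeed at run time.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */
#endif

/* Hypothetical capability record for this sketch. */
struct wait_caps {
    bool rtm;       /* CPUID.(EAX=7,ECX=0):EBX bit 11 */
    bool waitpkg;   /* CPUID.(EAX=7,ECX=0):ECX bit 5 (UMONITOR/UMWAIT/TPAUSE) */
};

enum wait_strategy {
    WAIT_PLAIN_PAUSE,   /* regular pause instruction */
    WAIT_RTM_TPAUSE     /* RTM transaction combined with TPAUSE */
};

/* Reads the feature bits; reports "unsupported" on non-x86 builds. */
static struct wait_caps probe_wait_caps(void)
{
    struct wait_caps caps = { false, false };
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        caps.rtm = ((ebx >> 11) & 1u) != 0;
        caps.waitpkg = ((ecx >> 5) & 1u) != 0;
    }
#endif
    return caps;
}

/* Simplified decision: use RTM + TPAUSE only when both are advertised. */
static enum wait_strategy pick_wait_strategy(struct wait_caps caps)
{
    if (caps.rtm && caps.waitpkg)
        return WAIT_RTM_TPAUSE;
    return WAIT_PLAIN_PAUSE;
}
```

Calling `pick_wait_strategy(probe_wait_caps())` on the test machine is a quick way to check which of the two paths the hardware could take before comparing power figures.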

Also, there are a couple of workarounds: reduce the number of drivers compiled into the application, or use gcc-11.
