[dpdk-dev] librte_power w/ intel_pstate cpufreq governor

Hunt, David david.hunt at intel.com
Mon Mar 5 12:25:56 CET 2018


Hi BL,


On 5/3/2018 10:48 AM, longtb5 at viettel.com.vn wrote:
> Hi Dave,
>
> Actually in my test lab which is a HP box running CentOS 7 on kernel version
> 3.10.0-693.5.2.el7.x86_64, the default cpufreq driver is pcc_cpufreq. So I guess
> disabling intel_pstate wouldn't help in my case.
>
> # cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
> pcc-cpufreq
>
> # cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
> conservative userspace powersave ondemand performance
>
> According to kernel doc, pcc_cpufreq also doesn't export scaling_available_frequencies
> in sysfs.
>
> From kernel doc:
> "scaling_available_frequencies is not created in /sys. No intermediate
> frequencies need to be listed because the BIOS will try to achieve any
> frequency, within limits, requested by the governor. A frequency does not have
> to be strictly associated with a P-state."
>
> The lack of scaling_available_frequencies makes power_acpi_cpufreq_init()
> complain, similar to the problem with intel_pstate in the other thread.
> I have tried (though with not much effort) to force the kernel
> to use acpi-cpufreq instead but without success.
>
> Luckily, as quoted above pcc_cpufreq supports setting of arbitrary frequency,
> so a simple workaround for now is to fake a scaling_available_frequencies file
> in another directory, then edit the code in librte_power to use that file instead.
>
> Regards,
> -BL
>
>> -----Original Message-----
>> From: david.hunt at intel.com [mailto:david.hunt at intel.com]
>> Sent: Monday, March 5, 2018 5:16 PM
>> To: longtb5 at viettel.com.vn; dev at dpdk.org
>> Subject: Re: [dpdk-dev] librte_power w/ intel_pstate cpufreq governor
>>
>> Hi BL,
>>
>> I have always used "intel_pstate=disable" in my kernel parameters at boot so
>> as to disable the intel_pstate driver, and force the kernel to use the
>> acpi-cpufreq driver:
>>
>> # cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
>> acpi-cpufreq
>>
>> This then gives me the following options for the governor:
>> ['conservative', 'ondemand', 'userspace', 'powersave', 'performance',
>> 'schedutil']
>>
>> Because DPDK threads typically poll, they appear as 100% busy to the
>> intel_pstate driver, so if you want to be able to scale core frequency down
>> (as in l3fwd-power), you need to use the acpi-cpufreq driver.
>>
>> I had a read through the docs just now, and this does not seem to be
>> mentioned, so I'll do up a patch to give some information on the correct
>> kernel parameters to use when using the power library.
>>
>> Regards,
>> Dave.
>>
>> On 2/3/2018 7:20 AM, longtb5 at viettel.com.vn wrote:
>>> Forgot to link the original thread.
>>>
>>> http://dpdk.org/ml/archives/dev/2016-January/030930.html
>>>
>>> -BL
>>>
>>>> -----Original Message-----
>>>> From: longtb5 at viettel.com.vn [mailto:longtb5 at viettel.com.vn]
>>>> Sent: Friday, March 2, 2018 2:19 PM
>>>> To: dev at dpdk.org
>>>> Cc: david.hunt at intel.com; mhall at mhcomputing.net;
>>>> helin.zhang at intel.com; longtb5 at viettel.com.vn
>>>> Subject: librte_power w/ intel_pstate cpufreq governor
>>>>
>>>> Hi everybody,
>>>>
>>>> I know this thread was from over 2 years ago but I ran into the same
>>>> problem with l3fwd-power today.
>>>>
>>>> Any updates on this?
>>>>
>>>> -BL

Good to hear you found a workaround.

So the issue really is "Getting the Power Library working with the
pcc-cpufreq kernel driver" :)

From wiki.archlinux.org:
pcc-cpufreq: This driver supports the Processor Clocking Control interface
by Hewlett-Packard and Microsoft Corporation, which is useful on some
ProLiant servers.

In the following doc:
https://www.kernel.org/doc/Documentation/cpu-freq/pcc-cpufreq.txt
it mentions - "When PCC mode is enabled, the platform will not expose
processor performance or throttle states (_PSS, _TSS and related ACPI
objects) to OSPM. Therefore, the native P-state driver (such as
acpi-cpufreq for Intel, powernow-k8 for AMD) will not load".
Is there a way to disable PCC mode in the BIOS on that server? That
wording seems to imply that PCC can be disabled (seeing that it can be
enabled).

If you can't disable PCC, I would suggest that a patch may be needed to
allow the power library to detect whether it's using the acpi-cpufreq or
the pcc-cpufreq driver, and obtain a list of cpu frequencies accordingly.
However, I don't have any HP servers available to me, so I'm currently
unable to research a method of getting a list of valid cpu frequencies on
a machine using the pcc-cpufreq driver.
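
For the detection part, something along these lines might be a starting
point. It is only a shell sketch of the logic, not power library code:
cpufreq_probe and the SYSFS_CPU override are names made up for
illustration, and the only sysfs files it relies on are the two already
shown in this thread (scaling_driver and scaling_available_frequencies).

```shell
#!/bin/sh
# Sketch: report which cpufreq driver a core uses and whether that driver
# exports a frequency table. SYSFS_CPU is a hypothetical override so the
# function can be pointed at a fake tree when testing.
SYSFS_CPU="${SYSFS_CPU:-/sys/devices/system/cpu}"

# cpufreq_probe CORE_ID
# Prints "<driver> table" or "<driver> no-table".
cpufreq_probe() {
    dir="$SYSFS_CPU/cpu$1/cpufreq"
    if [ ! -r "$dir/scaling_driver" ]; then
        # No cpufreq support visible for this core.
        echo "unknown no-table"
        return 1
    fi
    driver=$(cat "$dir/scaling_driver")
    if [ -r "$dir/scaling_available_frequencies" ]; then
        echo "$driver table"
    else
        echo "$driver no-table"
    fi
}

# Example: probe core 0 (ignore failure on machines without cpufreq).
cpufreq_probe 0 || true
```

On the HP box described above I would expect this to print
"pcc-cpufreq no-table", which is the case the library would need to handle
differently.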

If you come up with a snippet of code for listing available frequencies 
on that server, let me know and we can look at adding that into the 
power library. :)
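
As a rough idea of what such a snippet could look like, here is an
untested-on-HP shell sketch that builds a frequency list from
cpuinfo_min_freq and cpuinfo_max_freq, since the kernel doc says the BIOS
will try to achieve any frequency within limits. The gen_freqs helper and
the 100000 kHz step are assumptions for illustration only, and the list is
printed highest first on the assumption that this matches the order
acpi-cpufreq uses.

```shell
#!/bin/sh
# gen_freqs MIN_KHZ MAX_KHZ STEP_KHZ
# Prints frequencies from MAX down to MIN, separated by spaces.
# MIN is always the last entry, even if STEP does not divide the range.
gen_freqs() {
    min=$1; max=$2; step=$3
    # A non-positive step would loop forever, so reject it.
    [ "$step" -gt 0 ] || return 1
    f=$max
    out=""
    while [ "$f" -gt "$min" ]; do
        out="$out$f "
        f=$((f - step))
    done
    echo "$out$min"
}

# Example: synthesize a list for core 0 if the limits are readable.
dir=/sys/devices/system/cpu/cpu0/cpufreq
if [ -r "$dir/cpuinfo_min_freq" ] && [ -r "$dir/cpuinfo_max_freq" ]; then
    # 100000 kHz step is an arbitrary choice for this sketch.
    gen_freqs "$(cat "$dir/cpuinfo_min_freq")" \
              "$(cat "$dir/cpuinfo_max_freq")" 100000
fi
```

Whether the platform actually honours each of those intermediate values
would still need checking on real hardware.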

Regards,
Dave.





