[PATCH v2] eal/unix: allow creating thread with real-time priority

Stephen Hemminger stephen at networkplumber.org
Wed Oct 25 23:33:18 CEST 2023


On Wed, 25 Oct 2023 19:54:06 +0200
Morten Brørup <mb at smartsharesystems.com> wrote:

> I agree with Thomas on this.
> 
> If you want the log message, please degrade it to INFO or DEBUG level. It is only relevant when chasing problems, not for normal production - and thus NOTICE is too high.

I don't want the message to be hidden.
If we get any bug reports, I want to be able to say "read the log, don't do that".

> Someone might build a kernel with options to keep non-dataplane threads off some dedicated CPU cores, so they can be used for guaranteed low-latency dataplane threads. We do. We don't use real-time priority, though.

This is really hard to do. Isolated CPUs are not isolated from interrupts and other sources which end up scheduling work as kernel threads. Plus there is the behavior where the kernel decides to turn a soft IRQ into a kernel thread, then starves
itself. Under starvation, disk corruption is likely if interrupts never get processed :-(

> For reference, we did some experiments (using this custom built kernel) with a dedicated thread doing nothing but a loop calling rte_rdtsc_precise() and registering the delta. Although the overwhelming majority is ca. 80 CPU cycles, there are some big outliers at ca. 9,000 CPU cycles. (Order of magnitude: ca. 45 of these big outliers per minute.) Apparently some kernel threads steal some cycles from this thread, regardless of our customizations. We haven't bothered analyzing and optimizing it further.
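
For anyone wanting to repeat that kind of measurement, here is a rough sketch of such a loop. It is not the code used in the experiment above: it assumes clock_gettime(CLOCK_MONOTONIC) as a portable stand-in for rte_rdtsc_precise(), so deltas are in nanoseconds rather than TSC cycles, and now_ns(), jitter_record(), jitter_run() and struct jitter_stats are hypothetical names made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic timestamp in nanoseconds.
 * Assumption: used here instead of rte_rdtsc_precise(), so the unit
 * is nanoseconds, not CPU cycles. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical summary of one measurement run. */
struct jitter_stats {
    uint64_t samples;    /* number of deltas recorded */
    uint64_t max_delta;  /* largest gap between two consecutive reads */
    uint64_t outliers;   /* deltas strictly above the threshold */
};

/* Account for one delta against the outlier threshold. */
static void jitter_record(struct jitter_stats *st, uint64_t delta,
                          uint64_t threshold)
{
    assert(st != NULL);
    st->samples++;
    if (delta > st->max_delta)
        st->max_delta = delta;
    if (delta > threshold)
        st->outliers++;
}

/* Read the clock back to back `iterations` times and record each gap. */
static struct jitter_stats jitter_run(uint64_t iterations, uint64_t threshold)
{
    struct jitter_stats st = {0, 0, 0};
    uint64_t prev = now_ns();

    for (uint64_t i = 0; i < iterations; i++) {
        uint64_t cur = now_ns();

        jitter_record(&st, cur - prev, threshold);
        prev = cur;
    }
    return st;
}
```

Calling jitter_run() from a thread pinned to the core under test, with a threshold well above the typical delta, gives an outlier count that can be compared before and after each isolation tweak.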

Was this on an isolated CPU?
Did you check that that CPU was excluded from the smp_affinity mask on all devices?
Did you enable the kernel feature to avoid clock ticks if the CPU is dedicated (nohz_full)?
Same thing for RCU: do its parameters need adjusting?
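
The smp_affinity check can be scripted. A minimal sketch, assuming the mask text is what /proc/irq/<N>/smp_affinity prints (hex digits, most significant first, with commas between 32-bit groups); affinity_mask_has_cpu() is a hypothetical helper written for illustration, not a kernel or DPDK API.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: test whether bit `cpu` is set in a hex affinity mask.
 * Returns 1 if set, 0 if clear, and -1 if the mask is empty or an invalid
 * character is hit before the relevant digit is reached. */
static int affinity_mask_has_cpu(const char *mask, unsigned int cpu)
{
    assert(mask != NULL);

    size_t len = strlen(mask);
    unsigned int want = cpu / 4;   /* hex digit holding this CPU's bit */
    unsigned int digit_idx = 0;    /* counted from the least significant digit */
    int seen_digit = 0;

    /* Drop the trailing newline or spaces left by reading the file. */
    while (len > 0 && isspace((unsigned char)mask[len - 1]))
        len--;

    /* Walk right to left: the last digit covers CPUs 0-3. */
    for (size_t i = len; i > 0; i--) {
        unsigned char c = (unsigned char)mask[i - 1];

        if (c == ',')
            continue;              /* group separator, carries no bits */
        if (!isxdigit(c))
            return -1;
        seen_digit = 1;
        if (digit_idx == want) {
            int v = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;

            return (v >> (cpu % 4)) & 1;
        }
        digit_idx++;
    }
    /* Mask is shorter than the requested CPU: the bit is implicitly clear. */
    return seen_digit ? 0 : -1;
}
```

Running this over every IRQ's mask for the dedicated core, and expecting 0 each time, is one way to confirm that no device interrupt is still routed there.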

Also, on many systems the BIOS can run hidden code via SMIs (System Management Interrupts), which will cause big outliers.

Lastly, never try to use CPU 0. The kernel uses CPU 0 as a catch-all in lots of places.

> I think our experiment supports the need to allow kernel threads to run, e.g. by calling sleep() or similar, when an EAL thread has real-time priority.


