[dpdk-dev] [PATCH] eal: add new prefetch0_write variant

Pavan Nikhilesh Bhagavatula pbhagavatula at marvell.com
Mon Sep 14 12:39:02 CEST 2020


>> >This commit adds a new rte_prefetch0_write() variant, which suggests to
>> >the compiler to use a prefetch instruction with intention to write. As a
>> >compiler builtin, the compiler can choose based on compilation target
>> >what the best implementation for this instruction is.
>>
>> Why not have the other variants too, i.e. L2/L3/temporal store
>> prefetches?
>
>Hi Pavan,
>
Hi Harry,
(LTNS)

>Are there architectures that actually implement those? Usually for a WB
>mem store to complete, the data must be present in L1 cache (on x86 at
>least), and that's what the patch below with write0 achieves.

ARM64 does support all modes of store prefetch:
"
<type> is one of:
PLD Prefetch for load, encoded in the "Rt<4:3>" field as 0b00.
PLI Preload instructions, encoded in the "Rt<4:3>" field as 0b01.
PST Prefetch for store, encoded in the "Rt<4:3>" field as 0b10.
<target> is one of:
L1 Level 1 cache, encoded in the "Rt<2:1>" field as 0b00.
L2 Level 2 cache, encoded in the "Rt<2:1>" field as 0b01.
L3 Level 3 cache, encoded in the "Rt<2:1>" field as 0b10.
<policy> is one of:
KEEP Retained or temporal prefetch, allocated in the cache normally. Encoded in the "Rt<0>" field as 0.
STRM Streaming or non-temporal prefetch, for data that is used only once. Encoded in the "Rt<0>" field as 1.
For more information on these prefetch
"
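
To make the field layout quoted above concrete, here is a minimal sketch that packs <type>, <target> and <policy> into the 5-bit Rt value. The enum and function names (prf_type, prfop_encode, etc.) are made up for illustration and do not exist in DPDK or in any ARM header; only the bit positions come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. Values follow the quoted
 * encoding: type in Rt<4:3>, target in Rt<2:1>, policy in Rt<0>. */
enum prf_type   { PRF_PLD = 0, PRF_PLI = 1, PRF_PST = 2 };
enum prf_target { PRF_L1 = 0, PRF_L2 = 1, PRF_L3 = 2 };
enum prf_policy { PRF_KEEP = 0, PRF_STRM = 1 };

/* Pack the three fields into the 5-bit prefetch operation value. */
static inline uint8_t
prfop_encode(enum prf_type type, enum prf_target target,
	     enum prf_policy policy)
{
	/* Reject values outside the encodings listed above. */
	assert(type <= PRF_PST && target <= PRF_L3 && policy <= PRF_STRM);
	return (uint8_t)((type << 3) | (target << 1) | policy);
}
```

With this layout a store prefetch into L2 that keeps the line (PST, L2, KEEP) comes out as 0b10010, and the L1 equivalent as 0b10000.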

>
>I'm against adding all the variants "just in case", it leads to API bloat,
>and increases cognitive load on the programmer. My expectation is that in
>99% of usage the prefetch write instruction should target L1.
>

There is a use case when the cache mode is write-through and the application
is pipelining work across cores sharing the same L2 cluster.
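
Roughly what I have in mind, as a sketch only: the GCC/Clang __builtin_prefetch(addr, rw, locality) builtin takes rw = 1 for a write hint and a locality of 0..3, both compile-time constants. The function names below are hypothetical and not part of the patch, and which cache level each locality value ends up targeting is decided by the compiler and the target, so the L1/L2/L3 naming here is an assumption rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical wrappers, not from the patch. rw = 1 asks for a
 * prefetch with intent to write; the locality-to-cache-level mapping
 * is an assumption and is ultimately chosen by the compiler/target. */
static inline void sketch_prefetch_l1_write(const void *p)
{
	__builtin_prefetch(p, 1, 3);
}

static inline void sketch_prefetch_l2_write(const void *p)
{
	__builtin_prefetch(p, 1, 2);
}

static inline void sketch_prefetch_l3_write(const void *p)
{
	__builtin_prefetch(p, 1, 1);
}

/* Illustrative pipeline stage: hint the next slot for write at the
 * (assumed) L2 level while filling the current one. */
static inline void
stage_fill(unsigned int *slots, size_t n, unsigned int value)
{
	for (size_t i = 0; i < n; i++) {
		if (i + 1 < n)
			sketch_prefetch_l2_write(&slots[i + 1]);
		slots[i] = value + (unsigned int)i;
	}
}
```

The prefetch is only a hint, so the stores behave the same with or without it; any benefit would have to be measured on the target.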

>Cheers, -Harry

Regards,
Pavan.

>
>> >Signed-off-by: Harry van Haaren <harry.van.haaren at intel.com>
>> >
>> >---
>> >
>> >The integer constants passed to the builtin are not available as
>> >a #define value, and doing #defines just for this write variant
>> >does not seem a nice solution to me... particularly for those using
>> >IDEs where any #define value is auto-hinted for code-completion.
>> >
>> >---
>> > lib/librte_eal/include/generic/rte_prefetch.h | 16 ++++++++++++++++
>> > 1 file changed, 16 insertions(+)
>
><snip patch contents>


