Question about loop unrolling in rte_ring datastructure.

Aditya Ambadipudi Aditya.Ambadipudi at arm.com
Mon Nov 13 19:14:53 CET 2023


Hello all.

My name is Aditya Ambadipudi. I am not the sharpest tool in the shed.

I was reading through the rte_ring data structure, and I have two questions about the optimizations made there.


  1.  Loop unrolling:
https://github.com/DPDK/dpdk/blob/main/lib/ring/rte_ring_elem_pvt.h#L28-L35
Why are we unrolling these loops manually? GCC generates SIMD instructions for these loops automatically, irrespective of whether or not we unroll them.

Unrolled loop: https://godbolt.org/z/n97noqYn7
Regular loop: https://godbolt.org/z/h6G9o9773

This is true of both x86 and ARM.
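
For readers who do not want to open the links, here is a minimal sketch of the two loop shapes I am comparing. It is not the DPDK code: the names copy_plain and copy_unrolled4 are made up for illustration, and I unroll by 4 only to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: plain element-by-element copy. */
static void copy_plain(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: the same copy, manually unrolled by 4. */
static void copy_unrolled4(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i;

    /* Main body: four elements per iteration. */
    for (i = 0; i < (n & ~(size_t)0x3); i += 4) {
        dst[i] = src[i];
        dst[i + 1] = src[i + 1];
        dst[i + 2] = src[i + 2];
        dst[i + 3] = src[i + 3];
    }

    /* Tail: copy the remaining 0 to 3 elements. */
    switch (n & 0x3) {
    case 3:
        dst[i] = src[i];
        i++;
        /* fallthrough */
    case 2:
        dst[i] = src[i];
        i++;
        /* fallthrough */
    case 1:
        dst[i] = src[i];
        break;
    default:
        break;
    }
}
```

Both functions produce the same result for any n; the only difference is how much of the unrolling is written by hand and how much is left to the compiler.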

  2.  Normalizing to few fixed types:

It looks like we separate the enqueue/dequeue operations into three functions, one for each element size: 32, 64, and 128 bits.

Again, I am not clear on why we are doing this. Both 128 and 64 are multiples of 32. Why can't we just normalize everything to 32?

I feel like this is related in some way to loop unrolling, but I am not able to figure it out on my own.
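
To show what I mean by normalizing to 32, here is a rough sketch. The function copy_elems_as_u32 is hypothetical and not an existing DPDK function. It takes the element size in bytes, and it assumes that size is a multiple of 4 and that both buffers can legally be accessed as uint32_t (suitably aligned and typed).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy n elements of esize bytes each by treating
 * every element as esize / 4 consecutive 32-bit words.
 * Assumptions: esize is a multiple of 4, and dst/src point to memory
 * that may be accessed as uint32_t.
 */
static void copy_elems_as_u32(void *dst, const void *src,
                              size_t n, size_t esize)
{
    uint32_t *d = dst;
    const uint32_t *s = src;
    /* A 64-bit element is 2 words, a 128-bit element is 4 words. */
    size_t nwords = n * (esize / sizeof(uint32_t));

    for (size_t i = 0; i < nwords; i++)
        d[i] = s[i];
}
```

With this shape, a copy of n 64-bit elements is just a copy of 2 * n 32-bit words, which is why I do not see what the dedicated 64-bit and 128-bit functions add.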

I am working on a patch that is closely related to this, and I would greatly appreciate any assistance anyone can provide.

Thank you,
Aditya Ambadipudi