[dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays

Jerin Jacob jerinjacobk at gmail.com
Wed Jun 16 14:00:32 CEST 2021


On Wed, Jun 16, 2021 at 4:57 PM Morten Brørup <mb at smartsharesystems.com> wrote:
>
> > From: Jerin Jacob [mailto:jerinjacobk at gmail.com]
> > Sent: Wednesday, 16 June 2021 11.42
> >
> > On Tue, Jun 15, 2021 at 12:18 PM Thomas Monjalon <thomas at monjalon.net>
> > wrote:
> > >
> > > 14/06/2021 17:48, Morten Brørup:
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > > > It would be much simpler to just increase RTE_MAX_ETHPORTS to something big enough to hold a sufficiently large array. And possibly add an rte_max_ethports variable to indicate the number of populated entries in the array, for use when iterating over the array.
> > > >
> > > > Can we come up with another example than RTE_MAX_ETHPORTS where this library provides a better benefit?
> > >
> > > What is big enough?
> > > Is 640KB enough for RAM? ;)
> >
> > If I understand correctly, the Linux process allocates the 640KB because
> > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] is currently a global
> > and therefore lives in BSS.
>
> Correct.
>
> > If we allocate this memory from the heap instead, i.e. with malloc(),
> > then in my understanding Linux will not allocate real backing pages
> > until someone writes to or reads from this memory.
>
> If the array is allocated from the heap, its members will be accessed through a pointer to the array, e.g. in rte_eth_rx/tx_burst(). This might affect performance, which is probably why the array is allocated the way it is.
>
> Although it might be worth investigating how much it actually affects the performance.

It should not. From the CPU and compiler point of view it is the same.
If you look at cryptodev, it uses the following:

static struct rte_cryptodev rte_crypto_devices[RTE_CRYPTO_MAX_DEVS];
struct rte_cryptodev *rte_cryptodevs = rte_crypto_devices;

It then accesses the devices through rte_cryptodevs[].

Also, this structure is not cache aligned. We probably need to fix that.
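
To make the pattern concrete, here is a minimal sketch of a static array that
is only ever reached through a pointer, with the element type cache aligned.
Everything prefixed with sketch_/SKETCH_ is hypothetical and not taken from
cryptodev or ethdev; the 64-byte cache line is an assumption, and real DPDK
code would more likely use its own alignment macro than C11 alignas.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DEVS   64   /* hypothetical stand-in for RTE_CRYPTO_MAX_DEVS */
#define SKETCH_CACHE_LINE 64   /* assumed cache line size in bytes */

/* Hypothetical, much smaller stand-in for struct rte_cryptodev. */
struct sketch_dev {
	/* Aligning the first member raises the alignment of the whole struct. */
	alignas(SKETCH_CACHE_LINE) void *fast_path_ops;
	uint16_t dev_id;
};

/* With the alignment above, the size is padded to a cache line multiple. */
static_assert(sizeof(struct sketch_dev) % SKETCH_CACHE_LINE == 0,
	      "each array element should start on its own cache line");

/* The storage itself stays static ... */
static struct sketch_dev sketch_devices[SKETCH_MAX_DEVS];
/* ... but all users go through this pointer. */
struct sketch_dev *sketch_devs = sketch_devices;

/* Look up a device slot through the pointer, not the array symbol. */
static inline struct sketch_dev *sketch_dev_get(uint16_t dev_id)
{
	return &sketch_devs[dev_id];
}
```

Since callers only see sketch_devs, switching the backing storage from the
static array to a heap allocation would not change the access path.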


> So we need to do something else if we want to conserve memory and still allow a large rte_eth_devices[] array.
>
> Looking at struct rte_eth_dev, we could reduce its size as follows:
>
> 1. Change the two callback arrays post_rx/pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT] to pointers to callback arrays, which are allocated from the heap.
> With the default RTE_MAX_QUEUES_PER_PORT of 1024, these two arrays are the sinners that make the struct rte_eth_dev use so much memory. This modification would save 16 KB (minus 16 bytes for the pointers to the two arrays) per port.
> Furthermore, these callback arrays would only need to be allocated if the application is compiled with callbacks enabled (#define RTE_ETHDEV_RXTX_CALLBACKS). And they would only need to be sized to the actual number of queues for the port.
>
> The disadvantage is that this would add another level of indirection, although only for applications compiled with callbacks enabled.

I think we don't need one more indirection if everything is allocated from the
heap, as memory is not wasted if it is never touched by the CPU.
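
A rough sketch of what I mean, assuming the callback arrays stay embedded in
the struct and only the top-level array moves to the heap. The names and the
SKETCH_MAX_PORTS value are hypothetical, and whether untouched pages really
stay unbacked depends on the allocator and the kernel, so this would need to
be measured rather than taken for granted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_PORTS  4096  /* hypothetical "big enough" port count */
#define SKETCH_MAX_QUEUES 1024  /* mirrors the default RTE_MAX_QUEUES_PER_PORT */

/* Hypothetical stand-in for struct rte_eth_dev; callback arrays still embedded. */
struct sketch_eth_dev {
	void *post_rx_cbs[SKETCH_MAX_QUEUES];
	void *pre_tx_cbs[SKETCH_MAX_QUEUES];
	uint16_t port_id;
};

/* Single pointer to the heap-allocated device array. */
static struct sketch_eth_dev *sketch_eth_devs;

/* Allocate the whole array once. Returns 0 on success, -1 on failure. */
static int sketch_eth_devs_init(void)
{
	/* Zeroed allocation; no entry is written here, only reserved. */
	sketch_eth_devs = calloc(SKETCH_MAX_PORTS, sizeof(*sketch_eth_devs));
	return sketch_eth_devs == NULL ? -1 : 0;
}

/* Same single level of indirection as a pointer to a static array. */
static struct sketch_eth_dev *sketch_eth_dev_get(uint16_t port_id)
{
	assert(port_id < SKETCH_MAX_PORTS);
	return &sketch_eth_devs[port_id];
}
```

Only the entries for ports that are actually probed would ever be written, so
the embedded callback arrays of unused ports are never touched.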

>
> 2. Remove reserved_64s[4] and reserved_ptrs[4]. This would save 64 bytes per port. Not much, but worth considering if we are changing the API/ABI anyway.
>
>
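
For reference, the numbers quoted above can be reproduced with a few sizeof
expressions. This is only a sketch of the arithmetic: the helper names are
hypothetical, it assumes one pointer per queue in each callback array, and it
assumes a 64-bit target, which is what the 16 KB, 16 byte and 64 byte figures
imply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_QUEUES 1024  /* default RTE_MAX_QUEUES_PER_PORT per the thread */

/* The figures in the thread only hold with 8-byte pointers. */
static_assert(sizeof(void *) == 8, "sketch assumes a 64-bit target");

/* Two embedded callback arrays, one pointer per queue each. */
static size_t sketch_embedded_cb_bytes(void)
{
	return 2 * SKETCH_MAX_QUEUES * sizeof(void *);
}

/* The same two arrays replaced by two pointers to heap memory. */
static size_t sketch_pointer_cb_bytes(void)
{
	return 2 * sizeof(void *);
}

/* reserved_64s[4] plus reserved_ptrs[4]. */
static size_t sketch_reserved_bytes(void)
{
	return 4 * sizeof(uint64_t) + 4 * sizeof(void *);
}
```

With 8-byte pointers this gives 16384 bytes for the embedded arrays, 16 bytes
for the two pointers and 64 bytes for the reserved fields.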

