[dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Tue Jun 15 16:37:47 CEST 2021


<snip>

> 
> 15/06/2021 12:08, Ananyev, Konstantin:
> > > 15/06/2021 11:33, Ananyev, Konstantin:
> > > > > 14/06/2021 17:48, Jerin Jacob:
> > > > > > On Mon, Jun 14, 2021 at 8:29 PM Ananyev, Konstantin
> > > > > > <konstantin.ananyev at intel.com> wrote:
> > > > > > > I had only a quick look at your approach so far.
> > > > > > > But from what I can read, in MT environment your suggestion
> > > > > > > will require extra synchronization for each read-write access to
> > > > > > > such parray element (lock, rcu, ...).
> > > > > > > I think what Bruce suggests will be much lighter, easier to
> > > > > > > implement and less error prone.
> > > > > > > At least for rte_ethdevs[] and friends.
> > > > > >
> > > > > > +1
> > > > >
> > > > > Please could you have a deeper look and tell me why we need more
> > > > > locks?
> > > > > The element pointers don't change.
> > > > > Only the array pointer change at resize,
> > > >
> > > > Yes, array pointer changes at resize, and reader has to read that
> > > > value to access elements in the parray. Which means that we need
> > > > some sync between readers and updaters to avoid reader using stale
> > > > pointer (ref-counter, rcu, etc.).
> > >
> > > No
> > > The old array is still there, so we don't need sync.
> > >
> > > > I.E. updater can free old array pointer *only* when it can
> > > > guarantee that there are no readers that still use it.
> > >
> > > No
> > > Reading an element is OK because the pointer to the element is not
> > > changed.
> > > Getting the pointer to an element from the index is the only thing
> > > which is blocking the freeing of an array, and I see no reason why
> > > dereferencing an index would be longer than 2 consecutive resizes of
> > > the array.
> >
> > In general, your thread can be switched off the cpu at any moment.
> > And you don't know for sure when it will be scheduled back.
> >
> > >
> > > > > but the old one is still usable until the next resize.
> > > >
> > > > Ok, but what is the guarantee that reader would *always* finish before the
> > > > next resize?
> > > > As an example of such race condition:
> > > >
> > > > /* global one */
> > > > 	struct rte_parray pa;
> > > >
> > > > /* thread #1, tries to read elem from the array */
> > > >  	....
> > > > 	int **x = pa->array;
> > >
> > > We should not save the array pointer.
> > > Each index must be dereferenced with the macro getting the current
> > > array pointer.
> > > So the interrupt is during dereference of a single index.
> >
> > You still need to read your pa->array somewhere (let say into a register).
> > Straight after that your thread can be interrupted.
> > Then when it is scheduled back to the CPU that value (in a register) might be
> > a stale one.
> >
> > >
> > > > /* thread # 1 get suspended for a while  at that point */
> > > >
> > > > /* meanwhile thread #2 does: */
> > > > 	....
> > > > 	/* causes first resize(), x still valid, points to pa->old_array */
> > > > 	rte_parray_alloc(&pa, ...);
> > > > 	.....
> > > > 	/* causes second resize(), x now points to freed memory */
> > > > 	rte_parray_alloc(&pa, ...);
> > > > 	...
> > >
> > > 2 resizes is a very long time, it is at minimum 33 allocations!
> > >
> > > > /* at that point thread #1 resumes: */
> > > >
> > > > 	/* contents of x[0] are undefined, 'p' could point anywhere,
> > > > 	     might cause segfault or silent memory corruption */
> > > > 	int *p = x[0];
> > > >
> > > >
> > > > Yes probability of such situation is quite small.
> > > > But it is still possible.
> > >
> > > In device probing, I don't see how it is realistically possible:
> > > 33 device allocations during 1 device index being dereferenced.
> >
> > Yeh, it would work fine 1M times, but sometimes will crash.
> 
> Sometimes a thread will be interrupted during 33 device allocations?
> 
> > Which will make it even harder to reproduce, debug and fix.
> > I think that when introducing a new generic library into DPDK, we
> > should avoid making such assumptions.
> 
> I intend to make it internal-only (I should have named it eal_parray).
> 
> > > I agree it is tricky, but that's the whole point of finding tricks
> > > to keep fast code.
> >
> > It is not tricky, it is buggy 😊
> > You are introducing a race condition into the new core generic library by
> > design, and trying to convince people that it is *OK*.
> 
> Yes, because I am convinced myself.
> 
> > Sorry, but NACK from me until that issue is addressed.
I agree that a synchronization mechanism is required to indicate when it is safe to free the old array. The readers must acknowledge that they no longer reference the old array before it can be freed. We cannot rely on an "enough time has passed" argument.

As others have mentioned, I think the key is the use case. Not all use cases require a dynamically resized array. For many of them, an array allocated dynamically at init time would be enough.
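
To illustrate the init-time option, here is a minimal sketch in plain C. The names (struct dev_table, dev_table_init, dev_table_get) are hypothetical and are not part of the patch or of any DPDK API. The point is only that the array pointer is written once, so readers never observe a resize.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table: sized once at init, never resized afterwards. */
struct dev_table {
	void **elems;      /* fixed array of element pointers */
	uint32_t capacity; /* chosen at init, e.g. from a startup option */
};

/* Allocate the array once. The array pointer never changes after this,
 * so no reader/updater synchronization is needed for the array itself. */
static int dev_table_init(struct dev_table *t, uint32_t capacity)
{
	assert(t != NULL && capacity > 0);
	t->elems = calloc(capacity, sizeof(*t->elems));
	if (t->elems == NULL)
		return -1;
	t->capacity = capacity;
	return 0;
}

/* Return the element stored at idx, or NULL if idx is out of range
 * or the slot is still empty. */
static void *dev_table_get(const struct dev_table *t, uint32_t idx)
{
	assert(t != NULL);
	return idx < t->capacity ? t->elems[idx] : NULL;
}
```

The cost is that the capacity has to be known (or configured) before the first allocation, which is the trade-off against a resizable array.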

If a dynamically resized array is required, using RCU (or an equivalent mechanism) is necessary. I do not think these use cases should be characterized by the size of the memory/array in question (it might be a small chunk in a system with abundant memory, but a big chunk in a system with a small amount of memory). The current RCU library provides good options to either hide the complexities from the application or let the application handle them if it wants.
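
As a rough illustration of what "an ACK from the readers" could look like, below is a simplified quiescent-state sketch using C11 atomics. It is not the DPDK RCU library and does not use its API. The struct layout, the function names and the fixed MAX_READERS are assumptions made for the example, and every reader slot is assumed to belong to a registered, running reader.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_READERS 4 /* hypothetical: fixed set of registered readers */

struct parray {
	_Atomic(void **) array;                 /* current array, swapped on resize */
	uint32_t size;                          /* touched only by the updater */
	atomic_uint_fast64_t epoch;             /* incremented on every resize */
	atomic_uint_fast64_t seen[MAX_READERS]; /* last epoch acked per reader */
};

static int parray_init(struct parray *pa, uint32_t size)
{
	void **a;

	assert(size > 0);
	a = calloc(size, sizeof(*a));
	if (a == NULL)
		return -1;
	atomic_init(&pa->array, a);
	pa->size = size;
	atomic_init(&pa->epoch, 0);
	for (unsigned int i = 0; i < MAX_READERS; i++)
		atomic_init(&pa->seen[i], 0);
	return 0;
}

/* Reader: reload the array pointer on every access and never keep it
 * across a quiescent state. idx is assumed to be a valid index. */
static void *parray_get(struct parray *pa, uint32_t idx)
{
	void **a = atomic_load(&pa->array);

	return a[idx];
}

/* Reader: acknowledge that no array pointer is currently held. */
static void parray_quiescent(struct parray *pa, unsigned int reader_id)
{
	assert(reader_id < MAX_READERS);
	atomic_store(&pa->seen[reader_id], atomic_load(&pa->epoch));
}

/* Updater: publish a larger array. Returns the old array, which must
 * NOT be freed until parray_safe_to_free() is true for *token. */
static void **parray_resize(struct parray *pa, uint32_t new_size,
			    uint64_t *token)
{
	void **old = atomic_load(&pa->array);
	void **new_array;

	assert(new_size > pa->size);
	new_array = calloc(new_size, sizeof(*new_array));
	if (new_array == NULL)
		return NULL;
	memcpy(new_array, old, pa->size * sizeof(*old));
	atomic_store(&pa->array, new_array);
	pa->size = new_size;
	*token = atomic_fetch_add(&pa->epoch, 1) + 1;
	return old;
}

/* Updater: true once every reader has acked an epoch >= token. */
static bool parray_safe_to_free(struct parray *pa, uint64_t token)
{
	for (unsigned int i = 0; i < MAX_READERS; i++)
		if (atomic_load(&pa->seen[i]) < token)
			return false;
	return true;
}
```

With this scheme a reader that is descheduled while holding a stale array pointer simply delays the free. The old array is released only after that reader runs again and reports a quiescent state, however many allocations happen in between.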

> 
> It is not an issue, but a design.
> If you think that a thread can be interrupted during 33 device allocations then
> we should find another implementation, but I am quite sure it will be slower.
> 


