[dpdk-dev] [PATCH 0/3] timer: fix rte_timer_manage and improve unit tests

Sanford, Robert rsanford at akamai.com
Mon Jul 27 17:46:11 CEST 2015


Hi Thomas,

> Please, could you re-send this series after having added the
> description of each patch in the commit messages?

Yes, I will move the paragraphs that begin with "Patch n" from patch 0 to
their respective patches.

> It seems you fix 2 bugs in the first patch. It may be clearer to split it
> in 2 patches with separate explanations.

No, I respectfully disagree. The only bug that we address in patch 1 is
that the slaves become out of sync with the master.
The section that begins with "Description of rte_timer_manage() race
condition bug" is a general description to give background info for both
patches 2 and 3. I would like to leave that section in patch 0, if it's OK
with you.

--
Thanks,
Robert



>2015-07-23 18:42, rsanford2 at gmail.com:
>> From: Robert Sanford <rsanford at akamai.com>
>> 
>> This patchset fixes a bug in timer stress test 2, adds a new stress test
>> to expose a race condition bug in API rte_timer_manage(), and then fixes
>> the rte_timer_manage() bug.
>> --
>> 
>> Patch 1, app/test timer stress test 2: Sometimes this test fails and
>> seg-faults because the slave lcores get out of phase with the master.
>> The master uses a single int, 'ready', to synchronize multiple slave
>> lcores through multiple phases of the test.
>> 
>> To resolve, we construct simple synchronization primitives that use one
>> atomic-int state variable per slave. The master tells the slaves when to
>> start, and then waits for all of them to finish. Each slave waits for
>> the master to tell it to start, and then tells the master when it has
>> finished.
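
>> A minimal sketch of the idea (illustrative only; the struct and
>> function names are hypothetical, and C11 atomics stand in for the
>> actual test code):
>> 
>> #include <stdatomic.h>
>> 
>> enum slave_phase { SLAVE_WAIT, SLAVE_RUN, SLAVE_DONE };
>> 
>> struct slave_state {
>> 	atomic_int phase;   /* one state variable per slave lcore */
>> };
>> 
>> /* Master: tell every slave to start the next test phase. */
>> static void master_start_slaves(struct slave_state *s, int n)
>> {
>> 	for (int i = 0; i < n; i++)
>> 		atomic_store(&s[i].phase, SLAVE_RUN);
>> }
>> 
>> /* Master: wait until every slave has reported completion. */
>> static void master_wait_all_done(struct slave_state *s, int n)
>> {
>> 	for (int i = 0; i < n; i++)
>> 		while (atomic_load(&s[i].phase) != SLAVE_DONE)
>> 			;   /* spin */
>> }
>> 
>> /* Slave: block until told to start, run, then report done. */
>> static void slave_run_phase(struct slave_state *s, void (*work)(void))
>> {
>> 	while (atomic_load(&s->phase) != SLAVE_RUN)
>> 		;   /* spin */
>> 	work();
>> 	atomic_store(&s->phase, SLAVE_DONE);
>> }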
>> --
>> 
>> Description of rte_timer_manage() race condition bug: Through code
>> inspection, we noticed a potential problem in rte_timer_manage() that
>> leads to corruption of per-lcore pending-lists (implemented as
>> skip-lists). The race condition occurs when rte_timer_manage() expires
>> multiple timers on lcore A, while lcore B simultaneously invokes
>> rte_timer_reset() for one of the expiring timers (other than the first
>> one).
>> 
>> Lcore A splits its pending-list, creating a local list of expired timers
>> linked through their sl_next[0] pointers, and sets the first expired
>> timer to the RUNNING state, all during one list-lock round trip. Lcore A
>> then unlocks the list-lock to run the first callback, and that is when
>> lcore B can misinterpret the subsequent expired timers' true states.
>> Lcore B sees an expired timer still in the PENDING state, locks A's
>> list-lock, and reinserts the timer into A's pending-list. The same
>> sl_next[0] pointer now has to link the timer into both lists at once!
>> 
>> One solution is to remove timers from the pending-list and set them to
>> the RUNNING state in one atomic step, i.e., we should perform these two
>> actions within one ownership of the list-lock.
>> --
>> 
>> Patch 2, new timer-manage race-condition test: We wrote a test to
>> confirm our suspicion that we could crash rte_timer_manage() (RTM) under
>> the right circumstances. We repeatedly set several timers to expire at
>> roughly the same time on the master core. The master lcore just delays
>> and runs RTM about ten times per second. The slave lcores all watch the
>> first timer (timer-0) to see when RTM is running on the master, i.e.,
>> timer-0's state is not PENDING. At this point, each slave attempts to
>> reset a subset of the timers to a later expiration time. The goal is
>> for the slaves to move most of the timers to a different place in the
>> master's pending-list while the master is still traversing the same
>> next-pointers and running callback functions. This eventually results in
>> the master traversing a corrupted linked-list. In our observations, it
>> results in an infinite loop.
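
>> In outline, each slave's part of the test looks roughly like this (a
>> sketch: rte_timer_reset() and the PENDING check reflect the real API,
>> but tims[], NUM_TIMERS, test_done, and the loop bookkeeping are
>> hypothetical):
>> 
>> static void slave_loop(unsigned lcore_idx)
>> {
>> 	unsigned i;
>> 
>> 	while (!test_done) {
>> 		/* Wait for RTM to start expiring timers on the master,
>> 		 * i.e., for timer-0 to leave the PENDING state. */
>> 		while (tims[0].status.state == RTE_TIMER_PENDING)
>> 			;
>> 
>> 		/* Race with the master: move our subset of the timers
>> 		 * to a later expiration time. */
>> 		for (i = lcore_idx; i < NUM_TIMERS; i += num_slaves)
>> 			rte_timer_reset(&tims[i], rte_get_timer_hz(),
>> 					SINGLE, master_lcore,
>> 					timer_cb, NULL);
>> 	}
>> }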
>> --
>> 
>> Patch 3, eliminate race condition in rte_timer_manage(): After splitting
>> the pending-list at the current time point, we try to set all expired
>> timers to the RUNNING state, and put back into the pending-list any
>> timers that we fail to set to the RUNNING state, all during one
>> list-lock round trip. It is then safe to run the callback functions
>> for all expired timers that remain on our local list, without the lock.
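>> 
>> In pseudo-C, the reworked critical section looks something like this
>> (a sketch of the approach, not the exact patch; timer_split_expired(),
>> run_list_append(), and pending_list_put_back() are hypothetical
>> helper names):
>> 
>> rte_spinlock_lock(&priv_timer[lcore_id].list_lock);
>> 
>> /* Split off every timer that has expired by cur_time. */
>> expired = timer_split_expired(&priv_timer[lcore_id], cur_time);
>> 
>> /* Still under the lock, claim each expired timer by moving it
>>  * PENDING -> RUNNING.  A timer we fail to claim (e.g., another
>>  * lcore is resetting it) goes back into the pending-list rather
>>  * than onto our local run list. */
>> for (tim = expired; tim != NULL; tim = next) {
>> 	next = tim->sl_next[0];
>> 	if (timer_set_running_state(tim) == 0)
>> 		run_list_append(&run_first, tim);
>> 	else
>> 		pending_list_put_back(&priv_timer[lcore_id], tim);
>> }
>> 
>> rte_spinlock_unlock(&priv_timer[lcore_id].list_lock);
>> 
>> /* Now run the callbacks on run_first without holding the lock. */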
>
>Please, could you re-send this series after having added the description of
>each patch in the commit messages?
>It seems you fix 2 bugs in the first patch. It may be clearer to split it
>in 2 patches with separate explanations.
>
>Thanks


