[dpdk-dev] [PATCH 0/3] *** timer library enhancements ***

Carrillo, Erik G erik.g.carrillo at intel.com
Thu Aug 24 16:08:33 CEST 2017



> -----Original Message-----
> From: Wiles, Keith
> Sent: Wednesday, August 23, 2017 4:05 PM
> To: Carrillo, Erik G <erik.g.carrillo at intel.com>
> Cc: rsanford at akamai.com; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements ***
> 
> 
> > On Aug 23, 2017, at 2:28 PM, Carrillo, Erik G <erik.g.carrillo at intel.com>
> wrote:
> >
> >>
> >> -----Original Message-----
> >> From: Wiles, Keith
> >> Sent: Wednesday, August 23, 2017 11:50 AM
> >> To: Carrillo, Erik G <erik.g.carrillo at intel.com>
> >> Cc: rsanford at akamai.com; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements
> >> ***
> >>
> >>
> >>> On Aug 23, 2017, at 11:19 AM, Carrillo, Erik G
> >>> <erik.g.carrillo at intel.com>
> >> wrote:
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Wiles, Keith
> >>>> Sent: Wednesday, August 23, 2017 10:02 AM
> >>>> To: Carrillo, Erik G <erik.g.carrillo at intel.com>
> >>>> Cc: rsanford at akamai.com; dev at dpdk.org
> >>>> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements
> >>>> ***
> >>>>
> >>>>
> >>>>> On Aug 23, 2017, at 9:47 AM, Gabriel Carrillo
> >>>>> <erik.g.carrillo at intel.com>
> >>>> wrote:
> >>>>>
> >>>>> In the current implementation of the DPDK timer library, timers
> >>>>> can be created and set to be handled by a target lcore by adding
> >>>>> it to a skiplist that corresponds to that lcore.  However, if an
> >>>>> application enables multiple lcores, and each of these lcores
> >>>>> repeatedly attempts to install timers on the same target lcore,
> >>>>> overall application throughput will be reduced as all lcores
> >>>>> contend to acquire the lock guarding the single skiplist of pending
> >>>>> timers.
> >>>>>
> >>>>> This patchset addresses this scenario by adding an array of
> >>>>> skiplists to each lcore's priv_timer struct, such that when lcore
> >>>>> i installs a timer on lcore k, the timer will be added to the ith
> >>>>> skiplist for lcore k.  If lcore j installs a timer on lcore k
> >>>>> simultaneously, lcores i and j can both proceed since they will be
> >>>>> acquiring different locks for different lists.
> >>>>>
> >>>>> When lcore k processes its pending timers, it will traverse each
> >>>>> skiplist in its array and acquire a skiplist's lock while a run
> >>>>> list is broken out; meanwhile, all other lists can continue to be
> >>>>> modified.
> >>>>> Then, all run lists for lcore k are collected and traversed
> >>>>> together so timers are executed in their global order.
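A minimal sketch of the per-installer layout described above, for readers following the thread. It is not the patch itself: sorted singly linked lists stand in for the skiplists, a C11 atomic_flag stands in for rte_spinlock_t, and every name (sk_priv, sk_install, sk_manage, SK_MAX_LCORE) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_LCORE 4u  /* hypothetical core count, kept small for the sketch */

struct sk_timer {
    uint64_t expire;        /* expiry time in arbitrary ticks */
    struct sk_timer *next;
};

/* One pending list per installing lcore, each guarded by its own lock. */
struct sk_pending {
    atomic_flag lock;       /* stand-in for rte_spinlock_t */
    struct sk_timer *head;  /* sorted by ascending expire */
};

/* sk_priv[k][i] holds the timers that lcore i installed on target lcore k. */
static struct sk_pending sk_priv[SK_MAX_LCORE][SK_MAX_LCORE];

static void sk_lock(struct sk_pending *pl)
{
    while (atomic_flag_test_and_set_explicit(&pl->lock, memory_order_acquire))
        ; /* spin */
}

static void sk_unlock(struct sk_pending *pl)
{
    atomic_flag_clear_explicit(&pl->lock, memory_order_release);
}

static void sk_init(void)
{
    for (unsigned k = 0; k < SK_MAX_LCORE; k++)
        for (unsigned i = 0; i < SK_MAX_LCORE; i++) {
            atomic_flag_clear(&sk_priv[k][i].lock);
            sk_priv[k][i].head = NULL;
        }
}

/* Installer lcore i adds a timer for target lcore k; only list [k][i] is locked. */
static void sk_install(unsigned installer, unsigned target, struct sk_timer *t)
{
    assert(installer < SK_MAX_LCORE && target < SK_MAX_LCORE);
    struct sk_pending *pl = &sk_priv[target][installer];
    sk_lock(pl);
    struct sk_timer **pp = &pl->head;
    while (*pp != NULL && (*pp)->expire <= t->expire)
        pp = &(*pp)->next;
    t->next = *pp;
    *pp = t;
    sk_unlock(pl);
}

/* The target lcore breaks out one run list per pending list, holding only that
 * list's lock, then walks all run lists together in global expiry order.
 * Expiry times are recorded in fired[]; max must be large enough to hold them. */
static size_t sk_manage(unsigned lcore, uint64_t now, uint64_t *fired, size_t max)
{
    struct sk_timer *run[SK_MAX_LCORE];
    size_t n = 0;

    for (unsigned i = 0; i < SK_MAX_LCORE; i++) {
        struct sk_pending *pl = &sk_priv[lcore][i];
        struct sk_timer *last = NULL;
        sk_lock(pl);
        run[i] = pl->head;
        while (pl->head != NULL && pl->head->expire <= now) {
            last = pl->head;
            pl->head = pl->head->next;
        }
        if (last != NULL)
            last->next = NULL;  /* detach the expired prefix */
        else
            run[i] = NULL;      /* nothing expired on this list */
        sk_unlock(pl);
    }
    for (;;) {
        int best = -1;
        for (unsigned i = 0; i < SK_MAX_LCORE; i++)
            if (run[i] != NULL &&
                (best < 0 || run[i]->expire < run[best]->expire))
                best = (int)i;
        if (best < 0)
            break;
        assert(n < max);
        fired[n++] = run[best]->expire;
        run[best] = run[best]->next;
    }
    return n;
}
```

Two installers calling sk_install for the same target touch different locks, so neither waits on the other. The final loop in sk_manage is also where the extra per-lcore cost of scanning empty lists shows up.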
> >>>>
> >>>> What is the performance and/or latency added to the timeout now?
> >>>>
> >>>> I worry about the case when just about all of the cores are
> >>>> enabled, which could be as high as 128 or more now.
> >>>
> >>> There is a case in the timer_perf_autotest that runs
> >>> rte_timer_manage with zero timers that can give a sense of the
> >>> added latency.  When run with one lcore, it completes in around 25
> >>> cycles.  When run with 43 lcores (the highest I have access to at
> >>> the moment), rte_timer_manage completes in around 155 cycles.  So it
> >>> looks like each added lcore adds around 3 cycles of overhead for
> >>> checking empty lists in my testing.
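The "around 3 cycles" figure comes from dividing the extra cycles by the number of added lcores. A small sketch of that arithmetic follows. The function names are hypothetical, and the extrapolation to other core counts assumes the cost keeps growing linearly, which was not measured.

```c
#include <assert.h>

/* Average extra cycles per added lcore, from two measurements of
 * rte_timer_manage with zero timers pending. */
static double cycles_per_added_lcore(double cycles_one, double cycles_n,
                                     unsigned n_lcores)
{
    assert(n_lcores > 1);
    return (cycles_n - cycles_one) / (double)(n_lcores - 1);
}

/* Linear extrapolation to another lcore count (an assumption, not a
 * measurement). */
static double estimate_cycles(double cycles_one, double per_lcore,
                              unsigned n_lcores)
{
    assert(n_lcores >= 1);
    return cycles_one + per_lcore * (double)(n_lcores - 1);
}
```

With the figures quoted above, cycles_per_added_lcore(25, 155, 43) is (155 - 25) / 42, or about 3.1 cycles. Feeding that into estimate_cycles for 128 lcores gives roughly 418 cycles for an empty rte_timer_manage call, if the linear assumption holds.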
> >>
> >> Does this mean we have only 25 cycles on the current design or is the
> >> 25 cycles for the new design?
> >>
> >
> > Both - when run with one lcore, the new design becomes equivalent to the
> > original one.  I tested the current design to confirm.
> 
> Good thanks
> 
> >
> >> If for the new design, then what is the old design cost compared to
> >> the new cost?
> >>
> >> I also think we need the call to a timer function in the calculation,
> >> just to make sure we have at least one timer in the list and we
> >> account for any shortcuts in the code for no timers active.
> >>
> >
> > Looking at the numbers for non-empty lists in timer_perf_autotest, the
> > overhead appears to fall away.  Here are some representative runs for
> > timer_perf_autotest:
> >
> > 43 lcores enabled, installing 1M timers on an lcore and processing them
> > with current design:
> >
> > <...snipped...>
> > Appending 1000000 timers
> > Time for 1000000 timers: 424066294 (193ms), Time per timer: 424 (0us)
> > Time for 1000000 callbacks: 73124504 (33ms), Time per callback: 73 (0us)
> > Resetting 1000000 timers
> > Time for 1000000 timers: 1406756396 (641ms), Time per timer: 1406 (1us)
> > <...snipped...>
> >
> > 43 lcores enabled, installing 1M timers on an lcore and processing them
> > with proposed design:
> >
> > <...snipped...>
> > Appending 1000000 timers
> > Time for 1000000 timers: 382912762 (174ms), Time per timer: 382 (0us)
> > Time for 1000000 callbacks: 79194418 (36ms), Time per callback: 79 (0us)
> > Resetting 1000000 timers
> > Time for 1000000 timers: 1427189116 (650ms), Time per timer: 1427 (1us)
> > <...snipped...>
> 
> It looks OK then. My main concern was the timers in Pktgen, and someone
> reporting an increase in jitter or latency, or a drop in performance. I guess
> I will just have to wait and see.
> 
> >
> > The above are not averages, so the numbers don't really indicate which is
> > faster, but they show that the overhead of the proposed design should not
> > be appreciable.
> >
> >>>
> >>>>
> >>>> One option is to have the lcore j that wants to install a timer on
> >>>> lcore k pass a message via a ring to lcore k to add that timer.
> >>>> We could even add that logic into setting a timer on a different
> >>>> lcore than the caller in the current API. The ring would be
> >>>> multi-producer and single-consumer; we still have the lock.
> >>>> What am I missing here?
> >>>>
> >>>
> >>> I did try this approach: initially I had a multi-producer
> >>> single-consumer ring that would hold requests to add or delete a
> >>> timer from lcore k's skiplist, but it didn't really give an
> >>> appreciable increase in my test application throughput.
> >>> In profiling this solution, the hotspot had moved from acquiring the
> >>> skiplist's spinlock to the rte_atomic32_cmpset that the
> >>> multiple-producer ring code uses to manipulate the head pointer.
> >>>
> >>> Then, I tried multiple single-producer single-consumer rings per
> >>> target lcore.  This removed the ring hotspot, but the performance
> >>> didn't increase as much as with the proposed solution. These
> >>> solutions also add overhead to rte_timer_manage, as it would have to
> >>> process the rings and then process the skiplists.
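To show where the hotspot moved in the multi-producer experiment, here is a minimal sketch of a multi-producer single-consumer ring whose producers reserve a slot with a compare-and-swap on the head index. It is not the rte_ring code: C11 atomics with default sequentially consistent ordering stand in for rte_atomic32_cmpset, and every name (mpsc_ring, mpsc_enqueue, mpsc_dequeue, RING_SIZE) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u
#define RING_MASK (RING_SIZE - 1u)
static_assert((RING_SIZE & RING_MASK) == 0, "RING_SIZE must be a power of two");

struct mpsc_ring {
    _Atomic uint32_t prod_head;  /* next slot a producer may reserve */
    _Atomic uint32_t prod_tail;  /* slots below this are published */
    _Atomic uint32_t cons_tail;  /* slots below this are consumed */
    void *slots[RING_SIZE];
};

static void mpsc_init(struct mpsc_ring *r)
{
    atomic_init(&r->prod_head, 0);
    atomic_init(&r->prod_tail, 0);
    atomic_init(&r->cons_tail, 0);
}

/* Any lcore may call this.  All producers contend on the CAS below. */
static bool mpsc_enqueue(struct mpsc_ring *r, void *msg)
{
    uint32_t head = atomic_load(&r->prod_head);
    uint32_t next;

    do {
        uint32_t free_slots = RING_SIZE + atomic_load(&r->cons_tail) - head;
        if (free_slots == 0)
            return false;        /* ring is full */
        next = head + 1;
        /* On failure, head is reloaded with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(&r->prod_head, &head, next));

    r->slots[head & RING_MASK] = msg;

    /* Publish in reservation order: wait for earlier producers to finish. */
    while (atomic_load(&r->prod_tail) != head)
        ; /* spin */
    atomic_store(&r->prod_tail, next);
    return true;
}

/* Only the owning (target) lcore may call this. */
static bool mpsc_dequeue(struct mpsc_ring *r, void **msg)
{
    uint32_t tail = atomic_load(&r->cons_tail);

    if (tail == atomic_load(&r->prod_tail))
        return false;            /* ring is empty */
    *msg = r->slots[tail & RING_MASK];
    atomic_store(&r->cons_tail, tail + 1);
    return true;
}
```

Every producer retries the compare-and-swap on prod_head until it wins, so under heavy contention the cache line holding that index bounces between cores much as the skiplist spinlock did. Giving each producer its own single-producer ring removes that shared index, at the cost of the consumer having to poll several rings.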
> >>>
> >>> One other thing to note is that a solution that uses such messages
> >>> changes the use models for the timer.  One interesting example is:
> >>> - lcore i enqueues a message to install a timer on lcore k
> >>> - lcore k runs rte_timer_manage, processes its messages and adds the
> >>>   timer to its list
> >>> - lcore i then enqueues a message to stop the same timer, now owned
> >>>   by lcore k
> >>> - lcore k does not run rte_timer_manage again
> >>> - lcore i wants to free the timer but it might not be safe
> >>
> >> This case seems like a mistake to me, as lcore k should continue to
> >> call rte_timer_manage() to process any new timers from other lcores,
> >> not just in the case where the list becomes empty and lcore k does
> >> not add a timer to its list.
> >>
> >>>
> >>> Even though lcore i has successfully enqueued the request to stop
> >>> the timer (and delete it from lcore k's pending list), it hasn't
> >>> actually been deleted from the list yet, so freeing it could corrupt
> >>> the list.  This case exists in the existing timer stress tests.
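The stop-then-free sequence above can be walked through with a small single-threaded model. This is only an illustration of the ordering problem, not code from either design: the request queue, the linked flag, and all names (post_request, manage_requests, safe_to_free) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REQ_QUEUE_LEN 16u

enum req_op { REQ_ADD, REQ_STOP };

struct msg_timer {
    struct msg_timer *next;
    bool linked;               /* true while the timer sits on lcore k's list */
};

struct request {
    enum req_op op;
    struct msg_timer *timer;
};

struct target_lcore {
    struct request queue[REQ_QUEUE_LEN];
    unsigned head, tail;       /* head: next to process, tail: next free */
    struct msg_timer *pending; /* lcore k's pending list */
};

/* Called by the requesting lcore: this only queues the request. */
static bool post_request(struct target_lcore *k, enum req_op op,
                         struct msg_timer *t)
{
    if (k->tail - k->head == REQ_QUEUE_LEN)
        return false;
    k->queue[k->tail % REQ_QUEUE_LEN] = (struct request){ op, t };
    k->tail++;
    return true;
}

/* Called by lcore k: the list only changes here. */
static void manage_requests(struct target_lcore *k)
{
    while (k->head != k->tail) {
        struct request r = k->queue[k->head % REQ_QUEUE_LEN];
        k->head++;
        if (r.op == REQ_ADD && !r.timer->linked) {
            r.timer->next = k->pending;
            k->pending = r.timer;
            r.timer->linked = true;
        } else if (r.op == REQ_STOP && r.timer->linked) {
            struct msg_timer **pp = &k->pending;
            while (*pp != r.timer) {
                assert(*pp != NULL);
                pp = &(*pp)->next;
            }
            *pp = r.timer->next;
            r.timer->next = NULL;
            r.timer->linked = false;
        }
    }
}

/* Freeing a timer that is still linked would leave a dangling list node. */
static bool safe_to_free(const struct msg_timer *t)
{
    return !t->linked;
}
```

After lcore i posts REQ_STOP, post_request has already returned success, yet safe_to_free stays false until lcore k next calls manage_requests. The requester therefore needs some completion signal from lcore k before it can release the timer's memory, which the existing API does not require.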
> >>>
> >>> Another interesting scenario is:
> >>> - lcore i resets a timer to install it on lcore k
> >>> - lcore j resets the same timer to install it on lcore k
> >>> - then, lcore k runs rte_timer_manage
> >>
> >> This one also seems like a mistake; more than one lcore setting the
> >> same timer seems like a problem and should not be done. An lcore
> >> should own a timer and no other lcore should be able to change that
> >> timer. If multiple lcores need a timer then they should not share the
> >> same timer structure.
> >>
> >
> > Both of the above cases exist in the timer library stress tests, so a solution
> > would presumably need to address them or it would be less flexible.  The
> > original design passed these tests, as does the proposed one.
> 
> I get this twitch when one lcore is adding timers to another lcore, as I come
> from a realtime OS background, but I guess if no one else cares or finds a
> problem I will have to live with it. Having a test for something does not make
> it a good test or a good reason to perpetuate a design issue. We can make
> any test work, but whether it is right is the real question, and we will just
> have to wait and see, I guess.
> 
> >
> >>>
> >>> Lcore j's message obviates lcore i's message, and it would be wasted
> >>> work for lcore k to process it, so we should mark it to be skipped
> >>> over.  Handling all the edge cases was more complex than the solution
> >>> proposed.
> >>
> >> Hmmm, to me it seems simple here as long as the lcores follow the
> >> same rules and sharing a timer structure is very risky and avoidable IMO.
> >>
> >> Once you have lcores adding timers to another lcore then all accesses
> >> to that skip list must be serialized or you get unpredictable
> >> results. This should also fix most of the edge cases you are talking about.
> >>
> >> Also it seems to me the case with an lcore adding timers to another
> >> lcore timer list is a specific use case and could be handled by a
> >> different set of APIs for that specific use case. Then we do not need
> >> to change the current design and all of the overhead is placed on the
> >> new APIs/design. IMO we are turning the current timer design into a
> >> global timer design, as it really is a per-lcore design today, and I
> >> believe that is a mistake.
> >>
> >
> > Well, the original API explicitly supports installing a timer to be executed on
> > a different lcore, and there are no API changes in the patchset.  Also, the
> > proposed design keeps the per-lcore design intact; it only takes what used
> > to be one large skiplist that held timers for all installing lcores, and separates
> > it into N skiplists that correspond 1:1 to an installing lcore.  When an lcore
> > processes timers on its lists it will still only be managing timers it owns, and no
> > others.
> 
> 
> Having an API to explicitly support some feature is not a reason to keep
> something, but I think you have reduced my twitching some :-) so I will let it
> go.
> 
> Thanks for the information.

You're welcome, and thank you for the feedback.

Regards,
Gabriel

> 
> >
> >
> >>>
> >>>>>
> >>>>> Gabriel Carrillo (3):
> >>>>> timer: add per-installer pending lists for each lcore
> >>>>> timer: handle timers installed from non-EAL threads
> >>>>> doc: update timer lib docs
> >>>>>
> >>>>> doc/guides/prog_guide/timer_lib.rst |  19 ++-
> >>>>> lib/librte_timer/rte_timer.c        | 329 +++++++++++++++++++++++-------------
> >>>>> lib/librte_timer/rte_timer.h        |   9 +-
> >>>>> 3 files changed, 231 insertions(+), 126 deletions(-)
> >>>>>
> >>>>> --
> >>>>> 2.6.4
> >>>>>
> >>>>
> >>>> Regards,
> >>>> Keith
> >>
> >> Regards,
> >> Keith
> 
> Regards,
> Keith


