[dpdk-stable] Re: [PATCH v1 1/3] test/ring: reduce iteration numbers to make test duration shorter

Feifei Wang Feifei.Wang2 at arm.com
Sun Jan 24 10:52:48 CET 2021


Hi, Konstantin

> -----Original Message-----
> From: Ananyev, Konstantin <konstantin.ananyev at intel.com>
> Sent: January 22, 2021 21:16
> To: Feifei Wang <Feifei.Wang2 at arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; Olivier Matz <olivier.matz at 6wind.com>;
> Gavin Hu <Gavin.Hu at arm.com>
> Cc: dev at dpdk.org; nd <nd at arm.com>; stable at dpdk.org
> Subject: RE: [PATCH v1 1/3] test/ring: reduce iteration numbers to make test
> duration shorter
> 
> 
> > When testing ring performance with multiple lcores mapped to the same
> > physical core, e.g. --lcores '(0-3)@10', "enqueue_dequeue_bulk_helper"
> > takes a very long time to finish. This is because the iteration count
> > is too high and enqueue and dequeue are extremely inefficient with this
> > kind of core mapping. The following test results show this:
> >
> > x86-Intel(R) Xeon(R) Gold 6240:
> > $sudo ./app/test/dpdk-test --lcores '(0-1)@25'
> > Testing using two hyperthreads(bulk (size: 8):)
> > iter_shift:         3     5     7     9     11     13    *15     17     19     21      23
> > run time:           7s    7s    7s    8s    9s     16s    47s    170s   660s   >0.5h   >1h
> > legacy APIs: SP/SC: 37    11    6     40525 40525  40209  40367  40407  40541  NoData  NoData
> > legacy APIs: MP/MC: 56    14    11    50657 40526  40526  40526  40625  40585  NoData  NoData
> >
> > aarch64-n1sdp:
> > $sudo ./app/test/dpdk-test --lcores '(0-1)@1'
> > Testing using two hyperthreads(bulk (size: 8):)
> > iter_shift:         3     5     7     9     11     13    *15     17     19     21      23
> > run time:           8s    8s    8s    9s    9s     14s    34s    111s   418s   25min   >1h
> > legacy APIs: SP/SC: 0.4   0.2   0.1   488   488    488    488    488    489    489     NoData
> > legacy APIs: MP/MC: 0.4   0.3   0.2   488   488    488    488    490    489    489     NoData
> >
> > The run time grows with the number of iterations. Currently
> > (iter_shift = 23), the test takes more than 1 hour to finish. To fix
> > this, "iter_shift" should be reduced while keeping enough iterations
> > for the test data to remain stable. To find such a value, we also
> > tested with the "-l" EAL argument:
> >
> > x86-Intel(R) Xeon(R) Gold 6240:
> > $sudo ./app/test/dpdk-test -l 25-26
> > Testing using two NUMA nodes(bulk (size: 8):)
> > iter_shift:         3     5     7     9     11     13    *15     17     19     21      23
> > run time:           6s    6s    6s    6s    6s     6s     6s     7s     8s     11s     27s
> > legacy APIs: SP/SC: 47    20    13    22    54     83     91     73     81     75      95
> > legacy APIs: MP/MC: 44    18    18    240   245    270    250    249    252    250     253
> >
> > aarch64-n1sdp:
> > $sudo ./app/test/dpdk-test -l 1-2
> > Testing using two physical cores(bulk (size: 8):)
> > iter_shift:         3     5     7     9     11     13    *15     17     19     21      23
> > run time:           8s    8s    8s    8s    8s     8s     8s     9s     9s     11s     23s
> > legacy APIs: SP/SC: 0.7   0.4   1.2   1.8   2.0    2.0    2.0    2.0    2.0    2.0     2.0
> > legacy APIs: MP/MC: 0.3   0.4   1.3   1.9   2.9    2.9    2.9    2.9    2.9    2.9     2.9
> >
> > According to the above test data, when "iter_shift" is set to "15", the
> > test run time is reduced to less than 1 minute and the test results
> > remain stable on both x86 and aarch64 servers.
> >
> > Fixes: 1fa5d0099efc ("test/ring: add custom element size performance tests")
> > Cc: honnappa.nagarahalli at arm.com
> > Cc: stable at dpdk.org
> >
> > Signed-off-by: Feifei Wang <feifei.wang2 at arm.com>
> > Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> > ---
> >  app/test/test_ring_perf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
> > index e63e25a86..fd82e2041 100644
> > --- a/app/test/test_ring_perf.c
> > +++ b/app/test/test_ring_perf.c
> > @@ -178,7 +178,7 @@ enqueue_dequeue_bulk_helper(const unsigned int flag, const int esize,
> >  	struct thread_params *p)
> >  {
> >  	int ret;
> > -	const unsigned int iter_shift = 23;
> > +	const unsigned int iter_shift = 15;
> >  	const unsigned int iterations = 1 << iter_shift;
> >  	struct rte_ring *r = p->r;
> >  	unsigned int bsize = p->size;
> > --
> 
> I think it would be better to rework the test(s) to terminate after some
> timeout (30s or so), and report number of ops per timeout.
> Anyway, as a short term fix, I am ok with it.
> Acked-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
Ok, thanks very much.

Best Regards
Feifei
> 
> 
> > 2.17.1


