[dpdk-stable] [PATCH v1 1/3] test/ring: reduce iteration numbers to make test duration shorter

Ananyev, Konstantin konstantin.ananyev at intel.com
Fri Jan 22 14:15:38 CET 2021


> When testing ring performance with multiple lcores mapped to the same
> physical core, e.g. --lcores '(0-3)@10', "enqueue_dequeue_bulk_helper"
> takes a very long time to finish. This is because the iteration count is
> too high and enqueue/dequeue is extremely inefficient with this kind of
> core mapping. The following test results show this:
> 
> x86-Intel(R) Xeon(R) Gold 6240:
> $sudo ./app/test/dpdk-test --lcores '(0-1)@25'
> Testing using two hyperthreads(bulk (size: 8):)
> iter_shift:         3     5     7     9     11     13    *15     17     19     21      23
> run time:           7s    7s    7s    8s    9s     16s    47s    170s   660s   >0.5h   >1h
> legacy APIs: SP/SC: 37    11    6     40525 40525  40209  40367  40407  40541  NoData  NoData
> legacy APIs: MP/MC: 56    14    11    50657 40526  40526  40526  40625  40585  NoData  NoData
> 
> aarch64-n1sdp:
> $sudo ./app/test/dpdk-test --lcores '(0-1)@1'
> Testing using two hyperthreads(bulk (size: 8):)
> iter_shift:         3     5     7     9     11     13    *15     17     19     21      23
> run time:           8s    8s    8s    9s    9s     14s    34s    111s   418s   25min   >1h
> legacy APIs: SP/SC: 0.4   0.2   0.1   488   488    488    488    488    489    489     NoData
> legacy APIs: MP/MC: 0.4   0.3   0.2   488   488    488    488    490    489    489     NoData
> 
> As the number of iterations increases, so does the run time. Currently
> (iter_shift = 23), the test takes more than 1 hour to finish. To fix this,
> "iter_shift" should be reduced while keeping enough iterations for the
> results to stay stable. To find a suitable value, we also tested with the
> "-l" EAL argument:
> 
> x86-Intel(R) Xeon(R) Gold 6240:
> $sudo ./app/test/dpdk-test -l 25-26
> Testing using two NUMA nodes(bulk (size: 8):)
> iter_shift:         3     5     7     9     11     13    *15     17     19     21      23
> run time:           6s    6s    6s    6s    6s     6s     6s     7s     8s     11s     27s
> legacy APIs: SP/SC: 47    20    13    22    54     83     91     73     81     75      95
> legacy APIs: MP/MC: 44    18    18    240   245    270    250    249    252    250     253
> 
> aarch64-n1sdp:
> $sudo ./app/test/dpdk-test -l 1-2
> Testing using two physical cores(bulk (size: 8):)
> iter_shift:         3     5     7     9     11     13    *15     17     19     21      23
> run time:           8s    8s    8s    8s    8s     8s     8s     9s     9s     11s     23s
> legacy APIs: SP/SC: 0.7   0.4   1.2   1.8   2.0    2.0    2.0    2.0    2.0    2.0     2.0
> legacy APIs: MP/MC: 0.3   0.4   1.3   1.9   2.9    2.9    2.9    2.9    2.9    2.9     2.9
> 
> According to the above test data, when "iter_shift" is set to 15, the test
> run time drops to less than 1 minute and the results remain stable on both
> x86 and aarch64 servers.
> 
> Fixes: 1fa5d0099efc ("test/ring: add custom element size performance tests")
> Cc: honnappa.nagarahalli at arm.com
> Cc: stable at dpdk.org
> 
> Signed-off-by: Feifei Wang <feifei.wang2 at arm.com>
> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> ---
>  app/test/test_ring_perf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
> index e63e25a86..fd82e2041 100644
> --- a/app/test/test_ring_perf.c
> +++ b/app/test/test_ring_perf.c
> @@ -178,7 +178,7 @@ enqueue_dequeue_bulk_helper(const unsigned int flag, const int esize,
>  	struct thread_params *p)
>  {
>  	int ret;
> -	const unsigned int iter_shift = 23;
> +	const unsigned int iter_shift = 15;
>  	const unsigned int iterations = 1 << iter_shift;
>  	struct rte_ring *r = p->r;
>  	unsigned int bsize = p->size;
> --
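
For reference, the helper computes its loop count as 1 << iter_shift, so the
change from 23 to 15 cuts the iterations from 8388608 to 32768, a factor of
256. A minimal sketch of that arithmetic follows; iterations_for_shift() is a
hypothetical name used only for illustration and is not part of
test_ring_perf.c.

```c
/* Hypothetical helper mirroring "iterations = 1 << iter_shift" in the patch. */
static unsigned int iterations_for_shift(unsigned int iter_shift)
{
	/* Unsigned literal avoids shifting into the sign bit of an int. */
	return 1u << iter_shift;
}

/* Ratio between the old and new iteration counts (both powers of two). */
static unsigned int iteration_reduction(unsigned int old_shift,
					unsigned int new_shift)
{
	return iterations_for_shift(old_shift) / iterations_for_shift(new_shift);
}
```

With old_shift = 23 and new_shift = 15 the reduction is 256. The run times
quoted above do not shrink by the same factor, since each run also includes
time outside this loop.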

I think it would be better to rework the test(s) to terminate after
some timeout (30s or so) and report the number of ops completed within it.
Anyway, as a short-term fix, I am OK with it.
Acked-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
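
A rough sketch of what such a time-bounded loop could look like, for
illustration only. It uses plain POSIX clock_gettime() so it builds without
DPDK; inside the test app one would more likely use rte_get_timer_cycles()
and rte_get_timer_hz(). The names bench_op_t, run_for_timeout() and
count_op() are hypothetical and do not exist in test_ring_perf.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical callback: one unit of work, e.g. one bulk enqueue + dequeue. */
typedef void (*bench_op_t)(void *arg);

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Call op() in batches until at least timeout_ns has elapsed.
 * The clock is read once per batch so timing overhead stays small.
 * Returns the number of ops done and stores the measured time in *elapsed_ns.
 */
static uint64_t run_for_timeout(bench_op_t op, void *arg, uint64_t timeout_ns,
				unsigned int batch, uint64_t *elapsed_ns)
{
	const uint64_t start = now_ns();
	uint64_t ops = 0;
	uint64_t now;

	do {
		for (unsigned int i = 0; i < batch; i++)
			op(arg);
		ops += batch;
		now = now_ns();
	} while (now - start < timeout_ns);

	*elapsed_ns = now - start;
	return ops;
}

/* Trivial stand-in workload: bump a counter. */
static void count_op(void *arg)
{
	(*(volatile uint64_t *)arg)++;
}
```

The caller would then report ops / elapsed_ns (or the inverse, time per op)
instead of timing a fixed number of iterations, so the run time stays bounded
whatever the core mapping is. The batch size is a trade-off: larger batches
reduce clock reads but can overshoot the timeout when each op is slow.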


> 2.17.1
