[dpdk-stable] [PATCH v1 1/3] test/ring: reduce iteration numbers to make test duration shorter
Ananyev, Konstantin
konstantin.ananyev at intel.com
Fri Jan 22 14:15:38 CET 2021
> When testing ring performance with multiple lcores mapped to the same
> physical core, e.g. --lcores '(0-3)@10', "enqueue_dequeue_bulk_helper"
> takes a very long time to finish. This is because the iteration count is
> too high and enqueue/dequeue is extremely inefficient with this kind of
> core mapping. The following test results show the problem:
>
> x86-Intel(R) Xeon(R) Gold 6240:
> $sudo ./app/test/dpdk-test --lcores '(0-1)@25'
> Testing using two hyperthreads(bulk (size: 8):)
> iter_shift:              3      5      7      9     11     13    *15     17     19     21     23
> run time:               7s     7s     7s     8s     9s    16s    47s   170s   660s  >0.5h    >1h
> legacy APIs: SP/SC:     37     11      6  40525  40525  40209  40367  40407  40541 NoData NoData
> legacy APIs: MP/MC:     56     14     11  50657  40526  40526  40526  40625  40585 NoData NoData
>
> aarch64-n1sdp:
> $sudo ./app/test/dpdk-test --lcores '(0-1)@1'
> Testing using two hyperthreads(bulk (size: 8):)
> iter_shift:              3      5      7      9     11     13    *15     17     19     21     23
> run time:               8s     8s     8s     9s     9s    14s    34s   111s   418s  25min    >1h
> legacy APIs: SP/SC:    0.4    0.2    0.1    488    488    488    488    488    489    489 NoData
> legacy APIs: MP/MC:    0.4    0.3    0.2    488    488    488    488    490    489    489 NoData
>
> The run time grows with the number of iterations. With the current setting
> (iter_shift = 23), the test takes more than 1 hour to finish. To fix this,
> "iter_shift" should be reduced while still leaving enough iterations to keep
> the test data stable. To find such a value, we also tested with the "-l" EAL
> argument:
>
> x86-Intel(R) Xeon(R) Gold 6240:
> $sudo ./app/test/dpdk-test -l 25-26
> Testing using two NUMA nodes(bulk (size: 8):)
> iter_shift:              3      5      7      9     11     13    *15     17     19     21     23
> run time:               6s     6s     6s     6s     6s     6s     6s     7s     8s    11s    27s
> legacy APIs: SP/SC:     47     20     13     22     54     83     91     73     81     75     95
> legacy APIs: MP/MC:     44     18     18    240    245    270    250    249    252    250    253
>
> aarch64-n1sdp:
> $sudo ./app/test/dpdk-test -l 1-2
> Testing using two physical cores(bulk (size: 8):)
> iter_shift:              3      5      7      9     11     13    *15     17     19     21     23
> run time:               8s     8s     8s     8s     8s     8s     8s     9s     9s    11s    23s
> legacy APIs: SP/SC:    0.7    0.4    1.2    1.8    2.0    2.0    2.0    2.0    2.0    2.0    2.0
> legacy APIs: MP/MC:    0.3    0.4    1.3    1.9    2.9    2.9    2.9    2.9    2.9    2.9    2.9
>
> According to the above test data, when "iter_shift" is set to "15", the test
> run time is reduced to less than 1 minute and the test results remain stable
> on both x86 and aarch64 servers.
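
For reference, the helper derives its loop count as 1 << iter_shift, so going from 23 to 15 cuts the iterations by a factor of 2^8 = 256. A minimal sketch of that arithmetic follows; the function names are hypothetical and not part of test_ring_perf.c, and a 64-bit type is used here only to keep the shift well defined for large values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: iteration count for a given shift (1 << iter_shift). */
static inline uint64_t
iterations_for_shift(unsigned int iter_shift)
{
	assert(iter_shift < 64);
	return UINT64_C(1) << iter_shift;
}

/* Hypothetical helper: how many times fewer iterations new_shift gives. */
static inline uint64_t
reduction_factor(unsigned int old_shift, unsigned int new_shift)
{
	assert(old_shift >= new_shift);
	return iterations_for_shift(old_shift) / iterations_for_shift(new_shift);
}
```

iterations_for_shift(23) is 8388608 and iterations_for_shift(15) is 32768. This matches the same-core run times above, which grow roughly 4x for every step of 2 in iter_shift once the loop dominates.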
>
> Fixes: 1fa5d0099efc ("test/ring: add custom element size performance tests")
> Cc: honnappa.nagarahalli at arm.com
> Cc: stable at dpdk.org
>
> Signed-off-by: Feifei Wang <feifei.wang2 at arm.com>
> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> ---
> app/test/test_ring_perf.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
> index e63e25a86..fd82e2041 100644
> --- a/app/test/test_ring_perf.c
> +++ b/app/test/test_ring_perf.c
> @@ -178,7 +178,7 @@ enqueue_dequeue_bulk_helper(const unsigned int flag, const int esize,
>  		struct thread_params *p)
>  {
>  	int ret;
> -	const unsigned int iter_shift = 23;
> +	const unsigned int iter_shift = 15;
>  	const unsigned int iterations = 1 << iter_shift;
>  	struct rte_ring *r = p->r;
>  	unsigned int bsize = p->size;
> --
I think it would be better to rework the test(s) to terminate after some
timeout (30s or so) and report the number of ops completed within that timeout.
Anyway, as a short-term fix, I am ok with it.
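
A rough sketch of the timeout-based idea, only to illustrate the shape of such a loop. It is plain POSIX C, not DPDK code: timed_result, now_sec, run_until_timeout and count_op are hypothetical names, and count_op merely stands in for one enqueue/dequeue pair. In the real test each participating lcore would also need to observe the same deadline.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical result type: operations completed and elapsed wall time. */
struct timed_result {
	uint64_t ops;
	double elapsed_sec;
};

/* Monotonic clock in seconds (POSIX clock_gettime). */
static double
now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Call op(arg) repeatedly until timeout_sec has elapsed. The clock is
 * polled once per batch so that the timing overhead stays small.
 */
static struct timed_result
run_until_timeout(void (*op)(void *), void *arg, double timeout_sec,
		unsigned int batch)
{
	struct timed_result res = { 0, 0.0 };
	const double start = now_sec();
	unsigned int i;

	assert(op != NULL && timeout_sec > 0.0 && batch > 0);

	do {
		for (i = 0; i < batch; i++)
			op(arg);
		res.ops += batch;
		res.elapsed_sec = now_sec() - start;
	} while (res.elapsed_sec < timeout_sec);

	return res;
}

/* Stand-in for one enqueue/dequeue pair: just counts its calls. */
static void
count_op(void *arg)
{
	(*(uint64_t *)arg)++;
}
```

With this shape the run time is bounded by the timeout plus at most one batch, whatever the core mapping, and throughput can be reported as res.ops / res.elapsed_sec instead of cycles per fixed iteration count.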
Acked-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
> 2.17.1