[dpdk-dev] [PATCH v3 0/8] fix distributor synchronization issues

Lukasz Wojciechowski l.wojciechow at partner.samsung.com
Sat Sep 26 00:42:22 CEST 2020


Hello David,

Thank you for your review.

W dniu 25.09.2020 o 14:31, David Marchand pisze:
> Hello Lukasz,
>
> On Wed, Sep 23, 2020 at 3:25 PM Lukasz Wojciechowski
> <l.wojciechow at partner.samsung.com> wrote:
>> During review and verification of the patch created by Sarosh Arif:
>> "test_distributor: prevent memory leakages from the pool" I found out
>> that running distributor unit tests multiple times in a row causes fails.
>> So I investigated all the issues I found.
>>
>> There are few synchronization issues that might cause deadlocks
>> or corrupted data. They are fixed with this set of patches for both tests
>> and librte_distributor library.
>>
>> ---
>> v3:
>> * add missing acked and tested by statements from v1
>>
>> v2:
>> * assign NULL to freed mbufs in distributor test
>> * fix handshake check on legacy single distributor
>>       rte_distributor_return_pkt_single()
>> * add patch 7 passing NULL to legacy API calls if no bufs are returned
>> * add patch 8 fixing API documentation
>>
>> Lukasz Wojciechowski (8):
>>    app/test: fix deadlock in distributor test
>>    app/test: synchronize statistics between lcores
>>    app/test: fix freeing mbufs in distributor tests
>>    app/test: collect return mbufs in distributor test
> For these patches, we can use the "test/distributor: " prefix, and we
> then avoid repeating "in distributor test"
Changed
>
>>    distributor: fix missing handshake synchronization
>>    distributor: fix handshake deadlock
>>    distributor: do not use oldpkt when not needed
>>    distributor: align API documentation with code
> Thanks for working on those fixes !
>
> Here is a suggestion:
>
> - we can move this new patch 7 before patch 3 in the series, and
> update the unit test:
>   * by passing NULL to the first call to rte_distributor_get_pkt(),
> there is no need for buf[] array init in handle_work(),
> handle_work_with_free_mbufs() and handle_work_for_shutdown_test(),
>   * at all points of those functions the buf[] array then contains only
> [0, num[ valid entries,
>   * bonus point, this makes the UT check passing NULL oldpkt,
>
> - the former patch 3 is then easier to do since there is no need for
> buf[] array clearing,
>
> This gives the following diff, wdyt?
I reorder patches as you suggested.
I added unit test changes in the same patch that changes distributor 
lib. The changes follow your diff.
This also simplified "fix freeing mbufs" patch.
It's applied in v4.
>
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
> index f31b54edf3..b7ab93ecbe 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -65,13 +65,10 @@ handle_work(void *arg)
>          struct rte_mbuf *buf[8] __rte_cache_aligned;
>          struct worker_params *wp = arg;
>          struct rte_distributor *db = wp->dist;
> -       unsigned int count = 0, num = 0;
> +       unsigned int count = 0, num;
>          unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
> -       int i;
>
> -       for (i = 0; i < 8; i++)
> -               buf[i] = NULL;
> -       num = rte_distributor_get_pkt(db, id, buf, buf, num);
> +       num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
>          while (!quit) {
>                  __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>                                  __ATOMIC_ACQ_REL);
> @@ -277,22 +274,17 @@ handle_work_with_free_mbufs(void *arg)
>          struct rte_mbuf *buf[8] __rte_cache_aligned;
>          struct worker_params *wp = arg;
>          struct rte_distributor *d = wp->dist;
> +       unsigned int num;
>          unsigned int i;
> -       unsigned int num = 0;
>          unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
>
> -       for (i = 0; i < 8; i++)
> -               buf[i] = NULL;
> -       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
> +       num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>          while (!quit) {
>                  __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>                                  __ATOMIC_ACQ_REL);
> -               for (i = 0; i < num; i++) {
> +               for (i = 0; i < num; i++)
>                          rte_pktmbuf_free(buf[i]);
> -                       buf[i] = NULL;
> -               }
> -               num = rte_distributor_get_pkt(d,
> -                               id, buf, buf, 0);
> +               num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>          }
>          __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>                          __ATOMIC_ACQ_REL);
> @@ -347,15 +339,13 @@ handle_work_for_shutdown_test(void *arg)
>          struct rte_mbuf *buf[8] __rte_cache_aligned;
>          struct worker_params *wp = arg;
>          struct rte_distributor *d = wp->dist;
> -       unsigned int num = 0;
> +       unsigned int num;
>          unsigned int i;
>          unsigned int zero_id = 0;
>          const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>                          __ATOMIC_RELAXED);
> -       for (i = 0; i < 8; i++)
> -               buf[i] = NULL;
>
> -       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
> +       num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>
>          zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>          if (id == zero_id && num > 0) {
> @@ -369,12 +359,9 @@ handle_work_for_shutdown_test(void *arg)
>          while (!quit && !(id == zero_id && zero_quit)) {
>                  __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>                                  __ATOMIC_ACQ_REL);
> -               for (i = 0; i < num; i++) {
> +               for (i = 0; i < num; i++)
>                          rte_pktmbuf_free(buf[i]);
> -                       buf[i] = NULL;
> -               }
> -               num = rte_distributor_get_pkt(d,
> -                               id, buf, buf, 0);
> +               num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>
>                  zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>                  if (id == zero_id && num > 0) {
> @@ -392,21 +379,16 @@ handle_work_for_shutdown_test(void *arg)
>                  while (zero_quit)
>                          usleep(100);
>
> -               for (i = 0; i < num; i++) {
> +               for (i = 0; i < num; i++)
>                          rte_pktmbuf_free(buf[i]);
> -                       buf[i] = NULL;
> -               }
> -               num = rte_distributor_get_pkt(d,
> -                               id, buf, buf, 0);
> +               num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>
>                  while (!quit) {
>                          __atomic_fetch_add(&worker_stats[id].handled_packets,
>                                          num, __ATOMIC_ACQ_REL);
> -                       for (i = 0; i < num; i++) {
> +                       for (i = 0; i < num; i++)
>                                  rte_pktmbuf_free(buf[i]);
> -                               buf[i] = NULL;
> -                       }
> -                       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
> +                       num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>                  }
>          }
>          rte_distributor_return_pkt(d, id, buf, num);
>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow at partner.samsung.com



More information about the dev mailing list