[v2] ipsec: optimize with c11 atomic for sa outbound sqn update

Message ID 1587662187-28193-1-git-send-email-phil.yang@arm.com (mailing list archive)
State Superseded, archived
Delegated to: akhil goyal
Headers
Series [v2] ipsec: optimize with c11 atomic for sa outbound sqn update |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-nxp-Performance success Performance Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/travis-robot success Travis build: passed
ci/iol-testing success Testing PASS
ci/Intel-compilation success Compilation OK

Commit Message

Phil Yang April 23, 2020, 5:16 p.m. UTC
  For SA outbound packets, rte_atomic64_add_return is used to generate
SQN atomically. This introduced an unnecessary full barrier by calling
the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
patch optimized it with c11 atomic and eliminated the expensive barrier
for aarch64.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
v2:
split from the "generic rte atomic APIs deprecate proposal" patchset.


 lib/librte_ipsec/ipsec_sqn.h | 3 ++-
 lib/librte_ipsec/meson.build | 5 +++++
 lib/librte_ipsec/sa.h        | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)
  

Comments

Jerin Jacob April 23, 2020, 5:45 p.m. UTC | #1
On Thu, Apr 23, 2020 at 10:47 PM Phil Yang <phil.yang@arm.com> wrote:
>
> For SA outbound packets, rte_atomic64_add_return is used to generate
> SQN atomically. This introduced an unnecessary full barrier by calling
> the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> patch optimized it with c11 atomic and eliminated the expensive barrier
> for aarch64.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>

> diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> index fc69970..9335f28 100644
> --- a/lib/librte_ipsec/meson.build
> +++ b/lib/librte_ipsec/meson.build
> @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 'ipsec_sad.c')
>  headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 'rte_ipsec_sad.h')
>
>  deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> +
> +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> +    ext_deps += cc.find_library('atomic')
> +endif


The following patch has been merged in master now. You don't need this anymore.

commit da4eae278b56e698c64d0c39939a7a55c5b6abdd
Author: Pavan Nikhilesh <pbhagavatula@marvell.com>
Date:   Sun Apr 19 15:31:01 2020 +0530

    build: add global libatomic dependency for 32-bit clang

    Add libatomic as a global dependency when compiling for 32-bit using
    clang. As we need libatomic for 64-bit atomic ops.

    Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
    Acked-by: Bruce Richardson <bruce.richardson@intel.com>
  
Ananyev, Konstantin April 23, 2020, 6:10 p.m. UTC | #2
> 
> For SA outbound packets, rte_atomic64_add_return is used to generate
> SQN atomically. This introduced an unnecessary full barrier by calling
> the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> patch optimized it with c11 atomic and eliminated the expensive barrier
> for aarch64.
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> ---
> v2:
> split from the "generic rte atomic APIs deprecate proposal" patchset.
> 
> 
>  lib/librte_ipsec/ipsec_sqn.h | 3 ++-
>  lib/librte_ipsec/meson.build | 5 +++++
>  lib/librte_ipsec/sa.h        | 2 +-
>  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
> index 0c2f76a..e884af7 100644
> --- a/lib/librte_ipsec/ipsec_sqn.h
> +++ b/lib/librte_ipsec/ipsec_sqn.h
> @@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t *num)
> 
>  	n = *num;
>  	if (SQN_ATOMIC(sa))
> -		sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
> +		sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
> +			__ATOMIC_RELAXED);
>  	else {
>  		sqn = sa->sqn.outb.raw + n;
>  		sa->sqn.outb.raw = sqn;
> diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> index fc69970..9335f28 100644
> --- a/lib/librte_ipsec/meson.build
> +++ b/lib/librte_ipsec/meson.build
> @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 'ipsec_sad.c')
>  headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 'rte_ipsec_sad.h')
> 
>  deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> +
> +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> +    ext_deps += cc.find_library('atomic')
> +endif
> diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
> index d22451b..cab9a2e 100644
> --- a/lib/librte_ipsec/sa.h
> +++ b/lib/librte_ipsec/sa.h
> @@ -120,7 +120,7 @@ struct rte_ipsec_sa {
>  	 */
>  	union {
>  		union {
> -			rte_atomic64_t atom;
> +			uint64_t atom;
>  			uint64_t raw;
>  		} outb;
>  		struct {

Seems  you missed my comments for previous version, so I put here:

If we don't need rte_atomic64 here anymore,
then I think we can collapse the union to just:
uint64_t outb;

Konstantin
  
Phil Yang April 24, 2020, 4:35 a.m. UTC | #3
> -----Original Message-----
> From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Sent: Friday, April 24, 2020 2:11 AM
> To: Phil Yang <Phil.Yang@arm.com>; dev@dpdk.org
> Cc: thomas@monjalon.net; Iremonger, Bernard
> <bernard.iremonger@intel.com>; Medvedkin, Vladimir
> <vladimir.medvedkin@intel.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu <Gavin.Hu@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> Subject: RE: [PATCH v2] ipsec: optimize with c11 atomic for sa outbound sqn
> update
> 
> >
> > For SA outbound packets, rte_atomic64_add_return is used to generate
> > SQN atomically. This introduced an unnecessary full barrier by calling
> > the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> > patch optimized it with c11 atomic and eliminated the expensive barrier
> > for aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> > ---
> > v2:
> > split from the "generic rte atomic APIs deprecate proposal" patchset.
> >
> >
> >  lib/librte_ipsec/ipsec_sqn.h | 3 ++-
> >  lib/librte_ipsec/meson.build | 5 +++++
> >  lib/librte_ipsec/sa.h        | 2 +-
> >  3 files changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
> > index 0c2f76a..e884af7 100644
> > --- a/lib/librte_ipsec/ipsec_sqn.h
> > +++ b/lib/librte_ipsec/ipsec_sqn.h
> > @@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa,
> uint32_t *num)
> >
> >  	n = *num;
> >  	if (SQN_ATOMIC(sa))
> > -		sqn = (uint64_t)rte_atomic64_add_return(&sa-
> >sqn.outb.atom, n);
> > +		sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
> > +			__ATOMIC_RELAXED);
> >  	else {
> >  		sqn = sa->sqn.outb.raw + n;
> >  		sa->sqn.outb.raw = sqn;
> > diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> > index fc69970..9335f28 100644
> > --- a/lib/librte_ipsec/meson.build
> > +++ b/lib/librte_ipsec/meson.build
> > @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c',
> 'ipsec_sad.c')
> >  headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h',
> 'rte_ipsec_sad.h')
> >
> >  deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> > +
> > +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> > +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> > +    ext_deps += cc.find_library('atomic')
> > +endif
> > diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
> > index d22451b..cab9a2e 100644
> > --- a/lib/librte_ipsec/sa.h
> > +++ b/lib/librte_ipsec/sa.h
> > @@ -120,7 +120,7 @@ struct rte_ipsec_sa {
> >  	 */
> >  	union {
> >  		union {
> > -			rte_atomic64_t atom;
> > +			uint64_t atom;
> >  			uint64_t raw;
> >  		} outb;
> >  		struct {
> 
> Seems  you missed my comments for previous version, so I put here:
> 
> If we don't need rte_atomic64 here anymore,
> then I think we can collapse the union to just:
> uint64_t outb;
My bad, I missed this comment.
Updated in v3.  Please review it.

Thanks,
Phil

> 
> Konstantin
  
Phil Yang April 24, 2020, 4:49 a.m. UTC | #4
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Friday, April 24, 2020 1:45 AM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dpdk-dev
> <dev@dpdk.org>; thomas@monjalon.net; Bernard Iremonger
> <bernard.iremonger@intel.com>; Vladimir Medvedkin
> <vladimir.medvedkin@intel.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu <Gavin.Hu@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v2] ipsec: optimize with c11 atomic for sa
> outbound sqn update
> 
> On Thu, Apr 23, 2020 at 10:47 PM Phil Yang <phil.yang@arm.com> wrote:
> >
> > For SA outbound packets, rte_atomic64_add_return is used to generate
> > SQN atomically. This introduced an unnecessary full barrier by calling
> > the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> > patch optimized it with c11 atomic and eliminated the expensive barrier
> > for aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> 
> > diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> > index fc69970..9335f28 100644
> > --- a/lib/librte_ipsec/meson.build
> > +++ b/lib/librte_ipsec/meson.build
> > @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c',
> 'ipsec_sad.c')
> >  headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h',
> 'rte_ipsec_sad.h')
> >
> >  deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> > +
> > +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> > +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> > +    ext_deps += cc.find_library('atomic')
> > +endif
> 
> 
> The following patch has been merged in master now. You don't need this
> anymore.
> 
> commit da4eae278b56e698c64d0c39939a7a55c5b6abdd
> Author: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Date:   Sun Apr 19 15:31:01 2020 +0530
> 
>     build: add global libatomic dependency for 32-bit clang
> 
>     Add libatomic as a global dependency when compiling for 32-bit using
>     clang. As we need libatomic for 64-bit atomic ops.
> 
>     Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>     Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Great, we don't need to add it module by module anymore. 
Updated in v3. Thank you very much.

Thanks,
Phil
  

Patch

diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
index 0c2f76a..e884af7 100644
--- a/lib/librte_ipsec/ipsec_sqn.h
+++ b/lib/librte_ipsec/ipsec_sqn.h
@@ -128,7 +128,8 @@  esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t *num)
 
 	n = *num;
 	if (SQN_ATOMIC(sa))
-		sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
+		sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
+			__ATOMIC_RELAXED);
 	else {
 		sqn = sa->sqn.outb.raw + n;
 		sa->sqn.outb.raw = sqn;
diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
index fc69970..9335f28 100644
--- a/lib/librte_ipsec/meson.build
+++ b/lib/librte_ipsec/meson.build
@@ -6,3 +6,8 @@  sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 'ipsec_sad.c')
 headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 'rte_ipsec_sad.h')
 
 deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
+
+# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
+if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
+    ext_deps += cc.find_library('atomic')
+endif
diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
index d22451b..cab9a2e 100644
--- a/lib/librte_ipsec/sa.h
+++ b/lib/librte_ipsec/sa.h
@@ -120,7 +120,7 @@  struct rte_ipsec_sa {
 	 */
 	union {
 		union {
-			rte_atomic64_t atom;
+			uint64_t atom;
 			uint64_t raw;
 		} outb;
 		struct {