[v2] ipsec: optimize with c11 atomic for sa outbound sqn update
Commit Message
For SA outbound packets, rte_atomic64_add_return is used to generate
the SQN atomically. On aarch64 the rte_atomic_XX API is implemented
with the '__sync' builtins, so this call introduces an unnecessary full
barrier. This patch replaces it with a C11-style atomic using relaxed
ordering and eliminates the expensive barrier on aarch64.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
v2:
split from the "generic rte atomic APIs deprecate proposal" patchset.
lib/librte_ipsec/ipsec_sqn.h | 3 ++-
lib/librte_ipsec/meson.build | 5 +++++
lib/librte_ipsec/sa.h | 2 +-
3 files changed, 8 insertions(+), 2 deletions(-)
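
As an illustration of the idea (not part of the patch), the following is a minimal sketch of reserving sequence numbers with a relaxed atomic add. The names sqn_counter, sqn_reserve and sqn_first are hypothetical and do not exist in librte_ipsec. The assumption is that the counter only needs the add itself to be atomic and does not need to order any other memory accesses, which is why relaxed ordering should be enough here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the outbound SQN counter of one SA. */
struct sqn_counter {
	uint64_t val;
};

/*
 * Atomically reserve n consecutive sequence numbers.
 * Returns the last number of the reserved range.
 * __ATOMIC_RELAXED keeps the add atomic but requests no ordering
 * against other loads and stores, so no full barrier is required.
 */
static inline uint64_t
sqn_reserve(struct sqn_counter *c, uint32_t n)
{
	return __atomic_add_fetch(&c->val, n, __ATOMIC_RELAXED);
}

/* First sequence number of a range returned by sqn_reserve(). */
static inline uint64_t
sqn_first(uint64_t last, uint32_t n)
{
	return last - n + 1;
}
```

A caller sending a burst of 4 packets on a fresh counter would get 4 back from sqn_reserve() and would own sequence numbers 1 through 4. Concurrent callers receive disjoint ranges because the add is a single atomic read-modify-write.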
Comments
On Thu, Apr 23, 2020 at 10:47 PM Phil Yang <phil.yang@arm.com> wrote:
>
> For SA outbound packets, rte_atomic64_add_return is used to generate
> SQN atomically. This introduced an unnecessary full barrier by calling
> the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> patch optimized it with c11 atomic and eliminated the expensive barrier
> for aarch64.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> index fc69970..9335f28 100644
> --- a/lib/librte_ipsec/meson.build
> +++ b/lib/librte_ipsec/meson.build
> @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 'ipsec_sad.c')
> headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 'rte_ipsec_sad.h')
>
> deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> +
> +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> + ext_deps += cc.find_library('atomic')
> +endif
The following patch has been merged in master now. You don't need this anymore.
commit da4eae278b56e698c64d0c39939a7a55c5b6abdd
Author: Pavan Nikhilesh <pbhagavatula@marvell.com>
Date: Sun Apr 19 15:31:01 2020 +0530
build: add global libatomic dependency for 32-bit clang
Add libatomic as a global dependency when compiling for 32-bit using
clang. As we need libatomic for 64-bit atomic ops.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
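
For context on why 64-bit atomic ops can pull in libatomic at all, here is a small C sketch (not taken from either patch; the demo_ names are hypothetical). __atomic_always_lock_free is a GCC/clang builtin that reports at compile time whether an object of the given size can always be handled with inline lock-free instructions. When it cannot, the compiler may emit a call into libatomic instead, which is presumably why 32-bit clang builds need the extra link dependency.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 64-bit counter updated with the __atomic builtins. */
static uint64_t demo_counter;

/*
 * True when the compiler guarantees inline, lock-free 8-byte atomics
 * for the current target. The result is target dependent.
 */
static inline bool
demo_atomic64_always_lock_free(void)
{
	return __atomic_always_lock_free(sizeof(uint64_t), 0);
}

/*
 * 64-bit atomic add. On targets where the check above is false,
 * this may be lowered to a library call instead of inline code.
 */
static inline uint64_t
demo_counter_add(uint32_t n)
{
	return __atomic_add_fetch(&demo_counter, n, __ATOMIC_RELAXED);
}
```

On a typical 64-bit build the check is expected to return true and no extra library is involved.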
>
> For SA outbound packets, rte_atomic64_add_return is used to generate
> SQN atomically. This introduced an unnecessary full barrier by calling
> the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> patch optimized it with c11 atomic and eliminated the expensive barrier
> for aarch64.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> ---
> v2:
> split from the "generic rte atomic APIs deprecate proposal" patchset.
>
>
> lib/librte_ipsec/ipsec_sqn.h | 3 ++-
> lib/librte_ipsec/meson.build | 5 +++++
> lib/librte_ipsec/sa.h | 2 +-
> 3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
> index 0c2f76a..e884af7 100644
> --- a/lib/librte_ipsec/ipsec_sqn.h
> +++ b/lib/librte_ipsec/ipsec_sqn.h
> @@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t *num)
>
> n = *num;
> if (SQN_ATOMIC(sa))
> - sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
> + sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
> + __ATOMIC_RELAXED);
> else {
> sqn = sa->sqn.outb.raw + n;
> sa->sqn.outb.raw = sqn;
> diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> index fc69970..9335f28 100644
> --- a/lib/librte_ipsec/meson.build
> +++ b/lib/librte_ipsec/meson.build
> @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 'ipsec_sad.c')
> headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 'rte_ipsec_sad.h')
>
> deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> +
> +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> + ext_deps += cc.find_library('atomic')
> +endif
> diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
> index d22451b..cab9a2e 100644
> --- a/lib/librte_ipsec/sa.h
> +++ b/lib/librte_ipsec/sa.h
> @@ -120,7 +120,7 @@ struct rte_ipsec_sa {
> */
> union {
> union {
> - rte_atomic64_t atom;
> + uint64_t atom;
> uint64_t raw;
> } outb;
> struct {
It seems you missed my comment on the previous version, so I am repeating it here:
If we don't need rte_atomic64 here anymore,
then I think we can collapse the union to just:
uint64_t outb;
Konstantin
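
The suggestion above could look roughly like the sketch below. It is only an illustration under stated assumptions: demo_sa, its sqn_atomic flag (standing in for the SQN_ATOMIC(sa) check) and demo_outb_update_sqn are hypothetical, and the real struct rte_ipsec_sa has more fields and nests the counter under sa->sqn. The point is that once the counter is a plain uint64_t, both the atomic and the non-atomic path can operate on the same field, so the inner union is no longer needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down SA: only the fields needed here. */
struct demo_sa {
	bool sqn_atomic;	/* stand-in for the SQN_ATOMIC(sa) check */
	uint64_t outb;		/* single field replacing the atom/raw union */
};

/* Advance the outbound SQN by n and return the new value. */
static inline uint64_t
demo_outb_update_sqn(struct demo_sa *sa, uint32_t n)
{
	uint64_t sqn;

	if (sa->sqn_atomic)
		/* shared SA: atomic add, relaxed ordering */
		sqn = __atomic_add_fetch(&sa->outb, n, __ATOMIC_RELAXED);
	else {
		/* SA owned by one thread: plain read-modify-write */
		sqn = sa->outb + n;
		sa->outb = sqn;
	}
	return sqn;
}
```

Both branches read and write the same uint64_t, so the two union members would only be two names for one object.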
> -----Original Message-----
> From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Sent: Friday, April 24, 2020 2:11 AM
> To: Phil Yang <Phil.Yang@arm.com>; dev@dpdk.org
> Cc: thomas@monjalon.net; Iremonger, Bernard
> <bernard.iremonger@intel.com>; Medvedkin, Vladimir
> <vladimir.medvedkin@intel.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu <Gavin.Hu@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> Subject: RE: [PATCH v2] ipsec: optimize with c11 atomic for sa outbound sqn
> update
>
> >
> > For SA outbound packets, rte_atomic64_add_return is used to generate
> > SQN atomically. This introduced an unnecessary full barrier by calling
> > the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> > patch optimized it with c11 atomic and eliminated the expensive barrier
> > for aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> > ---
> > v2:
> > split from the "generic rte atomic APIs deprecate proposal" patchset.
> >
> >
> > lib/librte_ipsec/ipsec_sqn.h | 3 ++-
> > lib/librte_ipsec/meson.build | 5 +++++
> > lib/librte_ipsec/sa.h | 2 +-
> > 3 files changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
> > index 0c2f76a..e884af7 100644
> > --- a/lib/librte_ipsec/ipsec_sqn.h
> > +++ b/lib/librte_ipsec/ipsec_sqn.h
> > @@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t *num)
> >
> > n = *num;
> > if (SQN_ATOMIC(sa))
> > - sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
> > + sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
> > + __ATOMIC_RELAXED);
> > else {
> > sqn = sa->sqn.outb.raw + n;
> > sa->sqn.outb.raw = sqn;
> > diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> > index fc69970..9335f28 100644
> > --- a/lib/librte_ipsec/meson.build
> > +++ b/lib/librte_ipsec/meson.build
> > @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 'ipsec_sad.c')
> > headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 'rte_ipsec_sad.h')
> >
> > deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> > +
> > +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> > +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> > + ext_deps += cc.find_library('atomic')
> > +endif
> > diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
> > index d22451b..cab9a2e 100644
> > --- a/lib/librte_ipsec/sa.h
> > +++ b/lib/librte_ipsec/sa.h
> > @@ -120,7 +120,7 @@ struct rte_ipsec_sa {
> > */
> > union {
> > union {
> > - rte_atomic64_t atom;
> > + uint64_t atom;
> > uint64_t raw;
> > } outb;
> > struct {
>
> It seems you missed my comment on the previous version, so I am repeating it here:
>
> If we don't need rte_atomic64 here anymore,
> then I think we can collapse the union to just:
> uint64_t outb;
My bad, I missed this comment.
Updated in v3. Please review it.
Thanks,
Phil
>
> Konstantin
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Friday, April 24, 2020 1:45 AM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dpdk-dev
> <dev@dpdk.org>; thomas@monjalon.net; Bernard Iremonger
> <bernard.iremonger@intel.com>; Vladimir Medvedkin
> <vladimir.medvedkin@intel.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu <Gavin.Hu@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v2] ipsec: optimize with c11 atomic for sa
> outbound sqn update
>
> On Thu, Apr 23, 2020 at 10:47 PM Phil Yang <phil.yang@arm.com> wrote:
> >
> > For SA outbound packets, rte_atomic64_add_return is used to generate
> > SQN atomically. This introduced an unnecessary full barrier by calling
> > the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> > patch optimized it with c11 atomic and eliminated the expensive barrier
> > for aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Gavin Hu <gavin.hu@arm.com>
>
> > diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> > index fc69970..9335f28 100644
> > --- a/lib/librte_ipsec/meson.build
> > +++ b/lib/librte_ipsec/meson.build
> > @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 'ipsec_sad.c')
> > headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 'rte_ipsec_sad.h')
> >
> > deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> > +
> > +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> > +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> > + ext_deps += cc.find_library('atomic')
> > +endif
>
>
> The following patch has been merged in master now. You don't need this
> anymore.
>
> commit da4eae278b56e698c64d0c39939a7a55c5b6abdd
> Author: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Date: Sun Apr 19 15:31:01 2020 +0530
>
> build: add global libatomic dependency for 32-bit clang
>
> Add libatomic as a global dependency when compiling for 32-bit using
> clang. As we need libatomic for 64-bit atomic ops.
>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Great, we don't need to add it module by module anymore.
Updated in v3. Thank you very much.
Thanks,
Phil
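diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
index 0c2f76a..e884af7 100644
--- a/lib/librte_ipsec/ipsec_sqn.h
+++ b/lib/librte_ipsec/ipsec_sqn.h
@@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t *num)

 	n = *num;
 	if (SQN_ATOMIC(sa))
-		sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
+		sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
+			__ATOMIC_RELAXED);
 	else {
 		sqn = sa->sqn.outb.raw + n;
 		sa->sqn.outb.raw = sqn;
diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
index fc69970..9335f28 100644
--- a/lib/librte_ipsec/meson.build
+++ b/lib/librte_ipsec/meson.build
@@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 'ipsec_sad.c')
 headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 'rte_ipsec_sad.h')

 deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
+
+# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
+if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
+	ext_deps += cc.find_library('atomic')
+endif
diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
index d22451b..cab9a2e 100644
--- a/lib/librte_ipsec/sa.h
+++ b/lib/librte_ipsec/sa.h
@@ -120,7 +120,7 @@ struct rte_ipsec_sa {
 	 */
 	union {
 		union {
-			rte_atomic64_t atom;
+			uint64_t atom;
 			uint64_t raw;
 		} outb;
 		struct {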