[RFC] eal/x86: disable array bounds checks in rte_memcpy_generic with gcc-12
Commit Message
GCC 12 adds more array bounds checking (good), but it is not smart
enough to realize that for small fixed sizes, the bigger move options
are not used.
For example, using rte_memcpy() on an RSS key of 40 bytes may trigger
complaints about rte_mov128 reading past the end of the input.
In order to keep some of the checks, add a special case so that calls
to rte_memcpy() with fixed-size arguments use the compiler builtin
instead. We don't want to give up all the checking for code that
uses rte_memcpy() everywhere.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
lib/eal/x86/include/rte_memcpy.h | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
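The dispatch described in the commit message can be sketched as below. This is only an illustration under stated assumptions, not the DPDK implementation: sketch_memcpy, sketch_copy_generic, sketch_copy_rss_key and SKETCH_RSS_KEY_LEN are hypothetical names, and the byte loop merely stands in for the hand-written vector copy paths. The wrapper is always inlined so that __builtin_constant_p() can see a literal size at the call site.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hand-written, size-dispatching copy path. */
static inline void *
sketch_copy_generic(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/*
 * Always inlined so the compiler can evaluate __builtin_constant_p(n)
 * against the size expression used at each call site.
 */
static inline __attribute__((always_inline)) void *
sketch_memcpy(void *dst, const void *src, size_t n)
{
	/* Size known at compile time: let the compiler expand (and check) it. */
	if (__builtin_constant_p(n))
		return __builtin_memcpy(dst, src, n);

	/* Size only known at run time: use the custom path. */
	return sketch_copy_generic(dst, src, n);
}

/* Hypothetical 40-byte key length, mirroring the RSS key example. */
#define SKETCH_RSS_KEY_LEN 40

static inline void
sketch_copy_rss_key(uint8_t dst[SKETCH_RSS_KEY_LEN],
		    const uint8_t src[SKETCH_RSS_KEY_LEN])
{
	/* Literal size, so this call is expected to take the builtin branch. */
	sketch_memcpy(dst, src, SKETCH_RSS_KEY_LEN);
}
```

Whether the builtin branch is actually taken depends on the optimization level and on the compiler; both branches copy the same bytes, so only the generated code and the diagnostics differ.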
Comments
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 9 June 2022 00.49
Very good.
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
While you are at it, would you consider concealing the definition of ALIGNMENT_MASK too? It seems to be leaking out from this header file.
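One possible way to conceal such a helper macro is to #undef it at the end of the header, after the last inline function that uses it. The sketch below is an assumption about how this could look, not the change that was applied: SKETCH_ALIGNMENT_MASK, its value 0x1F and sketch_both_aligned are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Private helper macro; 0x1F assumes 32-byte vector alignment. */
#define SKETCH_ALIGNMENT_MASK 0x1F

/* Returns 1 when both pointers are aligned to the vector size, else 0. */
static inline int
sketch_both_aligned(const void *dst, const void *src)
{
	return !(((uintptr_t)dst | (uintptr_t)src) & SKETCH_ALIGNMENT_MASK);
}

/*
 * The macro was already expanded inside the function body above, so it
 * can be removed here without affecting the function. Code that includes
 * the header no longer sees the name.
 */
#undef SKETCH_ALIGNMENT_MASK
```

This only works if every use of the macro sits inside function bodies defined before the #undef; a function-like macro that expands it later would break.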
08/06/2022 23:49, Stephen Hemminger wrote:
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
On 6/8/2022 11:49 PM, Stephen Hemminger wrote:
> Gcc 12 adds more array bounds checking (good); but it is not smart
> enough to realize that for small fixed sizes, the bigger move options
> are not used.
>
> An example is using rte_memcpy() on a RSS key of 40 bytes may trigger
> rte_memcpy complaints from rte_mov128 reading past end of input.
>
> In order to keep some of the checks add special case for calls
> to rte_memcpy() with fixed size arguments to use the compiler
> builtin instead. Don't want to give all the checking for
> code that uses rte_memcpy() everywhere.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> lib/eal/x86/include/rte_memcpy.h | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h
> index 18aa4e43a743..b90cdd8d7326 100644
> --- a/lib/eal/x86/include/rte_memcpy.h
> +++ b/lib/eal/x86/include/rte_memcpy.h
> @@ -27,6 +27,10 @@ extern "C" {
> #pragma GCC diagnostic ignored "-Wstringop-overflow"
> #endif
>
> +#if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION >= 120000)
> +#pragma GCC diagnostic ignored "-Warray-bounds"
> +#endif
> +
> /**
> * Copy bytes from one location to another. The locations must not overlap.
> *
> @@ -842,19 +846,21 @@ rte_memcpy_aligned(void *dst, const void *src, size_t n)
> return ret;
> }
>
> +#if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION >= 100000)
> +#pragma GCC diagnostic pop
> +#endif
> +
> static __rte_always_inline void *
> rte_memcpy(void *dst, const void *src, size_t n)
> {
> - if (!(((uintptr_t)dst | (uintptr_t)src) & ALIGNMENT_MASK))
> + if (__builtin_constant_p(n))
> + return __builtin_memcpy(dst, src, n);
> + else if (!(((uintptr_t)dst | (uintptr_t)src) & ALIGNMENT_MASK))
This patch does two things:
1. Disables "-Warray-bounds" with the above pragma to silence compiler warnings.
2. Uses the compiler builtin for some cases.
The second can impact performance and is not really needed to fix the
build error. What do you think about splitting the patch in two, since
the first is a simple change but the second may require more testing
before it is accepted?
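The first point relies on scoped diagnostic pragmas. A minimal sketch of that pattern follows, assuming a GCC-compatible compiler; sketch_copy_bytes is a hypothetical function, and the plain __GNUC__ guard is a simplification of the RTE_TOOLCHAIN_GCC and GCC_VERSION checks used in the patch.

```c
#include <stddef.h>

#if defined(__GNUC__)
/* Save the current diagnostic state, then silence one warning. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
#endif

/* Code between push and pop is compiled with -Warray-bounds ignored. */
static inline void
sketch_copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i];
}

#if defined(__GNUC__)
/* Restore the saved state so later code keeps its warnings. */
#pragma GCC diagnostic pop
#endif
```

The position of the pop decides which functions lose the check, which is why the patch moves it relative to rte_memcpy().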
> From: Ferruh Yigit [mailto:ferruh.yigit@xilinx.com]
> Sent: Friday, 10 June 2022 12.13
>
> This patch does two things:
>
> 1. Disables "-Warray-bounds" with the above pragma to silence compiler
> warnings.
>
> 2. Uses the compiler builtin for some cases.
>
> The second can impact performance and is not really needed to fix the
> build error. What do you think about splitting the patch in two, since
> the first is a simple change but the second may require more testing
> before it is accepted?
Any such testing will be highly compiler dependent.
Do you have any specific compilers in mind, where you see a risk for lower performance?
On 6/10/2022 11:39 AM, Morten Brørup wrote:
>
> Any such testing will be highly compiler dependent.
>
> Do you have any specific compilers in mind, where you see a risk for lower performance?
>
Hi Morten,
My point is about the possible performance impact in general, not about
a risk with any specific compiler version.
The part with a possible performance impact can be separated into its
own patch and discussed there, independently of the gcc 12 build error.
@@ -27,6 +27,10 @@ extern "C" {
 #pragma GCC diagnostic ignored "-Wstringop-overflow"
 #endif
 
+#if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION >= 120000)
+#pragma GCC diagnostic ignored "-Warray-bounds"
+#endif
+
 /**
  * Copy bytes from one location to another. The locations must not overlap.
  *
@@ -842,19 +846,21 @@ rte_memcpy_aligned(void *dst, const void *src, size_t n)
 	return ret;
 }
 
+#if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION >= 100000)
+#pragma GCC diagnostic pop
+#endif
+
 static __rte_always_inline void *
 rte_memcpy(void *dst, const void *src, size_t n)
 {
-	if (!(((uintptr_t)dst | (uintptr_t)src) & ALIGNMENT_MASK))
+	if (__builtin_constant_p(n))
+		return __builtin_memcpy(dst, src, n);
+	else if (!(((uintptr_t)dst | (uintptr_t)src) & ALIGNMENT_MASK))
 		return rte_memcpy_aligned(dst, src, n);
 	else
 		return rte_memcpy_generic(dst, src, n);
 }
 
-#if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION >= 100000)
-#pragma GCC diagnostic pop
-#endif
-
 #ifdef __cplusplus
 }
 #endif