[PATCH v2 2/2] eal: remove NUMFLAGS enumeration
Ferruh Yigit
ferruh.yigit at amd.com
Wed Sep 27 16:09:29 CEST 2023
On 9/27/2023 2:48 PM, Stanisław Kardach wrote:
> On Wed, Sep 27, 2023 at 1:55 PM Ferruh Yigit <ferruh.yigit at amd.com> wrote:
>>
>> On 9/21/2023 3:49 PM, Stanisław Kardach wrote:
>>> On Thu, Sep 21, 2023, 15:18 Tummala, Sivaprasad
>>> <Sivaprasad.Tummala at amd.com <mailto:Sivaprasad.Tummala at amd.com>> wrote:
>>>
>>> > -----Original Message-----
>>> > From: David Marchand <david.marchand at redhat.com
>>> <mailto:david.marchand at redhat.com>>
>>> > Sent: Wednesday, September 20, 2023 1:05 PM
>>> > To: Stanisław Kardach <kda at semihalf.com
>>> <mailto:kda at semihalf.com>>; Tummala, Sivaprasad
>>> > <Sivaprasad.Tummala at amd.com <mailto:Sivaprasad.Tummala at amd.com>>
>>> > Cc: Ruifeng Wang <ruifeng.wang at arm.com
>>> <mailto:ruifeng.wang at arm.com>>; Min Zhou <zhoumin at loongson.cn
>>> <mailto:zhoumin at loongson.cn>>;
>>> > David Christensen <drc at linux.vnet.ibm.com
>>> <mailto:drc at linux.vnet.ibm.com>>; Bruce Richardson
>>> > <bruce.richardson at intel.com <mailto:bruce.richardson at intel.com>>;
>>> Konstantin Ananyev
>>> > <konstantin.v.ananyev at yandex.ru
>>> <mailto:konstantin.v.ananyev at yandex.ru>>; dev <dev at dpdk.org
>>> <mailto:dev at dpdk.org>>; Yigit, Ferruh
>>> > <Ferruh.Yigit at amd.com <mailto:Ferruh.Yigit at amd.com>>; Thomas
>>> Monjalon <thomas at monjalon.net <mailto:thomas at monjalon.net>>
>>> > Subject: Re: [PATCH v2 2/2] eal: remove NUMFLAGS enumeration
>>> >
>>> >
>>> > On Wed, Sep 20, 2023 at 8:01 AM Stanisław Kardach
>>> <kda at semihalf.com <mailto:kda at semihalf.com>> wrote:
>>> > >
>>> > > On Tue, Sep 19, 2023 at 4:47 PM David Marchand
>>> > <david.marchand at redhat.com <mailto:david.marchand at redhat.com>> wrote:
>>> > > <snip>
>>> > > > > Also I see you're still removing the RTE_CPUFLAG_NUMFLAGS
>>> (what I call a
>>> > last element canary). Why? If you're concerned with ABI, then
>>> we're talking about
>>> > an application linking dynamically with DPDK or talking via some
>>> RPC channel with
>>> > another DPDK application. So clashing with this definition does
>>> not come into
>>> > question. One should rather use rte_cpu_get_flag_enabled().
>>> > > > > Also if you want to introduce new features, one would add
>>> them to the
>>> > rte_cpuflags headers, unless you'd like to not add those and keep an
>>> > undocumented list "above" the last defined element.
>>> > > > > Could you explain a bit more Your use-case?
>>> > > >
>>> > > > Hey Stanislaw,
>>> > > >
>>> > > > Talking generically, one problem with such pattern (having a LAST,
>>> > > > or MAX enum) is when an array sized with such a symbol is exposed.
>>> > > > As I mentioned in the past, this can have unwanted effects:
>>> > > >
>>> https://patchwork.dpdk.org/project/dpdk/patch/20230919140430.3251493
>>> <https://patchwork.dpdk.org/project/dpdk/patch/20230919140430.3251493>
>>> > > > -1-david.marchand at redhat.com/
>>> <http://1-david.marchand@redhat.com/>
>>> >
>>> > Argh... who broke copy/paste in my browser ?!
>>> > Wrt to MAX and arrays, I wanted to point at:
>>> >
>>> http://inbox.dpdk.org/dev/CAJFAV8xs5CVdE2xwRtaxk5vE_PiQMV5LY5tKStk3R1gOuR <http://inbox.dpdk.org/dev/CAJFAV8xs5CVdE2xwRtaxk5vE_PiQMV5LY5tKStk3R1gOuR>
>>> > TsUw at mail.gmail.com/ <http://TsUw@mail.gmail.com/>
>>> >
>>> > > I agree, though I'd argue "LAST" and "MAX" semantics are a bit
>>> different. "LAST"
>>> > delimits the known enumeration territory while "MAX" is more of a
>>> `constexpr`
>>> > value type.
>>> > > >
>>> > > > Another issue is when an existing enum meaning changes: from the
>>> > > > application pov, the (old) MAX value is incorrect, but for the
>>> > > > library pov, a new meaning has been associated.
>>> > > > This may trigger bugs in the application when calling a function
>>> > > > that returns such an enum which never returned this MAX value in
>>> the past.
>>> > > >
>>> > > > For at least those two reasons, removing those canary elements is
>>> > > > being done in DPDK.
>>> > > >
>>> > > > This specific removal has been announced:
>>> > > >
>>> https://patchwork.dpdk.org/project/dpdk/patch/20230919140430.3251493
>>> <https://patchwork.dpdk.org/project/dpdk/patch/20230919140430.3251493>
>>> > > > -1-david.marchand at redhat.com/
>>> <http://1-david.marchand@redhat.com/>
>>> > > Thanks for pointing this out but did you mean to link to the
>>> patch again here?
>>> >
>>> > Sorry, same here, bad copy/paste :-(.
>>> >
>>> > The intended link is:
>>> https://git.dpdk.org/dpdk/commit/?id=5da7c13521
>>> <https://git.dpdk.org/dpdk/commit/?id=5da7c13521>
>>> > The deprecation notice was badly formulated and this patch here is
>>> consistent with
>>> > it.
>>> >
>>> >
>>> > > >
>>> > > > Now, practically, when I look at the cpuflags API, I don't see us
>>> > > > exposed to those two issues wrt rte_cpu_flag_t, so maybe this
>>> change
>>> > > > is unneeded.
>>> > > > But on the other hand, is it really an issue for an application to
>>> > > > lose this (internal) information?
>>> > > I doubt it, maybe it could be used as a sanity check for
>>> choosing proper functors
>>> > in the application. Though the initial description of the reason
>>> behind this patch was
>>> > to not break the ABI and I don't think it does that. What it does
>>> is force users to
>>> > use explicit cpu flag values which is a good thing. Though if so,
>>> then it should be
>>> > stated in the commit description.
>>> >
>>> > I agree.
>>> > Siva, can you work on a new revision?
>>> >
>>> David, Stanislaw,
>>>
>>> The original motivation of this patch was to avoid ABI breakage with
>>> the introduction of new CPU flag
>>> "RTE_CPUFLAG_MONITORX"
>>> (http://mails.dpdk.org/archives/test-report/2023-April/382489.html
>>> <http://mails.dpdk.org/archives/test-report/2023-April/382489.html>).
>>>
>>> Because of ABI breakage, the feature was postponed to this release.
>>> https://patchwork.dpdk.org/project/dpdk/patch/20230413115334.43172-3-sivaprasad.tummala@amd.com/ <https://patchwork.dpdk.org/project/dpdk/patch/20230413115334.43172-3-sivaprasad.tummala@amd.com/>
>>>
>>> This test is flawed, the reason being that NUMFLAGS should not be
>>> treated as a flag value but as a canary, and the test does not
>>> take this into account.
>>>
>>
>> Hi Stanislaw,
>>
>> Why is the test flawed?
>>
>> The enum is in the public header, and so is the 'RTE_CPUFLAG_NUMFLAGS' enum
>> item, and there are APIs using the enum, so the enum is exchanged between
>> the shared library and the application.
> In a similar way lots of Linux uapi headers contain bits that should
> not be used directly, even though they are defined there. The reason
> for that is the C language syntax, not necessarily the intent of a
> developer.
> Since NUMFLAGS was a canary to make the flag handling code easier, it
> should not be treated as a "real" value and hence my suggestion of a
> flawed test. That said, NUMFLAGS does not bring enough value to not
> remove it. :)
>
Both: it doesn't bring enough value to hang on to, and we have no control
over how the application uses it once it is exposed by the library.
>>
>> A similar thing was discussed before: when an enum is exchanged between
>> the application and a shared library, there is an ABI breakage risk when
>> the enum is extended, and the general tendency is to eliminate the MAX
>> value to reduce the risk.
> Agreed, though as I have mentioned before, "MAX" has different
> semantics than "NUM". Then again, since we have rte_cpu_feature_table,
> we can use RTE_DIM to check the user input.
>
I think their usage and the intention behind them are the same. Can you
please elaborate on the difference between MAX and NUM enum items that
are added as the last item in an enum?
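
For reference, the table-size check suggested above could look roughly like
the sketch below. This is only an illustration: 'feature_table', 'DIM' and
'flag_name' are made-up names that stand in for rte_cpu_feature_table,
RTE_DIM and the flag-name lookup. It is not the actual EAL code.

```c
#include <stddef.h>

/* Same idea as DPDK's RTE_DIM: number of elements in an array. */
#define DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the library's internal feature table. */
struct feature_entry {
	const char *name;
};

static const struct feature_entry feature_table[] = {
	{ "SSE" }, { "AVX" }, { "MONITORX" },
};

/*
 * Validate the caller's value against the table size instead of against
 * a NUMFLAGS canary in the public enum. Returns NULL for unknown values.
 */
static const char *flag_name(unsigned int flag)
{
	if (flag >= DIM(feature_table))
		return NULL;
	return feature_table[flag].name;
}
```

With this shape the bound lives in the library and tracks the table, so
nothing in the public header has to advertise the number of flags.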
>>
>>
>> When an enum value is sent from the library to the application, it is
>> clearer that this can cause an ABI breakage, because the application can
>> receive a value that it is not aware of at build time, which can cause
>> unexpected behavior.
>> Simply think about a case where the application allocated an array of
>> 'RTE_CPUFLAG_NUMFLAGS' size and directly accesses the array with an index
>> based on the returned enum item value: if the enum is extended in the new
>> version of the shared library, this can cause invalid memory access in
>> the application.
> Using the NUM enum element (which serves as a last item canary) to
> size an array is not a good idea unless it's returned from a runtime
> call. Otherwise one hits issues that you've described.
>
I agree :), but it is a way to illustrate how this can become a problem.
Also, last time I argued something similar to what you said, that the
application should check against the MAX value before using it, but I was
told not to assume what the application does. My takeaway is: as a
library-side developer, expect the worst from the application.
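
To make the array case concrete, here is a minimal sketch. All names are
hypothetical and only mimic an application built against older headers;
none of this is DPDK code.

```c
#include <stddef.h>

/* Flags as seen by an application built against the OLD headers. */
enum app_cpu_flag {
	APP_CPUFLAG_A = 0,
	APP_CPUFLAG_B,
	APP_CPUFLAG_C,
	APP_CPUFLAG_NUMFLAGS /* canary: 3 at application build time */
};

/* Value a NEWER shared library could return after a flag was appended. */
#define LIB_NEW_CPUFLAG_D 3

/* Array sized with the canary, frozen at application build time. */
static unsigned int hits[APP_CPUFLAG_NUMFLAGS];

/*
 * Defensive variant: reject values this build does not know about.
 * Returns 0 on success, -1 for an out-of-range value. Without the
 * check, hits[LIB_NEW_CPUFLAG_D] would be an out-of-bounds write.
 */
static int count_flag(int flag)
{
	if (flag < 0 || (size_t)flag >= sizeof(hits) / sizeof(hits[0]))
		return -1;
	hits[flag]++;
	return 0;
}
```

The check is what a careful application would do, but as a library we
cannot rely on every application having it.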
>>
>> When an enum value is sent from the application to the library, I am not
>> quite sure how problematic it is, to be honest. That is the case for the
>> 'rte_cpu_get_flag_enabled()' & 'rte_cpu_get_flag_name()' APIs in question.
>> Only when the application sends 'RTE_CPUFLAG_NUMFLAGS' to
>> 'rte_cpu_get_flag_name()' does it expect NULL to be returned, but this
>> won't happen in the new version of the shared library; I am not sure if
>> this can cause any problem for the application.
>> But as I mentioned, the general guidance is to eliminate this kind of MAX
>> enum value usage.
>>
>>
>> And for this specific issue, although it is not clear whether the usage
>> of the enum in the 'rte_cpu_get_flag_enabled()' &
>> 'rte_cpu_get_flag_name()' APIs causes an ABI breakage, the enum being
>> embedded into the 'struct rte_bbdev_driver_info' struct leaves no
>> question, since this struct is returned from the library to the
>> application and a change in the enum causes an ABI breakage.
> Enum size does not change irrespective of changing its values. So
> size-wise it's not an ABI breakage. Re-ordering values is an ABI
> breakage.
>
Agreed, it is not a size-wise issue. But it is still an issue.
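
A small sketch of what I mean, with hypothetical names ('flag_v1' is the
application's view, 'flag_v2' is the newer library's view): the struct
layout is the same, but the value that used to be the canary now carries a
different meaning.

```c
/* Application view (old headers). */
enum flag_v1 { V1_FLAG_A, V1_FLAG_B, V1_NUMFLAGS };

/* Library view (new headers): one flag appended before the canary. */
enum flag_v2 { V2_FLAG_A, V2_FLAG_B, V2_FLAG_C, V2_NUMFLAGS };

/* Hypothetical structs returned from the library to the application. */
struct info_v1 { enum flag_v1 flag; };
struct info_v2 { enum flag_v2 flag; };

/* How the OLD application interprets a raw value it receives. */
static const char *app_describe(int raw)
{
	switch (raw) {
	case V1_FLAG_A:
		return "A";
	case V1_FLAG_B:
		return "B";
	case V1_NUMFLAGS:
		return "end-of-list"; /* old meaning of value 2 */
	default:
		return "unknown";
	}
}
```

Here the new library's V2_FLAG_C has the same numeric value as the old
V1_NUMFLAGS, so the old application would read a real flag as the canary.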
>>
>>
>> Briefly, I think even appending to the end of 'enum rte_cpu_flag_t'
>> causes an ABI breakage, and removing 'RTE_CPUFLAG_NUMFLAGS' helps to
>> extend this enum in the future.
>> And an outstanding deprecation notice already exists for this:
>> https://git.dpdk.org/dpdk/tree/doc/guides/rel_notes/deprecation.rst?h=v23.07#n63
>>
>>
>>> Your change did not break the ABI because you have properly added the
>>> new flag at the end.
>>> So I would ask to change the commit description to mention that NUMFLAGS
>>> is removed to:
>>> 1. Prevent users from treating it as a usable value or an array size.
>>> 2. Prevent false-positive failures in the ABI test.
>>>
>>> Also it would be good to link to the aforementioned ABI test failure to
>>> give readers some context when inspecting the git tree.
>>>
>>>
>>>
>>> Can you please add what exactly needs to be reworked in the new version?
>>>
>>> >
>>> > Thanks.
>>> >
>>> > --
>>> > David Marchand
>>>
>>
>
>