[dpdk-dev,v2,3/4] net/mlx: version rdma-core glue libraries

Message ID 20180202164050.13017-4-adrien.mazarguil@6wind.com (mailing list archive)
State Accepted, archived
Delegated to: Ferruh Yigit

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Adrien Mazarguil Feb. 2, 2018, 4:46 p.m. UTC
When built as separate objects, these libraries do not have unique names.
Since they do not maintain a stable ABI, loading an incompatible library
may result in a crash (e.g. in case multiple versions are installed).

This patch addresses the above by versioning glue libraries, both on the
file system (version suffix) and by comparing a dedicated version field
member in glue structures.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/Makefile    | 8 ++++++--
 drivers/net/mlx4/mlx4.c      | 5 +++++
 drivers/net/mlx4/mlx4_glue.c | 1 +
 drivers/net/mlx4/mlx4_glue.h | 6 ++++++
 drivers/net/mlx5/Makefile    | 8 ++++++--
 drivers/net/mlx5/mlx5.c      | 5 +++++
 drivers/net/mlx5/mlx5_glue.c | 1 +
 drivers/net/mlx5/mlx5_glue.h | 6 ++++++
 8 files changed, 36 insertions(+), 4 deletions(-)
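
As a rough illustration of the "dedicated version field" mentioned in the commit message, the check on the PMD side can be sketched as below. This is only a sketch under assumptions: `struct glue`, `GLUE_VERSION` and `glue_version_matches()` are hypothetical names chosen for the example, and the real glue structures also carry the function pointers wrapping rdma-core calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Version the PMD was built against (illustrative value). */
#define GLUE_VERSION "18.02.0"

/* Hypothetical, trimmed-down glue structure: only the version field is
 * shown; the wrappers around rdma-core calls are left out. */
struct glue {
	const char *version;
};

/* Return true when the loaded glue object reports the expected version. */
static bool
glue_version_matches(const struct glue *glue)
{
	assert(glue != NULL);
	return glue->version != NULL &&
	       strcmp(glue->version, GLUE_VERSION) == 0;
}
```

A caller would typically refuse to use the glue object, and report an error, when this function returns false.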
  

Comments

Thomas Monjalon Feb. 4, 2018, 2:29 p.m. UTC | #1
02/02/2018 17:46, Adrien Mazarguil:
> --- a/drivers/net/mlx4/Makefile
> +++ b/drivers/net/mlx4/Makefile
> @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
>  
>  # Library name.
>  LIB = librte_pmd_mlx4.a
> -LIB_GLUE = librte_pmd_mlx4_glue.so
> +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> +LIB_GLUE_VERSION = 18.02.1

You should use the version number of the release, i.e. 18.02.0
Ideally, you should retrieve it from rte_version.h.
  
Adrien Mazarguil Feb. 5, 2018, 11:24 a.m. UTC | #2
On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> 02/02/2018 17:46, Adrien Mazarguil:
> > --- a/drivers/net/mlx4/Makefile
> > +++ b/drivers/net/mlx4/Makefile
> > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> >  
> >  # Library name.
> >  LIB = librte_pmd_mlx4.a
> > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > +LIB_GLUE_VERSION = 18.02.1
> 
> You should use the version number of the release, i.e. 18.02.0
> Ideally, you should retrieve it from rte_version.h.

Keep in mind this only needs to be updated when the glue API gets modified,
and this "18.02.1" string may remain unmodified for subsequent DPDK
releases, probably as long as the PMD doesn't use any new rdma-core calls.

We've already backported this patch to 17.02 and 17.11, both requiring
different sets of Verbs calls and thus a different version, hence the added
"18.02" as a starting point. The last digit may have to be modified possibly
several times between official DPDK releases while work is being done on the
PMD (i.e. per commit).

In short it's not meant to follow DPDK's public versioning scheme. If you
really think it should, doing so will make things more complex in the
Makefile, which will have to parse rte_version.h. What's your opinion?
  
Marcelo Ricardo Leitner Feb. 5, 2018, 12:13 p.m. UTC | #3
On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > 02/02/2018 17:46, Adrien Mazarguil:
> > > --- a/drivers/net/mlx4/Makefile
> > > +++ b/drivers/net/mlx4/Makefile
> > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > >  
> > >  # Library name.
> > >  LIB = librte_pmd_mlx4.a
> > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > +LIB_GLUE_VERSION = 18.02.1
> > 
> > You should use the version number of the release, i.e. 18.02.0
> > Ideally, you should retrieve it from rte_version.h.
> 
> Keep in mind this only needs to be updated when the glue API gets modified,
> and this "18.02.1" string may remain unmodified for subsequent DPDK
> releases, probably as long as the PMD doesn't use any new rdma-core calls.
> 
> We've already backported this patch to 17.02 and 17.11, both requiring
> different sets of Verbs calls and thus a different version, hence the added
> "18.02" as a starting point. The last digit may have to be modified possibly
> several times between official DPDK releases while work is being done on the
> PMD (i.e. per commit).
> 
> In short it's not meant to follow DPDK's public versioning scheme. If you
> really think it should, doing so will make things more complex in the
> Makefile, which will have to parse rte_version.h. What's your opinion?

What about appending date +%s output to it? It would be stricter and
automated.

  Marcelo
  
Van Haaren, Harry Feb. 5, 2018, 12:24 p.m. UTC | #4
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> Sent: Monday, February 5, 2018 12:14 PM
> To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> libraries
> 
> On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > --- a/drivers/net/mlx4/Makefile
> > > > +++ b/drivers/net/mlx4/Makefile
> > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > >
> > > >  # Library name.
> > > >  LIB = librte_pmd_mlx4.a
> > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > +LIB_GLUE_VERSION = 18.02.1
> > >
> > > You should use the version number of the release, i.e. 18.02.0
> > > Ideally, you should retrieve it from rte_version.h.
> >
> > Keep in mind this only needs to be updated when the glue API gets
> modified,
> > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> >
> > We've already backported this patch to 17.02 and 17.11, both requiring
> > different sets of Verbs calls and thus a different version, hence the
> added
> > "18.02" as a starting point. The last digit may have to be modified
> possibly
> > several times between official DPDK releases while work is being done on
> the
> > PMD (i.e. per commit).
> >
> > In short it's not meant to follow DPDK's public versioning scheme. If you
> > really think it should, doing so will make things more complex in the
> > Makefile, which will have to parse rte_version.h. What's your opinion?
> 
> What about appending date +%s output to it? It would be stricter and
> automated.

Adding the current timestamp or date to a build breaks build reproducibility,
so it is generally not recommended.

No opinion on string/version naming here.
  
Marcelo Ricardo Leitner Feb. 5, 2018, 12:43 p.m. UTC | #5
On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > Sent: Monday, February 5, 2018 12:14 PM
> > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > libraries
> > 
> > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > --- a/drivers/net/mlx4/Makefile
> > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > >
> > > > >  # Library name.
> > > > >  LIB = librte_pmd_mlx4.a
> > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > +LIB_GLUE_VERSION = 18.02.1
> > > >
> > > > You should use the version number of the release, i.e. 18.02.0
> > > > Ideally, you should retrieve it from rte_version.h.
> > >
> > > Keep in mind this only needs to be updated when the glue API gets
> > modified,
> > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > >
> > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > different sets of Verbs calls and thus a different version, hence the
> > added
> > > "18.02" as a starting point. The last digit may have to be modified
> > possibly
> > > several times between official DPDK releases while work is being done on
> > the
> > > PMD (i.e. per commit).
> > >
> > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > really think it should, doing so will make things more complex in the
> > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > 
> > What about appending date +%s output to it? It would be stricter and
> > automated.
> 
> Adding current timestamp or date into a build breaks reproducibility of builds, so is
> generally not recommended.

Good point.

> 
> No opinion on string/version naming here.
>
  
Marcelo Ricardo Leitner Feb. 5, 2018, 12:58 p.m. UTC | #6
On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > Sent: Monday, February 5, 2018 12:14 PM
> > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > libraries
> > 
> > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > --- a/drivers/net/mlx4/Makefile
> > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > >
> > > > >  # Library name.
> > > > >  LIB = librte_pmd_mlx4.a
> > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > +LIB_GLUE_VERSION = 18.02.1
> > > >
> > > > You should use the version number of the release, i.e. 18.02.0
> > > > Ideally, you should retrieve it from rte_version.h.
> > >
> > > Keep in mind this only needs to be updated when the glue API gets
> > modified,
> > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > >
> > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > different sets of Verbs calls and thus a different version, hence the
> > added
> > > "18.02" as a starting point. The last digit may have to be modified
> > possibly
> > > several times between official DPDK releases while work is being done on
> > the
> > > PMD (i.e. per commit).
> > >
> > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > really think it should, doing so will make things more complex in the
> > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > 
> > What about appending date +%s output to it? It would be stricter and
> > automated.
> 
> Adding current timestamp or date into a build breaks reproducibility of builds, so is
> generally not recommended.

Then use the sha1sum of mlx4_glue.h.
With this, the size check I mentioned on the other patch would become
redundant.

> 
> No opinion on string/version naming here.
>
  
Adrien Mazarguil Feb. 5, 2018, 1:44 p.m. UTC | #7
On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > Sent: Monday, February 5, 2018 12:14 PM
> > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > libraries
> > > 
> > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > >
> > > > > >  # Library name.
> > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > >
> > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > Ideally, you should retrieve it from rte_version.h.
> > > >
> > > > Keep in mind this only needs to be updated when the glue API gets
> > > modified,
> > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > >
> > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > different sets of Verbs calls and thus a different version, hence the
> > > added
> > > > "18.02" as a starting point. The last digit may have to be modified
> > > possibly
> > > > several times between official DPDK releases while work is being done on
> > > the
> > > > PMD (i.e. per commit).
> > > >
> > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > really think it should, doing so will make things more complex in the
> > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > 
> > > What about appending date +%s output to it? It would be stricter and
> > > automated.
> > 
> > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > generally not recommended.
> 
> Then the sha1sum of mlx4_glue.h.
> With this the size check I mentioned on the other patch would become
> redundant and unnecessary.

Using a strong hash algorithm to version a library/symbol, while possible,
seems a bit overkill and results in ugliness:

 librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3

Using a weak one like CRC32 for a shorter name poses a risk of
collision. Moreover, the next time someone updates all version notices or
modifies a comment, that hash will change. We'd need to isolate the
symbol definitions themselves and ignore parameter names in function
prototypes; only then might we get a somewhat meaningful hash describing a
given ABI.

Given the added complexity, is there really a problem with simple version
numbers we increment every time something gets modified? (Note this is
already how our .map files work, they're not generated automatically)

How about keeping things as is?
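
To make the objection to content hashes concrete, here is a small sketch showing that a comment-only change in a header alters its hash even though the ABI is untouched. It uses 32-bit FNV-1a purely as a short stand-in for sha1sum, and the two header snippets are hypothetical, not taken from mlx4_glue.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a NUL-terminated string (stand-in for sha1sum). */
static uint32_t
fnv1a(const char *s)
{
	uint32_t h = 2166136261u; /* FNV offset basis */

	assert(s != NULL);
	for (; *s != '\0'; ++s) {
		h ^= (unsigned char)*s;
		h *= 16777619u; /* FNV prime */
	}
	return h;
}

/* Two hypothetical header snippets declaring the same function pointer;
 * they differ in exactly one byte of a comment. */
static const char header_v1[] = "/* v1 */ int (*fork_init)(void);";
static const char header_v2[] = "/* v2 */ int (*fork_init)(void);";
```

Comparing `fnv1a(header_v1)` with `fnv1a(header_v2)` yields two different values, so a file name derived from such a hash would change on a purely cosmetic edit.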
  
Thomas Monjalon Feb. 5, 2018, 2:16 p.m. UTC | #8
05/02/2018 14:44, Adrien Mazarguil:
> On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > libraries
> > > > 
> > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > >
> > > > > > >  # Library name.
> > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > >
> > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > >
> > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > modified,
> > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > >
> > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > different sets of Verbs calls and thus a different version, hence the
> > > > added
> > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > possibly
> > > > > several times between official DPDK releases while work is being done on
> > > > the
> > > > > PMD (i.e. per commit).
> > > > >
> > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > really think it should, doing so will make things more complex in the
> > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > 
> > > > What about appending date +%s output to it? It would be stricter and
> > > > automated.
> > > 
> > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > generally not recommended.
> > 
> > Then the sha1sum of mlx4_glue.h.
> > With this the size check I mentioned on the other patch would become
> > redundant and unnecessary.
> 
> Using a strong hash algorithm to version a library/symbol, while possible,
> seems a bit overkill and results in ugliness:
> 
>  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> 
> Using a weak one like CRC32 for a shorter name poses a risk of
> collision. Moreover the next time someone decides to update all version
> notices or modify a comment will impact that hash. We'd need to isolate the
> symbol definition itself, ignore parameter names in function prototypes and
> only then we may get a somewhat meaningful hash describing a given ABI.
> 
> Given the added complexity, is there really a problem with simple version
> numbers we increment every time something gets modified? (Note this is
> already how our .map files work, they're not generated automatically)

Our map files show the major version where a symbol was introduced.
It is simple because no symbol can be introduced in a minor version.

> How about keeping things as is?

You are using 18.02.1 while it is introduced in 18.02.0.
If you don't want to correlate the .so version number with the DPDK version
number, maybe 1, 2, 3 would be a simpler (less confusing) choice.
  
Adrien Mazarguil Feb. 5, 2018, 2:33 p.m. UTC | #9
On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> 05/02/2018 14:44, Adrien Mazarguil:
> > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > libraries
> > > > > 
> > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > >
> > > > > > > >  # Library name.
> > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > >
> > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > >
> > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > modified,
> > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > >
> > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > added
> > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > possibly
> > > > > > several times between official DPDK releases while work is being done on
> > > > > the
> > > > > > PMD (i.e. per commit).
> > > > > >
> > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > really think it should, doing so will make things more complex in the
> > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > 
> > > > > What about appending date +%s output to it? It would be stricter and
> > > > > automated.
> > > > 
> > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > generally not recommended.
> > > 
> > > Then the sha1sum of mlx4_glue.h.
> > > With this the size check I mentioned on the other patch would become
> > > redundant and unnecessary.
> > 
> > Using a strong hash algorithm to version a library/symbol, while possible,
> > seems a bit overkill and results in ugliness:
> > 
> >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> > 
> > Using a weak one like CRC32 for a shorter name poses a risk of
> > collision. Moreover the next time someone decides to update all version
> > notices or modify a comment will impact that hash. We'd need to isolate the
> > symbol definition itself, ignore parameter names in function prototypes and
> > only then we may get a somewhat meaningful hash describing a given ABI.
> > 
> > Given the added complexity, is there really a problem with simple version
> > numbers we increment every time something gets modified? (Note this is
> > already how our .map files work, they're not generated automatically)
> 
> Our map files show the major version where a symbol was introduced.
> It is simple because no symbol can be introduced in a minor version.
> 
> > How about keeping things as is?
> 
> You are using 18.02.1 while it is introduced in 18.02.0.
> If you don't want to correlate the .so version number with DPDK version
> number, maybe that 1, 2, 3 would be a simpler choice (less confusing).

I don't really care as long as there's no confusion with their backported
counterparts (namely 17.11 and 17.02). I understand the possible confusion
for someone who'd grep the sources though.

If 18.02.0 is OK in everyone's opinion, let's use that. It satisfies the
uniqueness requirement. We'll add a digit or find some other versioning
scheme later if necessary.

Shahaf, can you make a minor adjustment while applying this series?

Both drivers/net/mlx4/Makefile and drivers/net/mlx5/Makefile need to be
modified as follows in patch 3/4:

 -LIB_GLUE_VERSION = 18.02.1
 +LIB_GLUE_VERSION = 18.02.0
  
Marcelo Ricardo Leitner Feb. 5, 2018, 2:37 p.m. UTC | #10
On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> 05/02/2018 14:44, Adrien Mazarguil:
> > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > libraries
> > > > > 
> > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > >
> > > > > > > >  # Library name.
> > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > >
> > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > >
> > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > modified,
> > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > >
> > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > added
> > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > possibly
> > > > > > several times between official DPDK releases while work is being done on
> > > > > the
> > > > > > PMD (i.e. per commit).
> > > > > >
> > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > really think it should, doing so will make things more complex in the
> > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > 
> > > > > What about appending date +%s output to it? It would be stricter and
> > > > > automated.
> > > > 
> > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > generally not recommended.
> > > 
> > > Then the sha1sum of mlx4_glue.h.
> > > With this the size check I mentioned on the other patch would become
> > > redundant and unnecessary.
> > 
> > Using a strong hash algorithm to version a library/symbol, while possible,
> > seems a bit overkill and results in ugliness:
> > 
> >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3

Ugh yes, but it wouldn't need to be that visible. A pointer in
mlx*_glue and a define in the PMD would already be enough, i.e. an
extended check on top of the versioning.

> > 
> > Using a weak one like CRC32 for a shorter name poses a risk of
> > collision. Moreover the next time someone decides to update all version
> > notices or modify a comment will impact that hash. We'd need to isolate the
> > symbol definition itself, ignore parameter names in function prototypes and
> > only then we may get a somewhat meaningful hash describing a given ABI.

That's what I meant by stricter. Yes, it would catch such
situations, but you tell me how much we want to protect/restrict
here. Do you see a reason for building only the DPDK/PMD side and not
the glue library at the same time?

> > 
> > Given the added complexity, is there really a problem with simple version
> > numbers we increment every time something gets modified? (Note this is
> > already how our .map files work, they're not generated automatically)
> 
> Our map files show the major version where a symbol was introduced.
> It is simple because no symbol can be introduced in a minor version.
> 
> > How about keeping things as is?

I don't really see the need for unique filenames. The next patch is
already leveraging RTE_EAL_PMD_PATH, which if versioned should be
enough for this, no?

> 
> You are using 18.02.1 while it is introduced in 18.02.0.
> If you don't want to correlate the .so version number with DPDK version
> number, maybe that 1, 2, 3 would be a simpler choice (less confusing).

+1

  Marcelo
  
Adrien Mazarguil Feb. 5, 2018, 2:59 p.m. UTC | #11
On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > 05/02/2018 14:44, Adrien Mazarguil:
> > > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > > libraries
> > > > > > 
> > > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > >
> > > > > > > > >  # Library name.
> > > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > > >
> > > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > > >
> > > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > > modified,
> > > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > > >
> > > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > > added
> > > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > > possibly
> > > > > > > several times between official DPDK releases while work is being done on
> > > > > > the
> > > > > > > PMD (i.e. per commit).
> > > > > > >
> > > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > > really think it should, doing so will make things more complex in the
> > > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > > 
> > > > > > What about appending date +%s output to it? It would be stricter and
> > > > > > automated.
> > > > > 
> > > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > > generally not recommended.
> > > > 
> > > > Then the sha1sum of mlx4_glue.h.
> > > > With this the size check I mentioned on the other patch would become
> > > > redundant and unnecessary.
> > > 
> > > Using a strong hash algorithm to version a library/symbol, while possible,
> > > seems a bit overkill and results in ugliness:
> > > 
> > >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> 
> Ugh yes, but it wouldn't need to be that visible. A pointer on
> mlx*_glue and a define on PMD would be enough already. As in, an
> extended check to the versioning.

I thought you suggested this as a replacement. I'm not sure we need or want
to go this far. The current string comparison is really not worse than
standard symbol versioning, which doesn't check symbol properties besides
whether they are functions or other objects. We could have used C++ with
automatically mangled symbol names for that, however that again would make
things way more complex than necessary.

> > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > collision. Moreover the next time someone decides to update all version
> > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > symbol definition itself, ignore parameter names in function prototypes and
> > > only then we may get a somewhat meaningful hash describing a given ABI.
> 
> That's what I meant with stricter. Yes it would catch such
> situations, but you tell me on how much we want to protect/restrict
> here.  Do you see a reason for building only the dpdk/pmd side and not
> the glue library at a time?

No, they're always built together. We're only adding this versioning to
avoid issues when users somehow end up with several DPDK versions installed
on their system, or with leftovers of previous releases lying around. That's
all we need to solve here. dlopen()'ing the proper file takes care of that;
the symbol version number check afterward is performed just in case.
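
To illustrate the two steps, here is a minimal sketch, not the actual PMD
code. It assumes a hypothetical "struct demo_glue" whose first member is the
version string, exported by the glue library through a pointer symbol named
"demo_glue"; both names are made up for the example. The versioned file name
selects the right library, and the strcmp() afterward only catches a stale or
mislabeled file.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the glue structure: version string first. */
struct demo_glue {
	const char *version;
	int (*fork_init)(void);
};

/* Return 1 when the glue reports exactly the required version string. */
static int
demo_glue_version_ok(const struct demo_glue *glue, const char *required)
{
	return glue != NULL && glue->version != NULL &&
	       strcmp(glue->version, required) == 0;
}

/*
 * Open the versioned file, look up the (hypothetical) "demo_glue" pointer
 * symbol, then compare version strings. Returns NULL on any failure.
 */
static const struct demo_glue *
demo_glue_load(const char *file, const char *required)
{
	void *handle = dlopen(file, RTLD_LAZY);
	const struct demo_glue *const *sym;

	if (handle == NULL) {
		fprintf(stderr, "cannot load %s: %s\n", file, dlerror());
		return NULL;
	}
	sym = (const struct demo_glue *const *)dlsym(handle, "demo_glue");
	if (sym == NULL || !demo_glue_version_ok(*sym, required)) {
		fprintf(stderr, "%s: glue missing or mismatched, \"%s\" is required\n",
			file, required);
		dlclose(handle);
		return NULL;
	}
	/* The handle is deliberately left open while the glue is in use. */
	return *sym;
}
```

A caller would pass the versioned name, e.g.
demo_glue_load("libdemo_glue.so.18.02.0", "18.02.0"), and refuse to register
the driver when NULL comes back. Depending on the C library, linking may
require -ldl.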

> > > Given the added complexity, is there really a problem with simple version
> > > numbers we increment every time something gets modified? (Note this is
> > > already how our .map files work, they're not generated automatically)
> > 
> > Our map files show the major version where a symbol was introduced.
> > It is simple because no symbol can be introduced in a minor version.
> > 
> > > How about keeping things as is?
> 
> I don't really see the need of unique filenames. The next patch is
> already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> enough for this, no?

As you said, "if" versioned. Since it is an undocumented empty string by
default, there's no way to be sure. Letting the PMD version its own internal
but (unfortunately) exposed bits will certainly prevent mistakes.

> > You are using 18.02.1 while it is introduced in 18.02.0.
> > If you don't want to correlate the .so version number with DPDK version
> > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> 
> +1

Then are you fine with the "18.02.0" suffix?
  
Marcelo Ricardo Leitner Feb. 5, 2018, 3:29 p.m. UTC | #12
On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > 05/02/2018 14:44, Adrien Mazarguil:
> > > > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > > > libraries
> > > > > > > 
> > > > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > > >
> > > > > > > > > >  # Library name.
> > > > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > > > >
> > > > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > > > >
> > > > > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > > > > > modified, and this "18.02.1" string may remain unmodified for
> > > > > > > > > subsequent DPDK releases, probably as long as the PMD doesn't use any
> > > > > > > > > new rdma-core calls.
> > > > > > > > >
> > > > > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > > > > > added "18.02" as a starting point. The last digit may have to be
> > > > > > > > > modified possibly several times between official DPDK releases while
> > > > > > > > > work is being done on the PMD (i.e. per commit).
> > > > > > > >
> > > > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > > > really think it should, doing so will make things more complex in the
> > > > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > > > 
> > > > > > > What about appending date +%s output to it? It would be stricter and
> > > > > > > automated.
> > > > > > 
> > > > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > > > generally not recommended.
> > > > > 
> > > > > Then the sha1sum of mlx4_glue.h.
> > > > > With this the size check I mentioned on the other patch would become
> > > > > redundant and unnecessary.
> > > > 
> > > > Using a strong hash algorithm to version a library/symbol, while possible,
> > > > seems a bit overkill and results in ugliness:
> > > > 
> > > >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> > 
> > Ugh yes, but it wouldn't need to be that visible. A pointer on
> > mlx*_glue and a define on PMD would be enough already. As in, an
> > extended check to the versioning.
> 
> I thought you suggested this as a replacement. I'm not sure we need or want
> to go this far. The current string comparison is really not worse than
> standard symbol versioning, which doesn't check symbol properties besides
> whether they are functions or other objects. We could have used C++ with
> automatically mangled symbol names for that, however that again would make
> things way more complex than necessary.
> 
> > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > collision. Moreover the next time someone decides to update all version
> > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > 
> > That's what I meant with stricter. Yes it would catch such
> > situations, but you tell me on how much we want to protect/restrict
> > here.  Do you see a reason for building only the dpdk/pmd side and not
> > the glue library at a time?
> 
> No, they're always built together. We're only adding this versioning to
> avoid issues when users somehow end up with several DPDK versions installed
> on their system, or with leftovers of previous releases lying around. That's
> all we need to solve here. dlopen()'ing the proper file takes care of that,
> the symbol version number check afterward is performed just in case.

Interesting. These leftovers probably wouldn't be there if it wasn't
versioned in the first place. :-)

> 
> > > > Given the added complexity, is there really a problem with simple version
> > > > numbers we increment every time something gets modified? (Note this is
> > > > already how our .map files work, they're not generated automatically)
> > > 
> > > Our map files show the major version where a symbol was introduced.
> > > It is simple because no symbol can be introduced in a minor version.
> > > 
> > > > How about keeping things as is?
> > 
> > I don't really see the need of unique filenames. The next patch is
> > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > enough for this, no?
> 
> As you said, "if" versioned. As an undocumented empty string by default,
> there's no way to be sure. Leaving the PMD version its internal but
> (unfortunately) exposed bits will certainly prevent mistakes.
> 
> > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > If you don't want to correlate the .so version number with DPDK version
> > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > 
> > +1
> 
> Then are you fine with the "18.02.0" suffix?

Not really, sorry. My +1 was for either the "1, 2, 3" sequence or tying
it to the DPDK version.

With the latest replies, I don't think the reasoning is enough to
justify these extra checks, but I won't oppose including them.

  Marcelo
  
Adrien Mazarguil Feb. 5, 2018, 3:54 p.m. UTC | #13
On Mon, Feb 05, 2018 at 01:29:42PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> > On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > > 05/02/2018 14:44, Adrien Mazarguil:
> > > > > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > > > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > > > > libraries
> > > > > > > > 
> > > > > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > > > >
> > > > > > > > > > >  # Library name.
> > > > > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > > > > >
> > > > > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > > > > >
> > > > > > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > > > > > > modified, and this "18.02.1" string may remain unmodified for
> > > > > > > > > > subsequent DPDK releases, probably as long as the PMD doesn't use any
> > > > > > > > > > new rdma-core calls.
> > > > > > > > > >
> > > > > > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > > > > > > added "18.02" as a starting point. The last digit may have to be
> > > > > > > > > > modified possibly several times between official DPDK releases while
> > > > > > > > > > work is being done on the PMD (i.e. per commit).
> > > > > > > > >
> > > > > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > > > > really think it should, doing so will make things more complex in the
> > > > > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > > > > 
> > > > > > > > What about appending date +%s output to it? It would be stricter and
> > > > > > > > automated.
> > > > > > > 
> > > > > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > > > > generally not recommended.
> > > > > > 
> > > > > > Then the sha1sum of mlx4_glue.h.
> > > > > > With this the size check I mentioned on the other patch would become
> > > > > > redundant and unnecessary.
> > > > > 
> > > > > Using a strong hash algorithm to version a library/symbol, while possible,
> > > > > seems a bit overkill and results in ugliness:
> > > > > 
> > > > >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> > > 
> > > Ugh yes, but it wouldn't need to be that visible. A pointer on
> > > mlx*_glue and a define on PMD would be enough already. As in, an
> > > extended check to the versioning.
> > 
> > I thought you suggested this as a replacement. I'm not sure we need or want
> > to go this far. The current string comparison is really not worse than
> > standard symbol versioning, which doesn't check symbol properties besides
> > whether they are functions or other objects. We could have used C++ with
> > automatically mangled symbol names for that, however that again would make
> > things way more complex than necessary.
> > 
> > > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > > collision. Moreover the next time someone decides to update all version
> > > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > > 
> > > That's what I meant with stricter. Yes it would catch such
> > > situations, but you tell me on how much we want to protect/restrict
> > > here.  Do you see a reason for building only the dpdk/pmd side and not
> > > the glue library at a time?
> > 
> > No, they're always built together. We're only adding this versioning to
> > avoid issues when users somehow end up with several DPDK versions installed
> > on their system, or with leftovers of previous releases lying around. That's
> > all we need to solve here. dlopen()'ing the proper file takes care of that,
> > the symbol version number check afterward is performed just in case.
> 
> Interesting. These leftovers probably wouldn't be there if it wasn't
> versioned in the first place. :-)

Seriously, we can't assume users will do everything using neat packages; they
may run an unfortunate "make install" from the DPDK source tree without
noticing they wrecked their system. Someone will have to mop up the ensuing
but preventable bug reports.

> > > > > Given the added complexity, is there really a problem with simple version
> > > > > numbers we increment every time something gets modified? (Note this is
> > > > > already how our .map files work, they're not generated automatically)
> > > > 
> > > > Our map files show the major version where a symbol was introduced.
> > > > It is simple because no symbol can be introduced in a minor version.
> > > > 
> > > > > How about keeping things as is?
> > > 
> > > I don't really see the need of unique filenames. The next patch is
> > > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > > enough for this, no?
> > 
> > As you said, "if" versioned. As an undocumented empty string by default,
> > there's no way to be sure. Leaving the PMD version its internal but
> > (unfortunately) exposed bits will certainly prevent mistakes.
> > 
> > > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > > If you don't want to correlate the .so version number with DPDK version
> > > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > > 
> > > +1
> > 
> > Then are you fine with the "18.02.0" suffix?
> 
> Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
> to dpdk version.
> 
> With the latest replies, I don't think the reasoning is enough to
> justify these extra checks, but I won't oppose to including it.

18.02.0 makes it tied to the current release number, so I guess we agree.
The idea for now is this part remains tied to the DPDK release.

If a new ABI version is needed in a subsequent commit, the initial part gets
bumped to the current WIP DPDK release (say, 42.02.0). If subsequent
intermediate commits break the glue ABI, a fourth digit is added
(e.g. 42.02.0.1).

This role is currently held by the third digit but since there's a confusion
with DPDK revisions, it won't be used internally by the PMD. Hopefully this
fourth digit will remain unused (otherwise I can add as many digits as
necessary to make it acceptable, I'll then re-consider the SHA1 idea :)
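
As a rough sketch of the bumping rule described above (my reading of it, not
code from the series), the hypothetical helper below computes the version to
use after an ABI-breaking commit. The function name and the choice to
increment the fourth component on repeated breaks are assumptions made for
the example.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Compute the glue version to use after an ABI-breaking commit.
 * current: version before the commit, e.g. "18.02.0" or "42.02.0.1".
 * wip:     DPDK release under development, e.g. "42.02.0".
 * Returns 0 on success, -1 if out is too small.
 */
static int
next_glue_version(const char *current, const char *wip, char *out, size_t len)
{
	size_t n = strlen(wip);
	int ret;

	if (strncmp(current, wip, n) != 0 ||
	    (current[n] != '\0' && current[n] != '.')) {
		/* First break in this release cycle: adopt the release number. */
		ret = snprintf(out, len, "%s", wip);
	} else {
		/* Later break in the same cycle: add or bump the fourth component. */
		long extra = current[n] == '.' ?
			     strtol(current + n + 1, NULL, 10) : 0;

		ret = snprintf(out, len, "%s.%ld", wip, extra + 1);
	}
	return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

With this rule, "18.02.0" becomes "42.02.0" on the first break during the
42.02 cycle, and a second break in the same cycle yields "42.02.0.1".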
  
Marcelo Ricardo Leitner Feb. 5, 2018, 5:06 p.m. UTC | #14
On Mon, Feb 05, 2018 at 04:54:55PM +0100, Adrien Mazarguil wrote:
> On Mon, Feb 05, 2018 at 01:29:42PM -0200, Marcelo Ricardo Leitner wrote:
> > On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> > > On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > > > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > > > 05/02/2018 14:44, Adrien Mazarguil:
...
> > > > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > > > collision. Moreover the next time someone decides to update all version
> > > > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > > > 
> > > > That's what I meant with stricter. Yes it would catch such
> > > > situations, but you tell me on how much we want to protect/restrict
> > > > here.  Do you see a reason for building only the dpdk/pmd side and not
> > > > the glue library at a time?
> > > 
> > > No, they're always built together. We're only adding this versioning to
> > > avoid issues when users somehow end up with several DPDK versions installed
> > > on their system, or with leftovers of previous releases lying around. That's
> > > all we need to solve here. dlopen()'ing the proper file takes care of that,
> > > the symbol version number check afterward is performed just in case.
> > 
> > Interesting. These leftovers probably wouldn't be there if it wasn't
> > versioned in the first place. :-)
> 
> Seriously, we can't assume users will do everything using neat packages and
> may run an unfortunate "make install" from the DPDK source tree without
> noticing they wrecked their system. Someone will have to mop the ensuing but
> preventable bug reports.
> 
> > > > > > Given the added complexity, is there really a problem with simple version
> > > > > > numbers we increment every time something gets modified? (Note this is
> > > > > > already how our .map files work, they're not generated automatically)
> > > > > 
> > > > > Our map files show the major version where a symbol was introduced.
> > > > > It is simple because no symbol can be introduced in a minor version.
> > > > > 
> > > > > > How about keeping things as is?
> > > > 
> > > > I don't really see the need of unique filenames. The next patch is
> > > > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > > > enough for this, no?
> > > 
> > > As you said, "if" versioned. As an undocumented empty string by default,
> > > there's no way to be sure. Leaving the PMD version its internal but
> > > (unfortunately) exposed bits will certainly prevent mistakes.
> > > 
> > > > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > > > If you don't want to correlate the .so version number with DPDK version
> > > > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > > > 
> > > > +1
> > > 
> > > Then are you fine with the "18.02.0" suffix?
> > 
> > Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
> > to dpdk version.
> > 
> > With the latest replies, I don't think the reasoning is enough to
> > justify these extra checks, but I won't oppose to including it.
> 
> 18.02.0 makes it tied to the current release number, so I guess we agree.

It makes them equal, but not tied. If nobody patches it, when 18.02.1
is out, the glue lib will still be 18.02.0.

> The idea for now is this part remains tied to the DPDK release.
> 
> If a new ABI version is needed in a subsequent commit, the initial part gets
> bumped to the current WIP DPDK release (say, 42.02.0). If subsequent
> intermediate commits break the glue ABI, a fourth digit is added
> (e.g. 42.02.0.1).

I'll defer this to other project developers. This is more about a
project standard than anything else. I could even argue that this glue
should be named after the PMD lib, such as
   ./usr/local/lib/librte_pmd_mlx4_glue.so.1.1
Not providing the _glue.so symlink is enough to prevent others from
linking against it. But whether this lib is seen as a plugin or as a
(private) library is more of a project standard than a technical
decision, I guess.

Considering the versioning used for the PMD libs, such easy versioning
is my preferred choice, FWIW.

> 
> This role is currently held by the third digit but since there's a confusion
> with DPDK revisions, it won't be used internally by the PMD. Hopefully this
> fourth digit will remain unused (otherwise I can add as many digits as
> necessary to make it acceptable, I'll then re-consider the SHA1 idea :)

hehe :-)

  Marcelo
  
Adrien Mazarguil Feb. 6, 2018, 11:06 a.m. UTC | #15
On Mon, Feb 05, 2018 at 03:06:19PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 04:54:55PM +0100, Adrien Mazarguil wrote:
> > On Mon, Feb 05, 2018 at 01:29:42PM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> > > > On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > > > > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > > > > 05/02/2018 14:44, Adrien Mazarguil:
> ...
> > > > > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > > > > collision. Moreover the next time someone decides to update all version
> > > > > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > > > > 
> > > > > That's what I meant with stricter. Yes it would catch such
> > > > > situations, but you tell me on how much we want to protect/restrict
> > > > > here.  Do you see a reason for building only the dpdk/pmd side and not
> > > > > the glue library at a time?
> > > > 
> > > > No, they're always built together. We're only adding this versioning to
> > > > avoid issues when users somehow end up with several DPDK versions installed
> > > > on their system, or with leftovers of previous releases lying around. That's
> > > > all we need to solve here. dlopen()'ing the proper file takes care of that,
> > > > the symbol version number check afterward is performed just in case.
> > > 
> > > Interesting. These leftovers probably wouldn't be there if it wasn't
> > > versioned in the first place. :-)
> > 
> > Seriously, we can't assume users will do everything using neat packages and
> > may run an unfortunate "make install" from the DPDK source tree without
> > noticing they wrecked their system. Someone will have to mop the ensuing but
> > preventable bug reports.
> > 
> > > > > > > Given the added complexity, is there really a problem with simple version
> > > > > > > numbers we increment every time something gets modified? (Note this is
> > > > > > > already how our .map files work, they're not generated automatically)
> > > > > > 
> > > > > > Our map files show the major version where a symbol was introduced.
> > > > > > It is simple because no symbol can be introduced in a minor version.
> > > > > > 
> > > > > > > How about keeping things as is?
> > > > > 
> > > > > I don't really see the need of unique filenames. The next patch is
> > > > > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > > > > enough for this, no?
> > > > 
> > > > As you said, "if" versioned. As an undocumented empty string by default,
> > > > there's no way to be sure. Leaving the PMD version its internal but
> > > > (unfortunately) exposed bits will certainly prevent mistakes.
> > > > 
> > > > > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > > > > If you don't want to correlate the .so version number with DPDK version
> > > > > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > > > > 
> > > > > +1
> > > > 
> > > > Then are you fine with the "18.02.0" suffix?
> > > 
> > > Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
> > > to dpdk version.
> > > 
> > > With the latest replies, I don't think the reasoning is enough to
> > > justify these extra checks, but I won't oppose to including it.
> > 
> > 18.02.0 makes it tied to the current release number, so I guess we agree.
> 
> It makes them equal, but not tied. If nobody patches it, when 18.02.1
> is out, the glue lib will still be 18.02.0.

Well this must be understood as "this plug-in implements 18.02.0's mlx4 glue
ABI", which remains true (and compatible) with subsequent DPDK releases as
long as the glue code is not updated.

Note this is no different from a single-digit suffix, which wouldn't be
updated either if the ABI isn't. Again, these initial digits are needed
because otherwise there is already a confusion with stable branches that
implement different ABIs and are therefore incompatible:

 librte_pmd_mlx4_glue.so.17.02.1
 librte_pmd_mlx4_glue.so.17.11.1
 librte_pmd_mlx4_glue.so.18.02.0

With a single digit, all of them would be named "librte_pmd_mlx4_glue.so.1",
rendering versioning basically useless.

> > The idea for now is this part remains tied to the DPDK release.
> > 
> > If a new ABI version is needed in a subsequent commit, the initial part gets
> > bumped to the current WIP DPDK release (say, 42.02.0). If subsequent
> > intermediate commits break the glue ABI, a fourth digit is added
> > (e.g. 42.02.0.1).
> 
> I'll defer this to other project developers. This is more about a
> project standard than anything here. I could even argue that this glue
> should be named after the pmd lib, such as
>    ./usr/local/lib/librte_pmd_mlx4_glue.so.1.1
> The fact of not providing the _glue.so symlink is enough to avoid
> others from linking against it. But it's more of a project standard
> than a technical decision, I guess, whether this lib is seen as a
> plugin or as a (private) library.

I think you nailed it. I call it a "plug-in" because dlopen() is manually
performed on it; however, it's in fact a private library whose API is not
exposed and which no application is supposed to use directly.

For this reason, while up to package maintainers, my suggestion is to not
install it in a public location like "/usr/local/lib" but configure
RTE_EAL_PMD_PATH to some DPDK-specific path, e.g. "/usr/share/dpdk/pmd",
which is possible since patch 4/4 of this series.
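
For illustration only, a small sketch of how a versioned glue file name and a
DPDK-specific directory could be combined into the string handed to dlopen().
The helper name and the empty-directory fallback are assumptions for this
example, not necessarily what patch 4/4 does. The fallback relies on
documented dlopen() behavior: a name without a slash is looked up through the
default library search path.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build the name passed to dlopen(): "<dir>/<base>.<version>", or just
 * "<base>.<version>" when dir is NULL or empty, in which case dlopen()
 * falls back to its default library search. Returns 0 on success, -1 if
 * buf is too small.
 */
static int
glue_path(char *buf, size_t len, const char *dir,
	  const char *base, const char *version)
{
	int ret;

	if (dir == NULL || dir[0] == '\0')
		ret = snprintf(buf, len, "%s.%s", base, version);
	else
		ret = snprintf(buf, len, "%s/%s.%s", dir, base, version);
	return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

Because the version is part of the file name, glue built from the 17.11 and
18.02 branches can sit in the same directory without shadowing each other,
which a plain ".1" suffix would not allow.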

> Considering the versioning used for the PMD libs, such easy versioning
> is my preferred choice, FWIW.

The problem remains that the DPDK project manages its own backports/stable
release system instead of relying on package maintainers for that, so
properly versioning things from the beginning to avoid collisions is always
a concern. Had backports not been a requirement in the first place,
I agree a single digit would have been enough.

My suggestion of using 18.02.0 (instead of 18.02.1) stands. It addresses
Thomas' concern by properly matching the DPDK release the ABI was last
updated for and mine for the backports issues mentioned above. Let's go with
that and move on.
  

Patch

diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index c004ac71c..cc9db9977 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -33,7 +33,9 @@  include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx4.a
-LIB_GLUE = librte_pmd_mlx4_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4.c
@@ -64,6 +66,7 @@  CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += $(WERROR_FLAGS)
 ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS),y)
 CFLAGS += -DMLX4_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX4_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx4_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -131,6 +134,7 @@  $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx4_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx4
 
 mlx4_glue.o: mlx4_autoconf.h
@@ -139,6 +143,6 @@  endif
 
 clean_mlx4: FORCE
 	$Q rm -f -- mlx4_autoconf.h mlx4_autoconf.h.new
-	$Q rm -f -- mlx4_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx4_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx4
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 201d39b6e..61a852fb9 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -808,6 +808,11 @@  rte_mlx4_pmd_init(void)
 			assert(((const void *const *)mlx4_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx4_glue->version, MLX4_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx4_glue->version, MLX4_GLUE_VERSION);
+		return;
+	}
 	mlx4_glue->fork_init();
 	rte_pci_register(&mlx4_driver);
 }
diff --git a/drivers/net/mlx4/mlx4_glue.c b/drivers/net/mlx4/mlx4_glue.c
index 47ae7ad0f..3b79d320e 100644
--- a/drivers/net/mlx4/mlx4_glue.c
+++ b/drivers/net/mlx4/mlx4_glue.c
@@ -240,6 +240,7 @@  mlx4_glue_dv_set_context_attr(struct ibv_context *context,
 }
 
 const struct mlx4_glue *mlx4_glue = &(const struct mlx4_glue){
+	.version = MLX4_GLUE_VERSION,
 	.fork_init = mlx4_glue_fork_init,
 	.get_async_event = mlx4_glue_get_async_event,
 	.ack_async_event = mlx4_glue_ack_async_event,
diff --git a/drivers/net/mlx4/mlx4_glue.h b/drivers/net/mlx4/mlx4_glue.h
index de251c622..368f906bf 100644
--- a/drivers/net/mlx4/mlx4_glue.h
+++ b/drivers/net/mlx4/mlx4_glue.h
@@ -19,7 +19,13 @@ 
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX4_GLUE_VERSION
+#define MLX4_GLUE_VERSION ""
+#endif
+
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx4_glue {
+	const char *version;
 	int (*fork_init)(void);
 	int (*get_async_event)(struct ibv_context *context,
 			       struct ibv_async_event *event);
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 4b20d718b..4086f2039 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -33,7 +33,9 @@  include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx5.a
-LIB_GLUE = librte_pmd_mlx5_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
@@ -74,6 +76,7 @@  CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -Wno-strict-prototypes
 ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
 CFLAGS += -DMLX5_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX5_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx5_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -180,6 +183,7 @@ $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx5_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx5
 
 mlx5_glue.o: mlx5_autoconf.h
@@ -188,6 +192,6 @@ endif
 
 clean_mlx5: FORCE
 	$Q rm -f -- mlx5_autoconf.h mlx5_autoconf.h.new
-	$Q rm -f -- mlx5_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx5_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx5
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 050cfac0d..341230d2b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1151,6 +1151,11 @@ rte_mlx5_pmd_init(void)
 			assert(((const void *const *)mlx5_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx5_glue->version, MLX5_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx5_glue->version, MLX5_GLUE_VERSION);
+		return;
+	}
 	mlx5_glue->fork_init();
 	rte_pci_register(&mlx5_driver);
 }
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index 8f500be6e..1c4396ada 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -308,6 +308,7 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
 }
 
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
+	.version = MLX5_GLUE_VERSION,
 	.fork_init = mlx5_glue_fork_init,
 	.alloc_pd = mlx5_glue_alloc_pd,
 	.dealloc_pd = mlx5_glue_dealloc_pd,
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 7fed302ba..b5efee3b6 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -19,6 +19,10 @@ 
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX5_GLUE_VERSION
+#define MLX5_GLUE_VERSION ""
+#endif
+
 #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
 struct ibv_counter_set;
 struct ibv_counter_set_data;
@@ -27,7 +31,9 @@ struct ibv_counter_set_init_attr;
 struct ibv_query_counter_set_attr;
 #endif
 
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx5_glue {
+	const char *version;
 	int (*fork_init)(void);
 	struct ibv_pd *(*alloc_pd)(struct ibv_context *context);
 	int (*dealloc_pd)(struct ibv_pd *pd);