[dpdk-dev] versioning and maintenance

Neil Horman nhorman at tuxdriver.com
Fri Nov 21 21:17:20 CET 2014


On Fri, Nov 21, 2014 at 02:23:15PM +0100, Thomas Monjalon wrote:
> 2014-11-20 20:05, Neil Horman:
> > On Thu, Nov 20, 2014 at 10:08:25PM +0100, Thomas Monjalon wrote:
> > > 2014-11-20 13:25, Neil Horman:
> > > > On Thu, Nov 20, 2014 at 06:09:10PM +0100, Thomas Monjalon wrote:
> > > > > 2014-11-19 10:13, Neil Horman:
> > > > > > On Wed, Nov 19, 2014 at 11:35:08AM +0000, Bruce Richardson wrote:
> > > > > > > On Wed, Nov 19, 2014 at 12:22:14PM +0100, Thomas Monjalon wrote:
> > > > > > > > Following the discussion we had with Neil during the conference call,
> > > > > > > > I suggest this plan, starting with the next release (2.0):
> > > > > > > > 	- add version numbers to libraries
> > > > > > > > 	- add version numbers to functions inside .map files
> > > > > > > > 	- create a git tree dedicated to maintenance and API compatibility
> > > > > > > > 
> > > > > > > > It means these version numbers must be incremented when breaking the API.
> > > > > > > > Though the old code paths will be maintained and tested separately by volunteers.
> > > > > > > > A mailing list for maintenance purpose could be created if needed.
> > > > > > > >
> > > > > > > Hi Thomas,
> > > > > > > 
> > > > > > > I really think that the versioning is best handled inside the main repository
> > > > > > > itself. Given that the proposed deprecation policy is over two releases i.e. an
> > > > > > > API is marked deprecated in release X and then removed in X+1, I don't see the
> > > > > > > maintaining of old code paths to be particularly onerous.
> > > > > > > 
> > > > > > > /Bruce
> > > > > > 
> > > > > > I agree with Bruce; even if it is on occasion an added workload, it's not the
> > > > > > sort of thing that can or should be placed on an alternate developer.  Backwards
> > > > > > compatibility is the sort of thing that has to be on the mind of the developer
> > > > > > when modifying an API, and on the mind of the reviewer when reviewing code.  To
> > > > > > shunt that responsibility elsewhere invites the opportunity for backwards
> > > > > > compatibility to become a second-class citizen whose goal will never be reached,
> > > > > > because developers instituting ABI changes will never care about the
> > > > > > consequences, and anyone worrying about backwards compatibility will always be
> > > > > > playing catch-up, possibly allowing ABI breaks to slip through.
> > > > > > 
> > > > > > Neil
> > > > >  
> > > > > Before taking a decision, we should detail every concern.
> > > > > 
> > > > > 1/
> > > > > Currently there is not a lot of API refactoring because DPDK is well tailored
> > > > > for x86 and Intel NICs. But we are seeing that supporting new CPUs and new NICs
> > > > > would require some adaptations.
> > > > > 
> > > > Yes, you're absolutely right here.  I acknowledged during my presentation that
> > > > this would happen occasionally, and that we would need to deal with it.
> > > > What I think you are implying here (correct me if I'm wrong) is that you would
> > > > advocate that we wait to introduce ABI versioning until after such refactoring
> > > > is, for lack of a better term, "complete".  The problem here is that software
> > > > that is growing in user base is never "complete".  What you are effectively
> > > > saying is that you want to wait until the API is in a state in which no (or
> > > > almost no) more changes are required, then freeze it.  That's quite simply never
> > > > going to happen, and if it does, it obviates the need for versioning at all.
> > > 
> > > I agree Neil. This point is not about how long we should wait but how the
> > > overhead could be estimated for coming releases.
> > > 
> > Well, I understand the desire, but I'm not sure how it can be accomplished.  For
> > a given release, the overhead will be dependent on two factors:
> > 
> > 1) The number of ABI changes in a given release
> > 
> > 2) The extent of the ABI changes that were made.
> > 
> > If we have a way to predict those, then we can estimate the overhead, but
> > without that information, you're kinda stuck.  That said, if we all concur that
> > this is a necessary effort to undertake, then the overhead is not overly
> > important.  What's more important is allotting enough time to do the work for a
> > given project.  That is to say, when undertaking a large refactoring, or
> > another project that promises to make significant ABI changes, the developer
> > needs to factor in time to design and implement backwards compatibility.  Put
> > another way, if the developer does their job right, and takes backwards
> > compatibility seriously, the overhead to you as a maintainer is nil.  The onus
> > to handle this extra effort needs to be on the developer.
> > 
> > > > > 2/
> > > > > I'm curious to know how you would handle a big change like the recent mbuf rework.
> > > > > Should we duplicate the structure and all the functions using mbuf?
> > > > 
> > > > There are several ways; what you suggest above is one, although that's what I
> > > > would consider to be the pessimal case.  Ideally such large changes are
> > > > extremely rare (a search of the git history, I think, confirms this).  Much
> > > > more common are small, limited changes to various APIs, for which providing
> > > > multiple versions of a function is a much more reasonable approach.
> > > > 
> > > > In the event that we do decide to do a refactor that is so far reaching that we
> > > > simply don't feel like multi-versioning is feasible, the recourse is then to
> > > > deprecate the old API, publish that information on the deprecation schedule,
> > > > wait for a release, then replace it wholesale.  When the API is released, we
> > > > bump the DSO version number.  Note the versioning policy never guarantees that
> > > > backwards compatibility will always be available, nor does it stipulate that a
> > > > newer version of the API is available prior to removing the old one. The goal
> > > > here is to give distributors and application vendors advanced notice of ABI
> > > > breaking changes so that they can adapt appropriately before they are caught off
> > > > guard.  If the new ABI can't be packaged alongside the old, then so be it,
> > > > downstream vendors will have to use the upstream git head to test and validate,
> > > > rather than a newer distribution release.
> > > 
> > > Seems reasonable.
> > > 
> > > > Ideally though, that shouldn't happen, because it causes downstream headaches,
> > > > and we would really like to avoid that.  That's why I feel it's so important to
> > > > keep this work in the main tree.  If we segregate it to a separate location, it
> > > > will make it all too easy for developers to ignore these needs and just assume we
> > > > constantly drop old ABI versions without providing backwards compatibility.
> > > > 
> > > > > 3/
> > > > > Should we add new fields at the end of a structure to avoid ABI breakage?
> > > > > 
> > > > In the common case yes, this usually avoids ABI breakage, though it can't always
> > > > be relied upon (e.g. cases where structures are statically allocated by an
> > > > application).  And then there are patches that attempt to reduce memory usage
> > > > and increase performance by re-arranging structures.  In those cases we need to
> > > > do ABI versioning or announce/delay/release as noted above, though again, that
> > > > should really be avoided if possible.
> > > 
> > > So there is no hope of having fields logically sorted.
> > > Not a major problem but we have to know it. And it should probably be
> > > documented if we choose this way.
> > > 
> > Sure, though I'm not sure I agree with the statement above.  Having fields
> > logically sorted seems like it should be a foregone conclusion, in that the
> > developer should have laid those fields out in some semblance of order in the
> > first place.  If a large data-structure re-ordering is taking place such that
> > structure fields are getting rearranged, that in my mind is part of a large
> > refactoring, for which the entire API affected by those data structures
> > must have a new version created to provide backward compatibility; or, in the
> > extreme case, we may need to perform a warn-and-deprecate/exchange operation as
> > noted previously, though again, that is a nuclear option.
> 
> Just to illustrate my thought:
> Let's imagine this struct {
> 	fish_name;
> 	fish_taste;
> 	vegetables_name;
> }
> When adding the new field "fish_cooking", we'll add it at the end to avoid ABI break.
> struct {
> 	fish_name;
> 	fish_taste;
> 	vegetables_name;
> 	fish_cooking;
> }
You're right, aesthetically the above is displeasing, which is unfortunate.  The
decision we have to make is twofold:
1) Are aesthetics more important than backwards compatibility?  (There's also a
corollary here, which is "are the performance/space gains achieved by
reorganizing the above structure sufficient to warrant an ABI breakage?")
and
2) How can we mitigate the ABI impact if the answer to the above is "yes"?

I think the answer to (1) is going to be situationally specific.  Generally,
aesthetics aren't worth breaking ABI, but performance gains might be, if they're
large enough.  We'll have to wait for an example case to see where exactly we
draw that line.

Assuming the answer to (1) is yes, then we need to know how to mitigate it.
Ideally what we would do is create a secondary structure like so:

struct1 {
	fish_name;
	fish_taste;
	veg_name;
};

struct2 {
	fish_name;
	fish_taste;
	fish_cook;
	veg_name;
};

and then we would version the API calls that use struct1 to instead accept
struct2. So if a function used struct1 we would have:

static void foo_v2(struct2 *a);	/* new version, defined below */

static void foo_v1(struct1 *a) {
	struct2 b;
	b.fish_name = a->fish_name;
	b.fish_taste = a->fish_taste;
	b.veg_name = a->veg_name;
	b.fish_cook = SOME_VALUE;
	foo_v2(&b);
}
VERSION_SYMBOL(foo, _v1, 1.8);

static void foo_v2(struct2 *a) {
	...
	if (a->fish_cook == SOME_VALUE)
		/* we know this call came from the old API version */
	...
}
BIND_DEFAULT_SYMBOL(foo, _v2, 1.9);

That would be repeated of course for every function that used struct1.  Then we
can decide on a deprecation schedule, which might be the next release after this
change is published.  Or it might be longer, if the decision is that this change
is easy enough to maintain.
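
For completeness: the VERSION_SYMBOL/BIND_DEFAULT_SYMBOL macros only take effect
together with a linker version script, i.e. the per-library .map files mentioned
at the start of this thread.  A hypothetical sketch of what the map file for the
library exporting foo might contain (the version-node names here are
assumptions, not actual DPDK node names):

```
DPDK_1.8 {
	global:
		foo;
	local: *;
};

DPDK_1.9 {
	global:
		foo;
} DPDK_1.8;
```

With this in place, foo_v1 is exported as foo@DPDK_1.8 and foo_v2 becomes the
default foo@@DPDK_1.9, so applications already linked against the 1.8 library
keep resolving to the old implementation while new builds pick up the new one.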

> It's mostly an esthetic/readability consequence.
> Now I'm hungry ;)
> 
I don't think I ever told you, my family and I were in Paris last summer, and
had lunch at this place just south of Notre Dame.  My wife still occasionally
mentions that as the best fish she ever had.  And my 7 year old daughter is
willing to forgo a trip to Disney World to go back :)


> > > > > 4/
> > > > > Developers contribute because they need some changes. So when breaking
> > > > > an API, their application is already ready for the new version.
> > > > > I mean the author of such a patch is probably not really motivated to keep ABI
> > > > > compatibility and duplicate the code path.
> > > > > 
> > > > What?  That doesn't make any sense.  It's our job to enforce this requirement on
> > > > developers during the review cycle.  If you don't feel like we can enforce
> > > > coding requirements on the project, we've already lost.  I agree that an
> > > > application developer submitting a patch for DPDK might not care about ABI
> > > > compatibility because they've already modified their application, but they (and
> > > > we) need to recognize that there are more than just a handful of users of the
> > > > DPDK, some of whom don't participate in this community (i.e. are simply end
> > > > users).  We need to make sure all users' needs are met.  That's the entire point
> > > > of this patch series, to make DPDK available to a wider range of users.
> > > 
> > > Exactly. To make it simple, you care about end users and I have to care about
> > > developers' motivation. But I perfectly understand the end users' needs.
> > > I'm not saying we cannot enforce coding requirements. I just think it will be
> > > less pleasant.
> > > 
> > I disagree with the assertion that you will lose developers because they don't
> > care about compatibility.  Your developer base may change.  This is no
> > different than any other requirement that you place on a developer.  You make
> > all sorts of mandates regarding development (they can't break older supported
> > CPU architectures, their code has to compile on all configurations, etc.).
> > This is no different.
> > 
> > > > > 5/
> > > > > Instead of simply modifying an API function, it would appear as a whole new
> > > > > function with some differences compared to the old one. Such a change is really
> > > > > not convenient to review.
> > > > 
> > > > Um, yes, versioning is the process of creating an additional
> > > > function that closely resembles an older version of the same function, but with
> > > > different arguments and a newer version number.  That's what it is by definition,
> > > > and yes, it's additional work.  All you're saying here is that it's extra work
> > > > and we shouldn't do it.  I thought I made this clear on the call: it's been done
> > > > in thousands of other libraries.  If you just don't want to do it, then you
> > > > should abandon distributions as a way to reach a larger community; but if you
> > > > want to see the DPDK reach a larger community, then this is something that has
> > > > to happen, hard or not.
> > > 
> > > The goal of this discussion is to establish all the implications of this
> > > decision. We expose the facts. No conclusion.
> > > 
> > You haven't exposed a fact, you've asserted an opinion.  There is no notion of
> > something being convenient or inconvenient to review in any quantitative way.
> > If facts are your goal, you missed the mark here.
> 
> Maybe you use a tool that I don't know.
> My main material for review is the patch. And I think it's simpler to check a
> one-line change than a duplicated code path. But instead of giving my opinion,
> I must expose what it looks like for a simple example:
> 
> -	void cook_fish()
> +	void cook_fish(oil_bottle)
> 	{
> +		use_oil(oil_bottle);
> 		start_fire();
> 		put_fish();
> 		wait();
> 		stop_fire();
> 	}
> 
> vs
> 
> -	void cook_fish()
> +	void __vsym cook_fish_v1()
> 	{
> 		start_fire();
> 		put_fish();
> 		wait();
> 		stop_fire();
> 	}
> +	VERSION_SYMBOL(cook_fish, _v1, 1);
> +	void cook_fish(oil_bottle)
> +	{
> +		use_oil(oil_bottle);
> +		start_fire();
> +		put_fish();
> +		wait();
> +		stop_fire();
> +	}
> +	BIND_DEFAULT_SYMBOL(cook_fish, 2);
> 

You make a fair point in that, generally speaking, less code is easier to review
than more code, but that said, you need to normalize the comparison.  That is to
say, in the example above your first patch adds functionality (i.e. you add a
feature by which you use oil to cook a fish); in the second patch, you not only
add that feature but also allow older code that was already built to continue to
work.  You've not just added complexity, you've added features as well.  It's
like comparing a patch that adds features a and b, and indicating that you
would only accept feature a because you didn't want to review two features.  I
think that's important to remember: we're not adding code for the sake of making
our lives more difficult.  We're doing it to fulfill a need that a large group
of potential end users have.
> 

