[dpdk-dev] Why nothing since 1.8.0?

Neil Horman nhorman at tuxdriver.com
Fri Jan 16 17:51:19 CET 2015


On Thu, Jan 15, 2015 at 09:55:00PM +0000, O'driscoll, Tim wrote:
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > Sent: Thursday, January 15, 2015 6:51 PM
> > To: Thomas Monjalon
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] Why nothing since 1.8.0?
> > 
> > On Thu, Jan 15, 2015 at 06:25:33PM +0100, Thomas Monjalon wrote:
> > > 2015-01-15 08:06, Neil Horman:
> > > > On Thu, Jan 15, 2015 at 10:51:38AM +0100, Thomas Monjalon wrote:
> > > > > 2015-01-15 04:27, Ouyang, Changchun:
> > > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhang, Helin
> > > > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil
> > Horman
> > > > > > > > On Wed, Jan 14, 2015 at 12:23:52PM -0800, Stephen Hemminger
> > wrote:
> > > > > > > > > Ok, so 1.8.0 came out almost a month ago and none of the
> > patches
> > > > > > > > > that were deferred waiting for the release got merged since
> > then.
> > > > > > > > > Last commit in git is the 1.8.0 release.
> > > > > > > > >
> > > > > > > > > Where is the post-merge window bundle, where are the later
> > commits?
> > > > > > > > > Lots of patches are sitting rotting in patchwork...
> > > > > > > >
> > > > > > > > +1, I've had the same questions.
> > > > > > > > Neil
> > > > > > >
> > > > > > > +1, Some patch set might be ready for being merged.
> > > > > >
> > > > > > +1,  the earlier some patches are merged into mainline, and the easier
> > those
> > > > > > sequent patch sets can resolve their conflicts.
> > > > >
> > > > > +1, there are some patches which are properly reviewed
> > > > >
> > > > > Reminder: sub-tree to manage specific part of DPDK can be open on
> > request
> > > >
> > > > Ok, I think what you're saying here is you're too busy to handle all the
> > patches
> > > > comming in at the moment.  As such I'd like to propose a sub-tree
> > encompassing
> > > > all the pmds in DPDK.  I would envision that including all the acutal pmd's
> > in
> > > > the tree, as well as the infrastructure that is used to interface them to the
> > > > core (i.e. the ethdev/rte_ether library).  I'll gladly maintain the patch pool
> > > > and send you pull requests.
> > >
> > > The list of PMDs is increasing:
> > > 	librte_pmd_af_packet
> > > 	librte_pmd_bond
> > > 	librte_pmd_e1000
> > > 	librte_pmd_enic
> > > 	librte_pmd_i40e
> > > 	librte_pmd_ixgbe
> > > 	librte_pmd_pcap
> > > 	librte_pmd_ring
> > > 	librte_pmd_virtio
> > > 	librte_pmd_vmxnet3
> > > 	librte_pmd_xenvirt
> > > There is already some sub-trees for bnx2x, fm10k and i40e:
> > > 	http://dpdk.org/browse/
> > >
> > Yes, and I've mentioned before that that is an absolutely silly way to break
> > out
> > subtrees.  You have to find a balance of workload distribution and developer
> > convienience.
> > 
> > I also note that these are problematic because you're not merging anything
> > from them. Is it your intention to keep bnx2 and fm10k separate in
> > perpituity?
> > If so, thats a real problem, because then we effectively just have several out
> > of tree drivers, and thats just unacceptible.
> > 
> > > > If you could set me up with a login to dpdk.org, I'd appreciate it.
> > >
> > > It is preferred to have 1 sub-tree per module.
> > > What do you think of managing contributions for af_packet and/or virtio?
> > > It would make sense as virtio is a RedHat technology.
> > > Maybe it could include vhost lib and example.
> > >
> > No, for reasons I've mentioned before.  If you take each pmd/library and
> > create
> > a subtree for it, you've created the most fine grained control of subtrees you
> > could ask for, but you've created a nighmare of a burden on developers who
> > want
> > to update any code, especially if they have patches that hit multiple trees.
> > 
> > Look at some of the stats in the dpdk tree:
> > 
> > Library		Commits between 1.7.0 and 1.8.0
> > librte_acl		5
> > librte_cfgfile		0
> > librte_cmdline		4
> > librte_compat		0
> > librte_distributor 	5
> > librte_eal 		125
> > librte_ether 		31
> > librte_hash 		1
> > librte_ip_frag 		5
> > librte_ivshmem 		0
> > librte_kni 		2
> > librte_kvargs 		0
> > librte_lpm 		1
> > librte_malloc 		1
> > librte_mbuf 		39
> > librte_mempool 		4
> > librte_meter 		0
> > librte_net 		4
> > librte_pipeline 	0
> > librte_pmd_af_packet 	4
> > librte_pmd_bond 	20
> > librte_pmd_e1000 	21
> > librte_pmd_enic 	12
> > librte_pmd_i40e 	90
> > librte_pmd_ixgbe 	83
> > librte_pmd_pcap 	4
> > librte_pmd_ring 	0
> > librte_pmd_virtio 	21
> > librte_pmd_vmxnet3 	21
> > librte_pmd_xenvirt 	6
> > librte_port 		6
> > librte_power 		3
> > librte_ring 		2
> > librte_sched 		1
> > librte_table 		7
> > librte_timer 		0
> > librte_vhost 		30
> > 
> > If you look at all of the pmds in the dpdk tree, we're talking about ~300
> > patches per release.  If you look at the net-next tree for the linux kernel,
> > Dave Miller merged 569 patches on his own (based on the following
> > command:
> > git log --pretty=format:%H v3.17..v3.18 -- drivers/net/ethernet/ net/core/ |
> > wc
> > -l)
> > 
> > And that doesn't account for the ~500 patches that come in via pull request
> > from
> > the wireless subtree.  Nor does it account for the merge window for net-
> > next
> > being 2 months instead of dpdk's 6 months.  Theres no need in any way for
> > 12
> > maintainers to be twiddling their thumbs waiting on ~20 patches each, and
> > for
> > that split, you've forced developers to potentially develop patches against 12
> > trees (12 being the current number of PMD's that are in the dpdk).
> > 
> > The right answer here is balance.  Let me split out the pmd's and ethernet
> > infrastructure libraries to a subtree.  I'll pull in patches posted regarding
> > pmd's and librte_ether/ip_frag etc, and send you a pull requests after each
> > release so you get all the latest bits, and then pulls for stabilization on each
> > -rc. I can manage 300 patches without issue, and that takes a load off your
> > shoulders.  I'll get fm10k integrated, as well as bnx2.  That gives us a single
> > alternate tree for developers to go to for pmd and pmd infrastructure
> > updates.
> > Its a win-win.
> > 
> > Regards
> > Neil
> 
> I agree with Thomas on this. The approach we've been taking for PMDs for newer Intel NICs is to have a separate sub-repository with a maintainer who's an expert in the area. This offloads some work from Thomas, ensures that the maintainer is very familiar with the PMD code and the corresponding hardware, 
As I mentioned to you in a private note, you're convoluting two roles into a
single one here to the detriment of both:

1) A tree maintainer is a machine, who is responsible for the mechanicm of SCM
and tree maintenence.  They are responsible for merging patches, making sure
that merged patches make it to their upstream tree, ensuring conflicts get
resolved, and above all, making sure that SME's review patches before they get
merged

2) An SME (subject matter expert), is just that, someone who is intimately
familiar with the hardware and software of a certain bit of a project.  Their
only responsibility is to ensure that proposed changes are legitimate and safe.
They review patches within their purview and provide acks to the tree
maintainer.

When you merge these roles, you take the person most responsible for the
stability of their code, and distract them with 1000 patch management
operations, and from the work of developing new code for their area of
expertise.

The separation of these roles has evolved in several communities because of
exactly the above.  Linus knows alot about the kernel, but he's never opened the
I40e datasheet, and he doesn't have to because he trusts Dave M to ensure that
code in his pull requests is legitimate.  Dave in turn relies on someone at
Intel to ack every I40e patch on the list before he merges it.  He does exactly
the same thing with every single other sub-component as listed in the
MAINTAINERS file.  Thats how he is able to manage twice as many patches as the
dpdk can in half the amount of time.  If its efficiency you're after, thats the
route to go.  Please don't ignore all that evolutionary refinement.


>and doesn't involve too much additional work for the developers involved (as you said, there isn't a huge volume of commits for any individual PMD).
Patch volume isn't where the additional effort comes in.  Its in the
determination of what tree to develop against.

Lets take an example.  I'm doing work on the I40e card, so I have to look to see
if theres a separate tree for it.  I figure out there is, so I have to clone
that tree, and develop a patch against it.  So far so good.

A few days later I notice a bug in the pcap pmd, so I want to write a patch to
fix it.  If we carry your model out to the point where each pmd has its own
subtree, I have to go find that new tree, and clone it as well (or add a new
remote to my existing tree), pull those changes, branch from the proper master,
and continue my work).

If later I find a bug in the virtio pmd, I have to repeat the above process
again, cloning a new tree, or adding a new remote for that drivers tree.

It doesn't sound like alot when you're just sitting in a room talking about it,
but for the developers actually working on the code, its a significant
inconvienience, since it means that any pmd you want to touch requires you to go
through this lookup process, ensuring your branching from the right master
branch in the proper tree.

Conversely, if you put all pmds in a single subtree, it doesn't matter what
pmd a person is working on, they only have to clone the pmd subtree, and they're
good to go.


> We have this in place for i40e now, and will be applying this to fm10k, which hasn't been upstreamed yet but will be in time for the 2.0 release. Based on our experiences with those, we may well look at extending the model to include PMDs for other Intel NICs, and possibly other libraries, as well.
> 
You really haven't.  You have an i40e tree, but you have dozens of I40e patches
on the list, and all of them thus far have been integrated into the main tree.
Ditto with the bnx2x tree and others that have been separated.  Please remember
subject matter expert != repo maintainer.  The roles can be, and should be,
separeated.

> As you said, there's a balance to be struck, and too many subtrees may become unmanageable. With respect to your concern about developers having to potentially develop patches against multiple subtrees, this has never been raised as a concern by any of our development team. Is there any historical data on the number of changes that would fall into this category so we can see if it's a real problem or not?
Look at the linux kernel if you want historical data on where the balance is.
Major subsystems has become the natural breaking point for subtree maintenence.
Every net driver goes through Millers net or net-next tree.  All scsi updates go
through Bottomleys scsi tree.  FCoE changes used go through what used to be Rob
Loves FCoE tree.  It just makes sense.  I'm sure there isn't recorded data to
show anything more granular than that because no one ever seriously considered
breaking out all 301 network drivers into their own subtree, because its
obviously unmanageable at that scale. Even taken all as a single unit (as
the kernel does for network drivers), a single maintainer can still handle the
patch volume with the right subject matter experts doing timely reviews.


The point of this very long message is that maintainers != repos.  Just because
you're a subject matter expert in a given bit of hardware/pmd, in no way means
you need to be responsible for patch merging/maintenence.  Doing so just makes
everything harder.

Neil



More information about the dev mailing list