[dpdk-dev] Beyond DPDK 2.0

Wiles, Keith keith.wiles at intel.com
Mon Apr 27 15:39:59 CEST 2015



On 4/27/15, 4:52 AM, "Marc Sune" <marc.sune at bisdn.de> wrote:

>
>
>On 27/04/15 03:41, Wiles, Keith wrote:
>>
>> On 4/26/15, 4:56 PM, "Neil Horman" <nhorman at tuxdriver.com> wrote:
>>
>>> On Sat, Apr 25, 2015 at 04:08:23PM +0000, Wiles, Keith wrote:
>>>>
>>>> On 4/25/15, 8:30 AM, "Marc Sune" <marc.sune at bisdn.de> wrote:
>>>>
>>>>>
>>>>> On 24/04/15 19:51, Matthew Hall wrote:
>>>>>> On Fri, Apr 24, 2015 at 12:39:47PM -0500, Jay Rolette wrote:
>>>>>>> I can tell you that if DPDK were GPL-based, my company wouldn't be
>>>>>>> using
>>>>>>> it. I suspect we wouldn't be the only ones...
>>>>>>>
>>>>>>> Jay
>>>>>> I could second this, from the past employer where I used it. Right
>>>>>> now I am using it in an open source app, I have a bit of GPL here
>>>>>> and there but I'm trying to get rid of it or confine it to separate
>>>>>> address spaces, where it won't impact the core code written around
>>>>>> DPDK, as I don't want to cause headaches for any downstream users I
>>>>>> attract someday.
>>>>>>
>>>>>> Hard-core GPL would not be possible for most. LGPL could be
>>>>>> possible, but I don't think it could be worth the relicensing
>>>>>> headache for that small change.
>>>>>>
>>>>>> Instead we should make the patch process as easy as humanly possible
>>>>>> so people are encouraged to send us the fixes and not cart them
>>>>>> around their companies constantly.
>>>> +1 and besides the GPL or LGPL ship has sailed IMHO and we can not go
>>>> back.
>>> Actually, IANAL, but I think we can.  The BSD license allows us to fork
>>> and relicense the code I think, under GPL or any other license.  I'm not
>>> advocating for that mind you, just suggesting that it's possible should
>>> it ever become needed.
>>>
>>>>> I agree. My feeling is that as the number of patches in the mailing
>>>>> list grows, keeping track of them gets more and more complicated.
>>>>> The Patchwork website was a way to try to address this issue. I think
>>>>> it was an improvement, but to be honest, patchwork lacks a lot of
>>>>> functionality, such as properly tracking multiple versions of a patch
>>>>> (superseding them automatically), and it lacks some filtering
>>>>> capabilities, e.g. per user, per tag/label or library, automatically
>>>>> tracking whether a patch has been merged, giving an overall status of
>>>>> pending vs merged patches, setting milestones... Is there any
>>>>> alternative tool or improved version for that?
>>>>
>>> Agreed, this has come up before, off list unfortunately.  The volume of
>>> patches seems to be increasing at such a rate that a single maintainer
>>> has difficulty keeping up.  I proposed that the workload be split out to
>>> multiple subtrees, with prefixes being added to patch subjects on the
>>> list for local filtering to stem the tide.  Specifically I had proposed
>>> that the PMDs be split into a separate subtree, but that received
>>> pushback in favor of each library having its own separate subtree, with
>>> a pilot program being made out of the i40e driver (which you might note
>>> sends pull requests to the list now).  I'd still like to see all PMDs
>>> come under a single subtree, but that's likely an argument for later.
>>>
>>> That said, do you think that this patch latency is really a contributor
>>> to low project participation?  It's definitely a problem, but it seems
>>> to me that this sort of issue would lead to people trying to
>>> participate, then giving up (i.e. we would see 1-2 emails from an
>>> individual, then not see them again).  I'd need to look through the
>>> mailing list for such a pattern, but anecdotally I've not seen that
>>> happen.  The problem you describe above is definitely a problem, but
>>> it's one for those individuals who are participating, not for those who
>>> are simply choosing not to.  And I think we need to address both.
>>>
>>>> I agree patchwork has some limitations, but I think the biggest issue
>>>> is keeping up with the patches. Getting patches introduced into the
>>>> mainline is very slow. A patch submitted today may not get applied for
>>>> weeks or months, and anyone who submits a patch in the meantime runs a
>>>> very high risk of having to redo it, because a previous patch makes it
>>>> fail to apply weeks/months later. I would love to see a better tool
>>>> than patchwork, but the biggest issue is we have a huge backlog of
>>>> patches. Personally I am not sure how Thomas or anyone is able to keep
>>>> up with the patches.
>>>>
>>> This is absolutely a problem.  I'd like to think, more than a tool like
>>> patchwork, a subtree organization to allow some modicum of parallel
>>> review and
>>> integration would really be a benefit here.
>> Subtrees could work, but the real problem I think is that the number of
>> committers must be higher than one. Something like GitHub (and I assume
>> the Linux Foundation) has a method to add committers to a project. In
>> the case of GitHub they just have to have a free GitHub account, and they
>> can become committers once the owner of the project enables them.
>>
>> GitHub has personal accounts and organization accounts. I know only
>> about the personal accounts, but they allow for 5 private repos and
>> any number of public repos. The organization account has a lot of extra
>> features that seem better for a DPDK community IMO and should be the one
>> we use if we decide it is the right direction. We can always give it a
>> shot for a while, keep dpdk.org, and use dev at dpdk.org with its repo
>> mirrored from GitHub as a transition phase. This way we can fall back to
>> dpdk.org or move on to something else if we like.
>>
>> https://help.github.com/categories/organizations/
>>
>> The developers could still send patches via the email list, but creating
>> a repo by forking dpdk and then sending a pull request is easy.
>
>For the github "community" or free service, organization accounts just
>allow you to set teams, where each team can be assigned to one or more
>repositories. The differences are summarized here:
>
>https://help.github.com/articles/what-s-the-difference-between-user-and-organization-accounts/
>
>And the permission schema, per team, is summarized here:
>
>https://help.github.com/articles/permission-levels-for-an-organization-repository/
>
>Some limitations: i) you can manage issues only if the team has write
>permissions (IOW push permissions); ii) there cannot be per-branch ACLs.

I was assuming the GitHub organization account just allows more than one
admin/maintainer, along with teams if needed. I would assume the repos are
still public and others are allowed to fork or pull them. I think of the
org version as just extra controls on top of a personal-repo-like design.
Either kind of repo should appear to anyone who is not a maintainer, admin
or owner as a normal repo on GitHub, correct?

The GitHub organization account is built for open source. You can still
have private repos, but then you start to pay depending on the number of
private repos you want. If you do not have a lot of private repos then
you should have no cost (I think). I do not see any reason for private
repos, but I guess we could have some: we get 5 free and 10 is $25 per
month.
>
>>
>>
>>>> The other problem I see is how patches are agreed on to be included in
>>>> the mainline. Today it is just an ACK or a NAK on the mailing list, and
>>>> I see what I think to be only a few people ACKing or NAKing patches.
>>>> This process has a lot of problems, from a patch being ignored for some
>>>> reason, to someone having negative feedback on a very minor detail, to
>>>> there being no way to push a patch forward past a single NAK or comment.
>>>>
>>> So, this is an interesting issue in ideal meritocracies.  Currently we
>>> are/should be looking for ACKs/NAKs from the individuals listed in the
>>> MAINTAINER files, and those people should be the definitive subject
>>> matter experts on the code they cover.  As such, I would argue that they
>>> should be entitled to a modicum of stylistic/trivial leeway.  That is to
>>> say, if they choose to block a patch around a very minor detail, then
>>> between them changing their position, and the patch author changing the
>>> code, the latter is likely the easier course of action, especially if
>>> the author can't make an argument for their position.  That said, if
>>> such patch blockage becomes so egregious that individuals stop
>>> contributing, that needs to be known as well.  If you as a patch author:
>>>
>>> 1) Have tried to submit patches
>>> 2) Had them blocked for what you consider trivial reasons
>>> 3) Plan to not contribute further because of this
>>> 4) Still rely on the DPDK for your product
>>>
>>> Please, say something.  People in charge need to know when they're
>>> pushing contributors away.
>>>
>>> FWIW, I've tried to do some correlation between the git history and the
>>> mailing list.  I need to do more searches, but I have a feeling that
>>> early on, the majority of people who stopped contributing did so not
>>> because their patches were expressly blocked, but rather because they
>>> were simply ignored.  No one working on DPDK bothered to review those
>>> patches, and so they never got merged.  Hopefully that problem has been
>>> addressed somewhat now.
>I agree 100%
>>>
>>>> I would like to see some type of layering process to allow patches to
>>>> be applied in a timely manner: a few weeks, not months, and never
>>>> completely ignored. Maybe some type of voting is reasonable, but we
>>>> need to do something to turn around the patches in a clean, reasonable
>>>> manner.
>>>>
>>>> I think we need some type of group meeting every week to look at the
>>>> patches and determine which ones get applied. This gives quick feedback
>>>> to the submitter as to the status of the patch.
>>>>
>>> I think a group meeting is going to be way too much overhead to manage
>>> properly.  You'll get different people every week with agendas that may
>>> not line up with code quality, which is really what the review is meant
>>> to provide.  I think
>> I was only suggesting the maintainers attend the meeting. Of course they
>> have to attend or have someone attend for them, just to get the voting
>> done. If you do not attend then you do not get to vote or something like
>> that is reasonable. Not that we should try and define the process here.
>>
>>> perhaps a better approach would be to require that code owners from the
>>> maintainer file provide an ACK/NAK on their patches within 3-4 days, and
>>> require a corresponding tree maintainer to apply the patch within 7 or
>>> so.  That would cap our patch latency.  Likewise, if a patch slips in
>>> creating a regression, the author needs to be alerted and given a time
>>> window in which to fix the problem before the offending patch is
>>> reverted during the QE cycle.
>>>
>>>
>>>>> On the other side, since user questions, community discussions and
>>>>> development happen on the same mailing list, things get really
>>>>> complicated, especially for users seeking help. Even though I think
>>>>> the average skill of DPDK users is generally higher than in other
>>>>> software projects, if DPDK wants to attract more users, having better
>>>>> user support is key, IMHO.
>>>>>
>>>>> So I would welcome a separation between, at least, dpdk-user and
>>>>> dpdk-dev.
>>> I wouldn't argue with this separation, seems like a reasonable approach.
>>>
>>>> I do not remember seeing too many users on the list, and making a list
>>>> just for them is OK if everyone is fine with a list that has very few
>>>> emails.
>>>>> If the number of patches keeps growing, splitting the "dev" mailing
>>>>> list into different categories (eal and common, pmds, higher level
>>>>> abstractions...) could be an option. However, this last point opens a
>>>>> lot of questions on how to minimize interference between the
>>>>> different parts and API/ABI compatibility during the development.
>>>> I believe if we just make sure we use tags in the subject line then we
>>>> can have our email clients do the splitting of the emails instead of
>>>> adding more email lists.
>>>>
>>> Agreed
>
>I think it is a good idea too. Maybe we can standardize some format e.g.
>[TAG][PATCH vX], or something like that.
>
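A minimal sketch of how a client-side filter could split mail on such
tags, assuming the "[TAG][PATCH vX]" subject format suggested above. The
tag names and the helper name parse_subject are made up for illustration;
this is not an agreed DPDK convention, and the list's own subject prefix
is ignored here.

```python
import re

# Hypothetical convention: "[TAG][PATCH vN] summary"; the " vN" part is optional.
SUBJECT_RE = re.compile(
    r"^\[(?P<tag>[^\]]+)\]\[PATCH(?: v(?P<version>\d+))?\]\s*(?P<summary>.*)$"
)


def parse_subject(subject):
    """Return (tag, version, summary), or None if the subject does not match."""
    m = SUBJECT_RE.match(subject.strip())
    if m is None:
        return None
    # A missing version is treated as v1 (an assumption of this sketch).
    version = int(m.group("version")) if m.group("version") else 1
    return m.group("tag").lower(), version, m.group("summary")


def bucket_by_tag(subjects):
    """Group subjects by tag; anything that does not match goes under 'untagged'."""
    buckets = {}
    for s in subjects:
        parsed = parse_subject(s)
        key = parsed[0] if parsed else "untagged"
        buckets.setdefault(key, []).append(s)
    return buckets
```

A mail client rule would do the same thing with its own filter syntax;
the point is only that one fixed subject format is enough to sort the
list traffic locally.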
>>>
>>>>>> Perhaps it means having some ReviewBoard type of tools, a clone in
>>>>>> Github or Bitbucket where the less hardcore kernel-workflow types
>>>>>> could send back their small bug fixes a bit more easily, this kind of
>>>>>> stuff. Google has been getting good uptake since they moved most of
>>>>>> their open source across to Github, because the contribution workflow
>>>>>> was more convenient than Google Code was.
>>>> I like GitHub; it is a much better designed tool than patchwork, plus
>>>> it could get more eyes as it is very well known to the developer
>>>> community in general. I feel GitHub has many advantages over the
>>>> current systems in place, but it does not solve all the patch issues.
>>>>
>>> Github is actually a bit irritating for this sort of thing, as it
>>> presumes a web
>>> based interface for discussion.  They have some modicum of email
>>> forwarding
>>> enabled, but it has never quite worked right, or integrated properly.
>
>An alternative to githubs and bitbuckets is a self-hosted forge, like
>gitlab:
>
>https://about.gitlab.com/
>
>To be honest, I mostly work on open-source repositories, and in our
>organization we use gitlab only for private repositories, so I haven't
>played that much with it. But it seems to do its job and has almost all
>of the features of the "community" github, if not more. I don't know if
>you can even integrate it with github accounts somehow, to avoid
>having to register.
>
>However, one of the important points of using github/bitbucket is
>visibility and easing the contribution process. By using a self-hosted
>solution, even if it is similar to github and well advertised on DPDK's
>website, you kind of lose part of that advantage.

If we decide to change, I would suggest we use GitHub rather than picking
yet another, less well-known Git repo system.
>
>> Email forwarding has seemed to work for me and in one case it took a bit
>> to have GitHub stop sending me emails on a repo I did not want anymore
>>:-)
>>>> The only way we can get patch issues resolved is to put a bit more
>>>> process in place.
>>>>> Although I agree, we have to be careful on how github or bitbucket is
>>>>> used. Having issues or even (e.g. github) pull requests *in addition*
>>>>> to the normal contribution workflow can be a nightmare to deal with,
>>>>> in terms of synchronization and preventing double work. So I guess
>>>>> setting up an official github or bitbucket mirror would be fine, via
>>>>> some simple cronjob, but I guess it would end up not using PRs or
>>>>> issues in github, like the Linux kernel does.
>>> 100% agree, we can't be split about this.  Allowing contributions from
>>> n channels just means most developers will only see/review 1/nth of the
>>> patches of interest to them.
>> If we set up a GitHub or some other site, we would need to make GitHub
>> the primary site to remove this type of problem IMO.
>
>You mean changing the workflow from email based to issues and pull-req
>or github pull req? Do you really think this is possible?

Yes, I think pull-req is the standard GitHub method as everyone needs a
repo anyway. If we can figure out how to integrate the email patches that
would be great.
>
>>>> From what I can tell GitHub seems to be a better solution for a free
>>>> open environment. Bitbucket I have never used and GitHub seems more
>>>> popular from one article I read.
>>>>
>>>> https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=bitbucket%20vs%20github
>>>>
>>>>
>>>>> Btw, is this github organization already registered by Intel or some
>>>>> other company of the community?
>>>>>
>>>>> https://github.com/dpdk
>>>>>
>> I was hoping someone would own up to the GitHub dpdk site.
>
>Just wanted to know if this was the case. But, even if that would not be
>the case, I *guess* that, as it happens with other services like
>twitter, facebook..., Intel could claim the user, since it has the
>registered trademark.
>
>marc
>
>>
>>>>> Marc
>>>> If we can use the above that would be great, but a name like
>>>> 'dpdk-community' or something could work too.
>>>>
>>>> We can host the web site here and have many sub-projects like
>>>> Pktgen-DPDK :-) under the same page. Not to say anything bad about our
>>>> current web pages, but I find them difficult to use sometimes, and it
>>>> is hard to find things like the patchwork link. Maintaining a web site
>>>> is a full time job and GitHub does maintain the site, plus we can
>>>> collaborate on the hosted web pages on the GitHub site more easily.
>>>>
>>>> Moving to the Linux Foundation is an option as well, as it is very well
>>>> known and has some nice ways to get your project promoted. It does have
>>>> a few drawbacks, in process handling and cost to name a few. The
>>>> process model is already defined, which is good and bad; it just
>>>> depends on your needs IMO.
>>>>
>>>> Regards,
>>>> ++Keith
>>>>
>>>>>> Matthew.
>>>>
>


