[dpdk-dev] Beyond DPDK 2.0

Marc Sune marc.sune at bisdn.de
Mon Apr 27 17:34:46 CEST 2015



On 27/04/15 15:39, Wiles, Keith wrote:
>
> On 4/27/15, 4:52 AM, "Marc Sune" <marc.sune at bisdn.de> wrote:
>
>>
>> On 27/04/15 03:41, Wiles, Keith wrote:
>>> On 4/26/15, 4:56 PM, "Neil Horman" <nhorman at tuxdriver.com> wrote:
>>>
>>>> On Sat, Apr 25, 2015 at 04:08:23PM +0000, Wiles, Keith wrote:
>>>>> On 4/25/15, 8:30 AM, "Marc Sune" <marc.sune at bisdn.de> wrote:
>>>>>
>>>>>> On 24/04/15 19:51, Matthew Hall wrote:
>>>>>>> On Fri, Apr 24, 2015 at 12:39:47PM -0500, Jay Rolette wrote:
>>>>>>>> I can tell you that if DPDK were GPL-based, my company wouldn't be
>>>>>>>> using it. I suspect we wouldn't be the only ones...
>>>>>>>>
>>>>>>>> Jay
>>>>>>> I could second this, from the past employer where I used it. Right now
>>>>>>> I am using it in an open source app, I have a bit of GPL here and there
>>>>>>> but I'm trying to get rid of it or confine it to separate address
>>>>>>> spaces, where it won't impact the core code written around DPDK, as I
>>>>>>> don't want to cause headaches for any downstream users I attract
>>>>>>> someday.
>>>>>>>
>>>>>>> Hard-core GPL would not be possible for most. LGPL could be possible,
>>>>>>> but I don't think it could be worth the relicensing headache for that
>>>>>>> small change.
>>>>>>>
>>>>>>> Instead we should make the patch process as easy as humanly possible so
>>>>>>> people are encouraged to send us the fixes and not cart them around
>>>>>>> their companies constantly.
>>>>> +1 and besides the GPL or LGPL ship has sailed IMHO and we can not go
>>>>> back.
>>>> Actually, IANAL, but I think we can.  The BSD license allows us to fork
>>>> and relicense the code I think, under GPL or any other license.  I'm not
>>>> advocating for that mind you, just suggesting that it's possible should
>>>> it ever become needed.
>>>>
>>>>>> I agree. My feeling is that as the number of patches in the mailing
>>>>>> list grows, keeping track of them gets more and more complicated.
>>>>>> The Patchwork website was a way to try to address this issue. I think
>>>>>> it was an improvement, but to be honest, patchwork lacks a lot of
>>>>>> functionality, such as properly tracking multiple versions of a patch
>>>>>> (superseding them automatically), and it lacks some filtering
>>>>>> capabilities, e.g. per user, per tag/label or library. It also cannot
>>>>>> automatically track whether a patch has been merged, give an overall
>>>>>> status of pending vs merged patches, or set milestones... Is there any
>>>>>> alternative tool or improved version for that?
>>>>>
>>>> Agreed, this has come up before, off list unfortunately.  The volume of
>>>> patches seems to be increasing at such a rate that a single maintainer
>>>> has difficulty keeping up.  I proposed that the workload be split out to
>>>> multiple subtrees, with prefixes being added to patch subjects on the
>>>> list for local filtering to stem the tide.  Specifically I had proposed
>>>> that the PMDs be split into a separate subtree, but that received
>>>> pushback in favor of each library having its own separate subtree, with
>>>> a pilot program being made out of the I40e driver (which you might note
>>>> sends pull requests to the list now).  I'd still like to see all PMDs
>>>> come under a single subtree, but that's likely an argument for later.
>>>>
>>>> That said, do you think that this patch latency is really a contributor
>>>> to low project participation?  It's definitely a problem, but it seems
>>>> to me that this sort of issue would lead to people trying to
>>>> participate, then giving up (i.e. we would see 1-2 emails from an
>>>> individual, then not see them again).  I'd need to look through the
>>>> mailing list for such a pattern, but anecdotally I've not seen that
>>>> happen.  The problem you describe above is definitely a problem, but
>>>> it's one for those individuals who are participating, not for those who
>>>> are simply choosing not to.  And I think we need to address both.
>>>>
>>>>> I agree patchwork has some limitations, but I think the biggest issue
>>>>> is keeping up with the patches. Getting patches introduced into the
>>>>> mainline is very slow. A patch submitted today may not get applied for
>>>>> weeks or months, so when another person submits a patch he runs a very
>>>>> high risk of having to redo it, because a previous patch makes his
>>>>> fail weeks/months later. I would love to see a better tool than
>>>>> patchwork, but the biggest issue is we have a huge backlog of patches.
>>>>> Personally I am not sure how Thomas or anyone is able to keep up with
>>>>> the patches.
>>>>>
>>>> This is absolutely a problem.  I'd like to think, more than a tool like
>>>> patchwork, a subtree organization to allow some modicum of parallel
>>>> review and
>>>> integration would really be a benefit here.
>>> Subtrees could work, but the real problem I think is that the number of
>>> committers must be higher than one. Something like GitHub (and I assume
>>> Linux Foundation) has a method to add committers to a project. In the
>>> case of GitHub they just have to have a free GitHub account and they can
>>> become committers of the project, provided the owner of the project
>>> enables them.
>>>
>>> On GitHub they have personal accounts and organization accounts. I know
>>> only about the personal accounts, but they allow for 5 private repos and
>>> any number of public repos. The organization account has a lot of extra
>>> features that seem better for a DPDK community IMO and should be the one
>>> we use if we decide it is the right direction. We can always give it a
>>> shot for a while and keep dpdk.org and use dev at dpdk.org and its repo
>>> mirrored from GitHub as a transition phase. This way we can fall back to
>>> dpdk.org or move on to something else if we like.
>>>
>>> https://help.github.com/categories/organizations/
>>>
>>> The developers could still send patches via email list, but creating a
>>> repo and forking dpdk is easy, then send a pull request.
>> For the github "community" or free service, organization accounts just
>> allow you to set teams, where each team can be assigned to one or more
>> repositories. The differences are summarized here:
>>
>> https://help.github.com/articles/what-s-the-difference-between-user-and-organization-accounts/
>>
>> And the permission schema, per team, is summarized here:
>>
>> https://help.github.com/articles/permission-levels-for-an-organization-repository/
>>
>> Some limitations: i) you can manage issues only if the team has write
>> permissions (IOW push permissions); ii) there cannot be per-branch ACLs.
> I was assuming the organization GitHub is just to allow more than one
> admin/maintainer along with teams if needed. I would assume the repos are
> still public and others are allowed to fork or pull the repos. I think of
> the org version as just extra controls on top of a personal-repo-like
> design. The org/personal one should appear to the
> non-maintainers/admins/owner as a normal repo on GitHub, correct?

Right

>
> The GitHub organization is built for open-source and you can still have
> private repos, but then you start to have a cost depending on the number
> of private repos you want. If you do not have a lot of private repos then
> you should have no cost (I think). I do not see any reason for private
> repos, but I guess we could have some and we get 5 free and 10 is $25 per
> month.

I don't see the reason either, and I don't know why private repos would 
be useful here.

>>>
>>>>> The other problem I see is how patches are agreed on to be included in
>>>>> the mainline. Today it is just an ACK or a NAK on the mailing list, and
>>>>> I see what I think to be only a few people ACKing or NAKing patches.
>>>>> This process has a lot of problems: a patch being ignored for some
>>>>> reason, someone having negative feedback on a very minor detail, or no
>>>>> way to push a patch forward past a single NAK or comment.
>>>>>
>>>> So, this is an interesting issue in ideal meritocracies.  Currently
>>>> is/should be looking for ACKs/NAKs from the individuals listed in the
>>>> MAINTAINER files, and those people should be the definitive subject
>>>> matter experts on the code they cover.  As such, I would argue that they
>>>> should be entitled to a modicum of stylistic/trivial leeway.  That is to
>>>> say, if they choose to block a patch around a very minor detail, then
>>>> between them changing their position, and the patch author changing the
>>>> code, the latter is likely the easier course of action, especially if
>>>> the author can't make an argument for their position.  That said, if
>>>> such patch blockage becomes so egregious that individuals stop
>>>> contributing, that needs to be known as well.  If you as a patch author:
>>>>
>>>> 1) Have tried to submit patches
>>>> 2) Had them blocked for what you consider trivial reasons
>>>> 3) Plan to not contribute further because of this
>>>> 4) Still rely on the DPDK for your product
>>>>
>>>> Please, say something.  People in charge need to know when they're
>>>> pushing
>>>> contributors away.
>>>>
>>>> FWIW, I've tried to do some correlation between the git history and the
>>>> mailing list.  I need to do more searches, but I have a feeling that
>>>> early on, the majority of people who stopped contributing did so not
>>>> because their patches were expressly blocked, but rather because they
>>>> were simply ignored.  No one working on DPDK bothered to review those
>>>> patches, and so they never got merged.  Hopefully that problem has been
>>>> addressed somewhat now.
>> I agree 100%
>>>>> I would like to see some type of layering process to allow patches to
>>>>> be applied in a timely manner (a few weeks, not months) rather than
>>>>> being completely ignored. Maybe some type of voting is reasonable, but
>>>>> we need to do something to turn around the patches in a clean,
>>>>> reasonable manner.
>>>>>
>>>>> I think we need some type of group meeting every week to look at the
>>>>> patches and determine which ones get applied; this gives quick
>>>>> feedback to the submitter as to the status of the patch.
>>>>>
>>>> I think a group meeting is going to be way too much overhead to manage
>>>> properly.  You'll get different people every week with agendas that may
>>>> not line up with code quality, which is really what the review is meant
>>>> to provide.  I think
>>> I was only suggesting the maintainers attend the meeting. Of course they
>>> have to attend or have someone attend for them, just to get the voting
>>> done. If you do not attend then you do not get to vote or something like
>>> that is reasonable. Not that we should try and define the process here.
>>>
>>>> perhaps a better approach would be to require that code owners from the
>>>> maintainer file provide an ACK/NAK on their patches within 3-4 days, and
>>>> require a corresponding tree maintainer to apply the patch within 7 or
>>>> so.  That would cap our patch latency.  Likewise, if a patch slips in
>>>> creating a regression, the author needs to be alerted and given a time
>>>> window in which to fix the problem before the offending patch is
>>>> reverted during the QE cycle.
>>>>
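The latency cap proposed above can be sketched as a small deadline check. This is only an illustration under stated assumptions: the windows are taken as 4 days for the ACK/NAK and 7 days for applying, both counted from the submission date (the proposal does not say where the 7 days start), and the names `patch_deadlines` and `patch_status` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed windows, both counted from the submission date.
ACK_WINDOW = timedelta(days=4)    # "within 3-4 days"
APPLY_WINDOW = timedelta(days=7)  # "within 7 or so"


def patch_deadlines(submitted):
    """Return the dates by which a patch should be reviewed and applied."""
    return {
        "ack_by": submitted + ACK_WINDOW,
        "apply_by": submitted + APPLY_WINDOW,
    }


def patch_status(submitted, now, acked=False, applied=False):
    """Classify a patch as on time or overdue at the moment `now`."""
    deadlines = patch_deadlines(submitted)
    if not applied and now > deadlines["apply_by"]:
        return "apply overdue"
    if not acked and now > deadlines["ack_by"]:
        return "review overdue"
    return "on time"
```

A script run over the list archive could use such a check to flag patches that have gone past either window.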
>>>>
>>>>>> On the other side, since user questions, community discussions and
>>>>>> development happen in the same mailing list, things get really
>>>>>> complicated, especially for users seeking help. Even though I think
>>>>>> the average skill of DPDK users is generally higher than in other
>>>>>> software projects, if DPDK wants to attract more users, having better
>>>>>> user support is key, IMHO.
>>>>>>
>>>>>> So I would welcome a separation between, at least, dpdk-user and
>>>>>> dpdk-dev.
>>>> I wouldn't argue with this separation, seems like a reasonable
>>>> approach.
>>>>
>>>>> I do not remember seeing too many users on the list, and making a list
>>>>> just for them is OK if everyone is fine with a list that has very few
>>>>> emails.
>>>>>> If the number of patches keeps growing, splitting the "dev" mailing
>>>>>> list into different categories (eal and common, pmds, higher level
>>>>>> abstractions...) could be an option. However, this last point opens a
>>>>>> lot of questions on how to minimize interference between the different
>>>>>> parts and API/ABI compatibility during the development.
>>>>> I believe if we just make sure we use tags in the subject line then we
>>>>> can have our email clients do the splitting of the emails instead of
>>>>> adding more email lists.
>>>>>
>>>> Agreed
>> I think it is a good idea too. Maybe we can standardize some format e.g.
>> [TAG][PATCH vX], or something like that.
>>
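A subject convention like the one suggested above is easy to filter on mechanically. The sketch below assumes the exact form "[TAG][PATCH vX] summary", with the version being optional and defaulting to 1. That format is only the example floated in this thread, not an agreed standard, and `parse_subject` is a hypothetical helper name.

```python
import re

# Assumed convention: "[TAG][PATCH vN] summary", where " vN" may be omitted.
SUBJECT_RE = re.compile(
    r"^\[(?P<tag>[^\]]+)\]"             # subsystem tag, e.g. "pmd" or "eal"
    r"\[PATCH(?: v(?P<version>\d+))?\]"  # patch marker with optional version
    r"\s*(?P<summary>.*)$"               # free-form one-line summary
)


def parse_subject(subject):
    """Split a tagged patch subject into its parts, or return None."""
    match = SUBJECT_RE.match(subject.strip())
    if match is None:
        return None
    return {
        "tag": match.group("tag").lower(),
        "version": int(match.group("version") or 1),
        "summary": match.group("summary"),
    }
```

With something like this, a mail filter or a patch tracker could sort incoming patches per library and notice when a v2 supersedes a v1.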
>>>>>>> Perhaps it means having some ReviewBoard type of tools, a clone in
>>>>>>> Github or Bitbucket where the less hardcore kernel-workflow types
>>>>>>> could send back their small bug fixes a bit more easily, this kind of
>>>>>>> stuff. Google has been getting good uptake since they moved most of
>>>>>>> their open source across to Github, because the contribution workflow
>>>>>>> was more convenient than Google Code was.
>>>>> I like GitHub; it is a much better designed tool than patchwork, plus
>>>>> it could get more eyes as it is very well known to the developer
>>>>> community in general. I feel GitHub has many advantages over the
>>>>> current systems in place, but it does not solve all the patch issues.
>>>>>
>>>> Github is actually a bit irritating for this sort of thing, as it
>>>> presumes a web based interface for discussion.  They have some modicum
>>>> of email forwarding enabled, but it has never quite worked right, or
>>>> integrated properly.
>> An alternative to githubs and bitbuckets is a self-hosted forge, like
>> gitlab:
>>
>> https://about.gitlab.com/
>>
>> To be honest, I mostly work on open-source repositories, and in our
>> organization we use only gitlab for private repositories, so I haven't
>> played that much with it. But it seems to do its job and has almost all
>> of the features of the "community" github, if not more. I don't know if
>> you can even integrate it with github's accounts somehow, to avoid
>> having to register.
>>
>> However, one of the important points of using github/bitbucket is
>> visibility and easing the contribution process. By using a self-hosted
>> solution, even if it is similar to github and well advertised on DPDK's
>> website, you kind of lose part of that advantage.
> I would suggest we use GitHub rather than picking yet another, not as well
> known, Git repo system, if we decide to change.

I agree. I was just pointing out this as an option instead of 
github/bitbucket. Basically to (still) self-host the repository and tools.

>>> Email forwarding has seemed to work for me and in one case it took a bit
>>> to have GitHub stop sending me emails on a repo I did not want anymore
>>> :-)
>>>>> The only way we can get patch issues resolved is to put a bit more
>>>>> process in place.
>>>>>> Although I agree, we have to be careful on how github or bitbucket is
>>>>>> used. Having issues or even (e.g. github) pull requests *in addition*
>>>>>> to the normal contribution workflow can be a nightmare to deal with,
>>>>>> in terms of synchronization and preventing double work. So I guess
>>>>>> setting up an official github or bitbucket mirror would be fine, via
>>>>>> some simple cronjob, but I guess it would end up not using PRs or
>>>>>> issues in github, like the Linux kernel does.
>>>> 100% agree, we can't be split about this.  Allowing contributions from n
>>>> channels just means most developers will only see/review 1/nth of the
>>>> patches of interest to them.
>>> If we set up a GitHub or some other site, we would need to make GitHub
>>> the primary site to remove this type of problem IMO.
>> You mean changing the workflow from email based to issues and pull-req
>> or github pull req? Do you really think this is possible?
> Yes, I think pull-req is the standard GitHub method as everyone needs a
> repo anyway. If we can figure out how to integrate the email patches that
> would be great.

I think it is quite complicated. It needs to be completely seamless or
it won't work, and we will have part of the discussions in the mailing
list, and part in the pull-req issues.

I would think of it the other way around => pull requests are "echoed" to
the mailing list to be discussed there, and always CCed (how?) to the
issue to capture the discussion there too. Not trivial at all.
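
To make the "echo" direction concrete, here is a minimal sketch of turning a pull request into a list message with the issue CCed. Everything about the input is an assumption: the `pr` dictionary and its keys, the per-issue reply address, and the subject prefix are hypothetical, and nothing here talks to GitHub or sends mail. Only Python's standard `email` package is used.

```python
from email.message import EmailMessage

LIST_ADDRESS = "dev@dpdk.org"  # the development list discussed in this thread


def pull_request_to_email(pr, issue_address):
    """Build a list message announcing a pull request.

    `pr` is a hypothetical dict with the keys: number, title,
    author_email, description and url. `issue_address` is an assumed
    per-issue mailbox that would feed replies back into the issue.
    """
    msg = EmailMessage()
    msg["From"] = pr["author_email"]
    msg["To"] = LIST_ADDRESS
    msg["Cc"] = issue_address
    msg["Subject"] = "[PULL REQUEST #{}] {}".format(pr["number"], pr["title"])
    msg.set_content(
        pr["description"] + "\n\nPull request: " + pr["url"] + "\n"
    )
    return msg
```

The hard part, as said above, is not building the message but keeping replies on the list and comments on the issue in sync in both directions.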

marc

>>>>> From what I can tell GitHub seems to be a better solution for a free,
>>>>> open environment. I have never used Bitbucket, and GitHub seems more
>>>>> popular from one article I read.
>>>>>
>>>>>
>>>>>
>>>>> https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=bitbucket%20vs%20github
>>>>>
>>>>>
>>>>>> Btw, is this github organization already registered by Intel or some
>>>>>> other company of the community?
>>>>>>
>>>>>> https://github.com/dpdk
>>>>>>
>>> I was hoping someone would own up to the GitHub dpdk site.
>> Just wanted to know if this was the case. But, even if that would not be
>> the case, I *guess* that, as it happens with other services like
>> twitter, facebook..., Intel could claim the user, since it has the
>> registered trademark.
>>
>> marc
>>
>>>>>> Marc
>>>>> If we can use the above that would be great, but a name like
>>>>> 'dpdk-community' or something could work too.
>>>>>
>>>>> We can host the web site here and have many sub-projects like
>>>>> Pktgen-DPDK :-) under the same page. Not to say anything bad about our
>>>>> current web pages, but I find them difficult to use sometimes, e.g. to
>>>>> find things like the patchwork link. Maintaining a web site is a full
>>>>> time job and GitHub does maintain the site, plus we can collaborate on
>>>>> the hosted web page on the GitHub site more easily.
>>>>>
>>>>> Moving to the Linux Foundation is an option as well, as it is very well
>>>>> known and has some nice ways to get your project promoted. It does have
>>>>> a few drawbacks, in process handling and cost to name a few. The
>>>>> process model is already defined, which is good and bad; it just
>>>>> depends on your needs IMO.
>>>>>
>>>>> Regards,
>>>>> ++Keith
>>>>>
>>>>>>> Matthew.


