[v9,1/1] net/af_xdp: introduce AF XDP PMD driver

Message ID 20190402154653.711-2-xiaolong.ye@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series Introduce AF_XDP PMD

Checks

Context                           Check    Description
ci/checkpatch                     success  coding style OK
ci/Intel-compilation              success  Compilation OK
ci/mellanox-Performance-Testing   success  Performance Testing PASS
ci/intel-Performance-Testing      success  Performance Testing PASS

Commit Message

Xiaolong Ye April 2, 2019, 3:46 p.m. UTC
Add a new PMD for AF_XDP, which is a proposed faster version of the
AF_PACKET interface in Linux. For more information about AF_XDP, please refer
to [1] and [2].

This is the vanilla version of the PMD, which just uses a raw buffer registered
as the umem.

[1] https://fosdem.org/2018/schedule/event/af_xdp/
[2] https://lwn.net/Articles/745934/

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
---
 MAINTAINERS                                   |   7 +
 config/common_base                            |   5 +
 doc/guides/nics/af_xdp.rst                    |  48 +
 doc/guides/nics/features/af_xdp.ini           |  11 +
 doc/guides/nics/index.rst                     |   1 +
 doc/guides/rel_notes/release_19_05.rst        |   7 +
 drivers/net/Makefile                          |   1 +
 drivers/net/af_xdp/Makefile                   |  32 +
 drivers/net/af_xdp/meson.build                |  21 +
 drivers/net/af_xdp/rte_eth_af_xdp.c           | 956 ++++++++++++++++++
 drivers/net/af_xdp/rte_pmd_af_xdp_version.map |   3 +
 drivers/net/meson.build                       |   1 +
 mk/rte.app.mk                                 |   1 +
 13 files changed, 1094 insertions(+)
 create mode 100644 doc/guides/nics/af_xdp.rst
 create mode 100644 doc/guides/nics/features/af_xdp.ini
 create mode 100644 drivers/net/af_xdp/Makefile
 create mode 100644 drivers/net/af_xdp/meson.build
 create mode 100644 drivers/net/af_xdp/rte_eth_af_xdp.c
 create mode 100644 drivers/net/af_xdp/rte_pmd_af_xdp_version.map
  

Comments

Stephen Hemminger April 2, 2019, 6:56 p.m. UTC | #1
On Tue,  2 Apr 2019 23:46:53 +0800
Xiaolong Ye <xiaolong.ye@intel.com> wrote:

> +		/* pull from complete qeueu to leave more space */

Overall looks good, one last spelling error
  
Luca Boccassi April 2, 2019, 7:19 p.m. UTC | #2
On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
> diff --git a/drivers/net/af_xdp/Makefile
> b/drivers/net/af_xdp/Makefile
> new file mode 100644
> index 000000000..8343e3016
> --- /dev/null
> +++ b/drivers/net/af_xdp/Makefile
> @@ -0,0 +1,32 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2019 Intel Corporation
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_pmd_af_xdp.a
> +
> +EXPORT_MAP := rte_pmd_af_xdp_version.map
> +
> +LIBABIVER := 1
> +
> +CFLAGS += -O3
> +
> +# require kernel version >= v5.1-rc1
> +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
> +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf

Sorry for not noticing this before, but doesn't this require the full
kernel tree rather than just the typical headers package? Requiring the
full kernel tree to be available at build time will make this
unbuildable on distros that still use makefiles, like RHEL and SUSE. At
least on Debian and Ubuntu, the kernel headers packages distributed do
not include the full kernel tree, only the headers, so there's no
tools/lib or tools/include.

Like other dependencies, this should assume they are installed as
regular libraries, e.g.:

CFLAGS += $(shell command -v pkg-config > /dev/null 2>&1 && pkg-config --cflags libbpf || echo "-I/usr/include/bpf")

> +CFLAGS += $(WERROR_FLAGS)
> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
> +LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
> +LDLIBS += -lrte_bus_vdev
> +LDLIBS += -lbpf

LDLIBS += $(shell command -v pkg-config > /dev/null 2>&1 && pkg-config --libs libbpf || echo "-lbpf")
  
Ferruh Yigit April 2, 2019, 7:43 p.m. UTC | #3
On 4/2/2019 4:46 PM, Xiaolong Ye wrote:
> Add a new PMD driver for AF_XDP which is a proposed faster version of
> AF_PACKET interface in Linux. More info about AF_XDP, please refer to [1]
> [2].
> 
> This is the vanilla version PMD which just uses a raw buffer registered as
> the umem.
> 
> [1] https://fosdem.org/2018/schedule/event/af_xdp/
> [2] https://lwn.net/Articles/745934/
> 
> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>

<...>

> @@ -0,0 +1,956 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2019 Intel Corporation.
> + */

> +#include <bpf/bpf.h>
> +#include <xsk.h>

Under Linux, both headers are in the same 'bpf' folder, so why is one included
as 'bpf/bpf.h' but the other as 'xsk.h'?

Perhaps this is not a problem when the headers are installed into system
folders, but I am compiling using RTE_KERNELDIR, which is used in the Makefile
as:
 -I$(RTE_KERNELDIR)/tools/lib/bpf

This fails to find 'bpf/bpf.h'.

Also, for '-lbpf', don't we need to add '-L$(RTE_KERNELDIR)/tools/lib/bpf' to
the newly added line in 'rte.app.mk' so that the library can be found?

I assume you are building on a system with a new kernel, which I think you need
for functionality. Where is 'xsk.h' located in that case? I was thinking that
building and installing libbpf could solve the issue, but it does not install
'xsk.h' (I am not sure why), so that does not fully solve it.

If you still need "CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf" for your case,
does it make sense to update as follows:
 CFLAGS += -I$(RTE_KERNELDIR)/tools/lib
 #include <bpf/xsk.h>
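
To illustrate the include-path question, here is a minimal sketch, assuming a
compiler that supports '__has_include' (GCC and Clang do); the function name
'xsk_header_spelling' is made up for illustration. It reports which spelling of
the header the current -I flags can resolve. Absent a system-wide install,
'-I$(RTE_KERNELDIR)/tools/lib' should make only <bpf/xsk.h> resolvable, while
'-I$(RTE_KERNELDIR)/tools/lib/bpf' should make only <xsk.h> resolvable.

```c
/* __has_include is a GCC/Clang extension (standardized in C23).
 * Fall back to "not found" on compilers that lack it. */
#ifndef __has_include
#define __has_include(x) 0
#endif

/* Hypothetical helper: returns the include spelling that the current
 * header search path can resolve, or "none" if neither is reachable. */
const char *xsk_header_spelling(void)
{
#if __has_include(<bpf/xsk.h>)
	return "<bpf/xsk.h>";
#elif __has_include(<xsk.h>)
	return "<xsk.h>";
#else
	return "none";
#endif
}
```

Printing the returned string from a small main() built with each candidate
CFLAGS line shows which '#include' form that line supports.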
  
Xiaolong Ye April 2, 2019, 11:01 p.m. UTC | #4
On 04/02, Stephen Hemminger wrote:
>On Tue,  2 Apr 2019 23:46:53 +0800
>Xiaolong Ye <xiaolong.ye@intel.com> wrote:
>
>> +		/* pull from complete qeueu to leave more space */
>
>Overall looks good, one last spelling error

Sorry for the typo, will fix it in the next version.

Thanks,
Xiaolong
  
Xiaolong Ye April 3, 2019, 9:59 a.m. UTC | #5
Hi, Luca

On 04/02, Luca Boccassi wrote:
>On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
>> diff --git a/drivers/net/af_xdp/Makefile
>> b/drivers/net/af_xdp/Makefile
>> new file mode 100644
>> index 000000000..8343e3016
>> --- /dev/null
>> +++ b/drivers/net/af_xdp/Makefile
>> @@ -0,0 +1,32 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright(c) 2019 Intel Corporation
>> +
>> +include $(RTE_SDK)/mk/rte.vars.mk
>> +
>> +#
>> +# library name
>> +#
>> +LIB = librte_pmd_af_xdp.a
>> +
>> +EXPORT_MAP := rte_pmd_af_xdp_version.map
>> +
>> +LIBABIVER := 1
>> +
>> +CFLAGS += -O3
>> +
>> +# require kernel version >= v5.1-rc1
>> +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
>> +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
>
>Sorry for not noticing this before, but doesn't this require the full
>kernel tree rather than just the typical headers package? Requiring the
>full kernel tree to be available at build time will make this
>unbuildable on distros that still use makefiles, like RHEL and SUSE. At
>least on Debian and Ubuntu, the kernel headers packages distributed do
>not include the full kernel tree, only the headers, so there's no
>tools/lib or tools/include.

Currently we do depend on the kernel source tree, as xsk.h and asm/barrier.h
are not installed by libbpf. Until libbpf handles these properly, can we keep
the current RTE_KERNELDIR in the Makefile for now, mention the dependencies in
the documentation, and suggest that users set RTE_KERNELDIR to the correct
kernel source tree if they want to use the af_xdp PMD?

Something like:

dependencies:
- kernel source code (>= v5.1-rc1)
- build libbpf and install

Thanks,
Xiaolong
>
>Like other dependencies, this should assume they are installed as
>regular libraries, eg:
>
>CFLAGS += $(shell command -v pkg-config > /dev/null 2>&1 && pkg-config --cflags libbpf || echo "-I/usr/include/bpf")
>
>> +CFLAGS += $(WERROR_FLAGS)
>> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
>> +LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
>> +LDLIBS += -lrte_bus_vdev
>> +LDLIBS += -lbpf
>
>LDLIBS += $(shell command -v pkg-config > /dev/null 2>&1 && pkg-config --libs libbpf || echo "-lbpf")
>
>-- 
>Kind regards,
>Luca Boccassi
  
Luca Boccassi April 3, 2019, 10:36 a.m. UTC | #6
On Wed, 2019-04-03 at 17:59 +0800, Ye Xiaolong wrote:
> Hi, Luca
> 
> On 04/02, Luca Boccassi wrote:
> > On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
> > > diff --git a/drivers/net/af_xdp/Makefile
> > > b/drivers/net/af_xdp/Makefile
> > > new file mode 100644
> > > index 000000000..8343e3016
> > > --- /dev/null
> > > +++ b/drivers/net/af_xdp/Makefile
> > > @@ -0,0 +1,32 @@
> > > +# SPDX-License-Identifier: BSD-3-Clause
> > > +# Copyright(c) 2019 Intel Corporation
> > > +
> > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > +
> > > +#
> > > +# library name
> > > +#
> > > +LIB = librte_pmd_af_xdp.a
> > > +
> > > +EXPORT_MAP := rte_pmd_af_xdp_version.map
> > > +
> > > +LIBABIVER := 1
> > > +
> > > +CFLAGS += -O3
> > > +
> > > +# require kernel version >= v5.1-rc1
> > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
> > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
> > 
> > Sorry for not noticing this before, but doesn't this require the
> > full
> > kernel tree rather than just the typical headers package? Requiring
> > the
> > full kernel tree to be available at build time will make this
> > unbuildable on distros that still use makefiles, like RHEL and
> > SUSE. At
> > least on Debian and Ubuntu, the kernel headers packages distributed
> > do
> > not include the full kernel tree, only the headers, so there's no
> > tools/lib or tools/include.
> 
> Currently we do have dependencies on the kernel src tree, as xsk.h
> and
> asm/barrier wouldn't be installed by libbpf, so before libbpf handles
> these
> properly, can we keep the current RTE_KERNELDIR in Makefile for now,
> and mention
> the dependencies in document, also suggest users to config
> RTE_KERNELDIR to correct
> kernel src tree if they want to use af_xdp pmd?
> 
> Something like:
> 
> dependencies:
> - kernel source code (>= v5.1-rc1)
> - build libbpf and install
> 
> Thanks,
> Xiaolong

asm/barrier.h is installed by the kernel headers packages so it would
be fine (although not ideal) and not need the full source tree.
xsk.h is a bit more worrying, as it looks like an internal header from
here.

Is it really necessary for external applications to use an internal-
only header and a kernel header to be able to use libbpf?
  
Luca Boccassi April 3, 2019, 10:42 a.m. UTC | #7
On Wed, 2019-04-03 at 11:36 +0100, Luca Boccassi wrote:
> On Wed, 2019-04-03 at 17:59 +0800, Ye Xiaolong wrote:
> > Hi, Luca
> > 
> > On 04/02, Luca Boccassi wrote:
> > > On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
> > > > diff --git a/drivers/net/af_xdp/Makefile
> > > > b/drivers/net/af_xdp/Makefile
> > > > new file mode 100644
> > > > index 000000000..8343e3016
> > > > --- /dev/null
> > > > +++ b/drivers/net/af_xdp/Makefile
> > > > @@ -0,0 +1,32 @@
> > > > +# SPDX-License-Identifier: BSD-3-Clause
> > > > +# Copyright(c) 2019 Intel Corporation
> > > > +
> > > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > > +
> > > > +#
> > > > +# library name
> > > > +#
> > > > +LIB = librte_pmd_af_xdp.a
> > > > +
> > > > +EXPORT_MAP := rte_pmd_af_xdp_version.map
> > > > +
> > > > +LIBABIVER := 1
> > > > +
> > > > +CFLAGS += -O3
> > > > +
> > > > +# require kernel version >= v5.1-rc1
> > > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
> > > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
> > > 
> > > Sorry for not noticing this before, but doesn't this require the
> > > full
> > > kernel tree rather than just the typical headers package?
> > > Requiring
> > > the
> > > full kernel tree to be available at build time will make this
> > > unbuildable on distros that still use makefiles, like RHEL and
> > > SUSE. At
> > > least on Debian and Ubuntu, the kernel headers packages
> > > distributed
> > > do
> > > not include the full kernel tree, only the headers, so there's no
> > > tools/lib or tools/include.
> > 
> > Currently we do have dependencies on the kernel src tree, as xsk.h
> > and
> > asm/barrier wouldn't be installed by libbpf, so before libbpf
> > handles
> > these
> > properly, can we keep the current RTE_KERNELDIR in Makefile for
> > now,
> > and mention
> > the dependencies in document, also suggest users to config
> > RTE_KERNELDIR to correct
> > kernel src tree if they want to use af_xdp pmd?
> > 
> > Something like:
> > 
> > dependencies:
> > - kernel source code (>= v5.1-rc1)
> > - build libbpf and install
> > 
> > Thanks,
> > Xiaolong
> 
> asm/barrier.h is installed by the kernel headers packages so it would
> be fine (although not ideal) and not need the full source tree.
> xsk.h is a bit more worrying, as it looks like an internal header
> from
> here.
> 
> Is it really necessary for external applications to use an internal-
> only header and a kernel header to be able to use libbpf?

Actually, xsk.h is now installed by the library makefile:

https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=379e2014c95b

So the full kernel source tree is no longer required.

Is asm/barrier.h really required? Isn't there a userspace alternative?

Also, the license in asm/barrier.h is GPL-2.0 only. It is not a
userspace header so it is not covered by the userspace exception, which
means at the very least the af_xdp PMD shared object is also licensed
under GPL-2.0 only, isn't it?
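
On the userspace-alternative question, here is a rough sketch only (it is not
what libbpf or this PMD does): the ordering that write and read barriers give
a single-producer/single-consumer ring can be expressed with C11 <stdatomic.h>
fences, which need no kernel header. The 'demo_ring' type and its functions are
hypothetical names for illustration. DPDK's own rte_atomic.h also offers
rte_smp_wmb() and rte_smp_rmb().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u /* must be a power of two */

/* Hypothetical single-producer/single-consumer ring, illustration only. */
struct demo_ring {
	_Atomic uint32_t prod; /* written by the producer only */
	_Atomic uint32_t cons; /* written by the consumer only */
	uint64_t desc[DEMO_RING_SIZE];
};

/* Producer side: fill the slot first, then publish the new index. */
bool demo_ring_put(struct demo_ring *r, uint64_t v)
{
	uint32_t prod = atomic_load_explicit(&r->prod, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store of 'cons'. */
	uint32_t cons = atomic_load_explicit(&r->cons, memory_order_acquire);

	if (prod - cons == DEMO_RING_SIZE)
		return false; /* ring is full */

	r->desc[prod & (DEMO_RING_SIZE - 1)] = v;
	/* Release fence acts as the write barrier: the slot store is
	 * ordered before the index store below. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->prod, prod + 1, memory_order_relaxed);
	return true;
}

/* Consumer side: read the index first, then the slot it covers. */
bool demo_ring_get(struct demo_ring *r, uint64_t *v)
{
	uint32_t cons = atomic_load_explicit(&r->cons, memory_order_relaxed);
	uint32_t prod = atomic_load_explicit(&r->prod, memory_order_relaxed);

	if (prod == cons)
		return false; /* ring is empty */

	/* Acquire fence acts as the read barrier: the index load above is
	 * ordered before the slot load below. */
	atomic_thread_fence(memory_order_acquire);
	*v = r->desc[cons & (DEMO_RING_SIZE - 1)];
	/* Hand the slot back to the producer. */
	atomic_store_explicit(&r->cons, cons + 1, memory_order_release);
	return true;
}
```

A zero-initialized 'struct demo_ring' starts out empty. Whether fences like
these could replace the barriers the PMD needs is exactly the open question
here, so treat this as an illustration of the idea rather than a drop-in fix.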
  
Ferruh Yigit April 3, 2019, 11:18 a.m. UTC | #8
On 4/3/2019 11:42 AM, Luca Boccassi wrote:
> On Wed, 2019-04-03 at 11:36 +0100, Luca Boccassi wrote:
>> On Wed, 2019-04-03 at 17:59 +0800, Ye Xiaolong wrote:
>>> Hi, Luca
>>>
>>> On 04/02, Luca Boccassi wrote:
>>>> On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
>>>>> diff --git a/drivers/net/af_xdp/Makefile
>>>>> b/drivers/net/af_xdp/Makefile
>>>>> new file mode 100644
>>>>> index 000000000..8343e3016
>>>>> --- /dev/null
>>>>> +++ b/drivers/net/af_xdp/Makefile
>>>>> @@ -0,0 +1,32 @@
>>>>> +# SPDX-License-Identifier: BSD-3-Clause
>>>>> +# Copyright(c) 2019 Intel Corporation
>>>>> +
>>>>> +include $(RTE_SDK)/mk/rte.vars.mk
>>>>> +
>>>>> +#
>>>>> +# library name
>>>>> +#
>>>>> +LIB = librte_pmd_af_xdp.a
>>>>> +
>>>>> +EXPORT_MAP := rte_pmd_af_xdp_version.map
>>>>> +
>>>>> +LIBABIVER := 1
>>>>> +
>>>>> +CFLAGS += -O3
>>>>> +
>>>>> +# require kernel version >= v5.1-rc1
>>>>> +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
>>>>> +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
>>>>
>>>> Sorry for not noticing this before, but doesn't this require the
>>>> full
>>>> kernel tree rather than just the typical headers package?
>>>> Requiring
>>>> the
>>>> full kernel tree to be available at build time will make this
>>>> unbuildable on distros that still use makefiles, like RHEL and
>>>> SUSE. At
>>>> least on Debian and Ubuntu, the kernel headers packages
>>>> distributed
>>>> do
>>>> not include the full kernel tree, only the headers, so there's no
>>>> tools/lib or tools/include.
>>>
>>> Currently we do have dependencies on the kernel src tree, as xsk.h
>>> and
>>> asm/barrier wouldn't be installed by libbpf, so before libbpf
>>> handles
>>> these
>>> properly, can we keep the current RTE_KERNELDIR in Makefile for
>>> now,
>>> and mention
>>> the dependencies in document, also suggest users to config
>>> RTE_KERNELDIR to correct
>>> kernel src tree if they want to use af_xdp pmd?
>>>
>>> Something like:
>>>
>>> dependencies:
>>> - kernel source code (>= v5.1-rc1)
>>> - build libbpf and install
>>>
>>> Thanks,
>>> Xiaolong
>>
>> asm/barrier.h is installed by the kernel headers packages so it would
>> be fine (although not ideal) and not need the full source tree.
>> xsk.h is a bit more worrying, as it looks like an internal header
>> from
>> here.
>>
>> Is it really necessary for external applications to use an internal-
>> only header and a kernel header to be able to use libbpf?
> 
> Actually, xsk.h is now installed by the library makefile:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=379e2014c95b

Good to have this one. But again, it is in the BPF tree and it won't be in 5.1.

For now I suggested changing the code as follows, which would help keep the
changes small once the above patch is merged into the kernel:
 CFLAGS += -I$(RTE_KERNELDIR)/tools/lib [in makefile]
 #include <bpf/xsk.h>                   [in .c file]

> 
> So the full kernel source tree is no longer required.
> 
> Is asm/barrier.h really required? Isn't there an userspace alternative?

The 'asm/barrier.h' in the kernel headers and 'tools/include/asm/barrier.h'
look different; the one in the kernel source depends on other kernel
headers.

I wonder the same thing: what is used from 'tools/include/asm/barrier.h', and
can it be avoided?

Anyway, as Xiaolong mentioned, the following works. Can it work from a distro
point of view?
- get kernel source code (>= v5.1-rc1)
- build libbpf and install
- set 'RTE_KERNELDIR' to point to the kernel source path
- build dpdk with af_xdp enabled
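
As a possible sanity check for the first step, here is a small sketch, assuming
<linux/version.h> is reachable on the include path (the helper name
'demo_headers_at_least' is hypothetical). It compares the version of the kernel
headers the compiler sees against a minimum. Note that it only reflects those
headers, not the running kernel, it cannot distinguish release candidates such
as v5.1-rc1, and it does not prove that tools/lib/bpf is present.

```c
#include <stdbool.h>

/* __has_include is a GCC/Clang extension (standardized in C23). */
#ifndef __has_include
#define __has_include(x) 0
#endif

#if __has_include(<linux/version.h>)
#include <linux/version.h>
#endif

/* Fallbacks so the sketch still compiles where the header is missing;
 * a version code of 0 then means "unknown / too old". */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE 0
#endif

/* Hypothetical helper: true if the visible kernel headers are at least
 * version major.minor.0. */
bool demo_headers_at_least(int major, int minor)
{
	return LINUX_VERSION_CODE >= KERNEL_VERSION(major, minor, 0);
}
```

For the dependency listed above, the call would be demo_headers_at_least(5, 1).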

> 
> Also, the license in asm/barrier.h is GPL-2.0 only. It is not a
> userspace header so it is not covered by the userspace exception, which
> means at the very least the af_xdp PMD shared object is also licensed
> under GPL-2.0 only, isn't it?
>
  
Luca Boccassi April 3, 2019, 11:35 a.m. UTC | #9
On Wed, 2019-04-03 at 12:18 +0100, Ferruh Yigit wrote:
> On 4/3/2019 11:42 AM, Luca Boccassi wrote:
> > On Wed, 2019-04-03 at 11:36 +0100, Luca Boccassi wrote:
> > > On Wed, 2019-04-03 at 17:59 +0800, Ye Xiaolong wrote:
> > > > Hi, Luca
> > > > 
> > > > On 04/02, Luca Boccassi wrote:
> > > > > On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
> > > > > > diff --git a/drivers/net/af_xdp/Makefile
> > > > > > b/drivers/net/af_xdp/Makefile
> > > > > > new file mode 100644
> > > > > > index 000000000..8343e3016
> > > > > > --- /dev/null
> > > > > > +++ b/drivers/net/af_xdp/Makefile
> > > > > > @@ -0,0 +1,32 @@
> > > > > > +# SPDX-License-Identifier: BSD-3-Clause
> > > > > > +# Copyright(c) 2019 Intel Corporation
> > > > > > +
> > > > > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > +
> > > > > > +#
> > > > > > +# library name
> > > > > > +#
> > > > > > +LIB = librte_pmd_af_xdp.a
> > > > > > +
> > > > > > +EXPORT_MAP := rte_pmd_af_xdp_version.map
> > > > > > +
> > > > > > +LIBABIVER := 1
> > > > > > +
> > > > > > +CFLAGS += -O3
> > > > > > +
> > > > > > +# require kernel version >= v5.1-rc1
> > > > > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
> > > > > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
> > > > > 
> > > > > Sorry for not noticing this before, but doesn't this require
> > > > > the
> > > > > full
> > > > > kernel tree rather than just the typical headers package?
> > > > > Requiring
> > > > > the
> > > > > full kernel tree to be available at build time will make this
> > > > > unbuildable on distros that still use makefiles, like RHEL
> > > > > and
> > > > > SUSE. At
> > > > > least on Debian and Ubuntu, the kernel headers packages
> > > > > distributed
> > > > > do
> > > > > not include the full kernel tree, only the headers, so
> > > > > there's no
> > > > > tools/lib or tools/include.
> > > > 
> > > > Currently we do have dependencies on the kernel src tree, as
> > > > xsk.h
> > > > and
> > > > asm/barrier wouldn't be installed by libbpf, so before libbpf
> > > > handles
> > > > these
> > > > properly, can we keep the current RTE_KERNELDIR in Makefile for
> > > > now,
> > > > and mention
> > > > the dependencies in document, also suggest users to config
> > > > RTE_KERNELDIR to correct
> > > > kernel src tree if they want to use af_xdp pmd?
> > > > 
> > > > Something like:
> > > > 
> > > > dependencies:
> > > > - kernel source code (>= v5.1-rc1)
> > > > - build libbpf and install
> > > > 
> > > > Thanks,
> > > > Xiaolong
> > > 
> > > asm/barrier.h is installed by the kernel headers packages so it
> > > would
> > > be fine (although not ideal) and not need the full source tree.
> > > xsk.h is a bit more worrying, as it looks like an internal header
> > > from
> > > here.
> > > 
> > > Is it really necessary for external applications to use an
> > > internal-
> > > only header and a kernel header to be able to use libbpf?
> > 
> > Actually, xsk.h is now installed by the library makefile:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=379e2014c95b
> > 
> 
> Good to have this one. But again it is in BPF tree and it won't be in
> 5.1.

It looks like a small and required bug fix to me, and 5.1 is still in
RC state, so perhaps there's still time.

Bjorn and Magnus, any chance the above makefile install fix could be
sent for inclusion in 5.1-rc4?

> I suggested changing code as following for now, it would help to keep
> changes
> small when above patch merged into kernel:
>  CFLAGS += -I$(RTE_KERNELDIR)/tools/lib [in makefile]
>  #include <bpf/xsk.h>                   [in .c file]
> 
> > So the full kernel source tree is no longer required.
> > 
> > Is asm/barrier.h really required? Isn't there an userspace
> > alternative?
> 
> The 'asm/barrier.h' in the kernel headers and the
> 'tools/include/asm/barrier.h'
> looks different, the one in the kernel source has dependency to other
> kernel
> headers.
> 
> I wonder same thing, what is used from 'tools/include/asm/barrier.h'
> and if it
> can be avoided.

The one in tools/include also is GPL-2.0 only so it cannot be included
from the PMD, which is BSD-3-clause only (and it recursively includes
the other arch-specific kernel headers)

> Anyway, as Xiaolong mentioned, following is working, can it work from
> a distro
> point of view:
> - get kernel source code (>= v5.1-rc1)
> - build libbpf and install
> - set 'RTE_KERNELDIR' to point kernel source path
> - build dpdk with af_xdp enabled

As long as the full kernel tree is required, we cannot enable it in
Debian and Ubuntu - we can't have it at build time on the build
workers, and also there's the licensing problem.

> > Also, the license in asm/barrier.h is GPL-2.0 only. It is not a
> > userspace header so it is not covered by the userspace exception,
> > which
> > means at the very least the af_xdp PMD shared object is also
> > licensed
> > under GPL-2.0 only, isn't it?
> > 
> 
>
  
Luca Boccassi April 3, 2019, 12:16 p.m. UTC | #10
On Wed, 2019-04-03 at 12:35 +0100, Luca Boccassi wrote:
> On Wed, 2019-04-03 at 12:18 +0100, Ferruh Yigit wrote:
> > On 4/3/2019 11:42 AM, Luca Boccassi wrote:
> > > On Wed, 2019-04-03 at 11:36 +0100, Luca Boccassi wrote:
> > > > On Wed, 2019-04-03 at 17:59 +0800, Ye Xiaolong wrote:
> > > > > Hi, Luca
> > > > > 
> > > > > On 04/02, Luca Boccassi wrote:
> > > > > > On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
> > > > > > > diff --git a/drivers/net/af_xdp/Makefile
> > > > > > > b/drivers/net/af_xdp/Makefile
> > > > > > > new file mode 100644
> > > > > > > index 000000000..8343e3016
> > > > > > > --- /dev/null
> > > > > > > +++ b/drivers/net/af_xdp/Makefile
> > > > > > > @@ -0,0 +1,32 @@
> > > > > > > +# SPDX-License-Identifier: BSD-3-Clause
> > > > > > > +# Copyright(c) 2019 Intel Corporation
> > > > > > > +
> > > > > > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > +
> > > > > > > +#
> > > > > > > +# library name
> > > > > > > +#
> > > > > > > +LIB = librte_pmd_af_xdp.a
> > > > > > > +
> > > > > > > +EXPORT_MAP := rte_pmd_af_xdp_version.map
> > > > > > > +
> > > > > > > +LIBABIVER := 1
> > > > > > > +
> > > > > > > +CFLAGS += -O3
> > > > > > > +
> > > > > > > +# require kernel version >= v5.1-rc1
> > > > > > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
> > > > > > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
> > > > > > 
> > > > > > Sorry for not noticing this before, but doesn't this
> > > > > > require
> > > > > > the
> > > > > > full
> > > > > > kernel tree rather than just the typical headers package?
> > > > > > Requiring
> > > > > > the
> > > > > > full kernel tree to be available at build time will make
> > > > > > this
> > > > > > unbuildable on distros that still use makefiles, like RHEL
> > > > > > and
> > > > > > SUSE. At
> > > > > > least on Debian and Ubuntu, the kernel headers packages
> > > > > > distributed
> > > > > > do
> > > > > > not include the full kernel tree, only the headers, so
> > > > > > there's no
> > > > > > tools/lib or tools/include.
> > > > > 
> > > > > Currently we do have dependencies on the kernel src tree, as
> > > > > xsk.h
> > > > > and
> > > > > asm/barrier wouldn't be installed by libbpf, so before libbpf
> > > > > handles
> > > > > these
> > > > > properly, can we keep the current RTE_KERNELDIR in Makefile
> > > > > for
> > > > > now,
> > > > > and mention
> > > > > the dependencies in document, also suggest users to config
> > > > > RTE_KERNELDIR to correct
> > > > > kernel src tree if they want to use af_xdp pmd?
> > > > > 
> > > > > Something like:
> > > > > 
> > > > > dependencies:
> > > > > - kernel source code (>= v5.1-rc1)
> > > > > - build libbpf and install
> > > > > 
> > > > > Thanks,
> > > > > Xiaolong
> > > > 
> > > > asm/barrier.h is installed by the kernel headers packages so it
> > > > would
> > > > be fine (although not ideal) and not need the full source tree.
> > > > xsk.h is a bit more worrying, as it looks like an internal
> > > > header
> > > > from
> > > > here.
> > > > 
> > > > Is it really necessary for external applications to use an
> > > > internal-
> > > > only header and a kernel header to be able to use libbpf?
> > > 
> > > Actually, xsk.h is now installed by the library makefile:
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=379e2014c95b
> > > 
> > > 
> > 
> > Good to have this one. But again it is in BPF tree and it won't be
> > in
> > 5.1.
> 
> It looks like a small and required bug fix to me, and 5.1 is still in
> RC state, so perhaps there's still time.
> 
> Bjorn and Magnus, any chance the above makefile install fix could be
> sent for inclusion in 5.1-rc4?

Actually the bpf tree was already merged in the net tree a couple of
days ago. As far as I understand from the process, this should mean
that this fix should be set for inclusion in Linus' tree in time for
5.1-rc4:

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit?id=379e2014c95b7a454713da822b8ef4ec51ab8a75
  
Ferruh Yigit April 3, 2019, 12:33 p.m. UTC | #11
On 4/3/2019 1:16 PM, Luca Boccassi wrote:
> On Wed, 2019-04-03 at 12:35 +0100, Luca Boccassi wrote:
>> On Wed, 2019-04-03 at 12:18 +0100, Ferruh Yigit wrote:
>>> On 4/3/2019 11:42 AM, Luca Boccassi wrote:
>>>> On Wed, 2019-04-03 at 11:36 +0100, Luca Boccassi wrote:
>>>>> On Wed, 2019-04-03 at 17:59 +0800, Ye Xiaolong wrote:
>>>>>> Hi, Luca
>>>>>>
>>>>>> On 04/02, Luca Boccassi wrote:
>>>>>>> On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
>>>>>>>> diff --git a/drivers/net/af_xdp/Makefile
>>>>>>>> b/drivers/net/af_xdp/Makefile
>>>>>>>> new file mode 100644
>>>>>>>> index 000000000..8343e3016
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/drivers/net/af_xdp/Makefile
>>>>>>>> @@ -0,0 +1,32 @@
>>>>>>>> +# SPDX-License-Identifier: BSD-3-Clause
>>>>>>>> +# Copyright(c) 2019 Intel Corporation
>>>>>>>> +
>>>>>>>> +include $(RTE_SDK)/mk/rte.vars.mk
>>>>>>>> +
>>>>>>>> +#
>>>>>>>> +# library name
>>>>>>>> +#
>>>>>>>> +LIB = librte_pmd_af_xdp.a
>>>>>>>> +
>>>>>>>> +EXPORT_MAP := rte_pmd_af_xdp_version.map
>>>>>>>> +
>>>>>>>> +LIBABIVER := 1
>>>>>>>> +
>>>>>>>> +CFLAGS += -O3
>>>>>>>> +
>>>>>>>> +# require kernel version >= v5.1-rc1
>>>>>>>> +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
>>>>>>>> +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
>>>>>>>
>>>>>>> Sorry for not noticing this before, but doesn't this
>>>>>>> require
>>>>>>> the
>>>>>>> full
>>>>>>> kernel tree rather than just the typical headers package?
>>>>>>> Requiring
>>>>>>> the
>>>>>>> full kernel tree to be available at build time will make
>>>>>>> this
>>>>>>> unbuildable on distros that still use makefiles, like RHEL
>>>>>>> and
>>>>>>> SUSE. At
>>>>>>> least on Debian and Ubuntu, the kernel headers packages
>>>>>>> distributed
>>>>>>> do
>>>>>>> not include the full kernel tree, only the headers, so
>>>>>>> there's no
>>>>>>> tools/lib or tools/include.
>>>>>>
>>>>>> Currently we do have dependencies on the kernel src tree, as
>>>>>> xsk.h
>>>>>> and
>>>>>> asm/barrier wouldn't be installed by libbpf, so before libbpf
>>>>>> handles
>>>>>> these
>>>>>> properly, can we keep the current RTE_KERNELDIR in Makefile
>>>>>> for
>>>>>> now,
>>>>>> and mention
>>>>>> the dependencies in document, also suggest users to config
>>>>>> RTE_KERNELDIR to correct
>>>>>> kernel src tree if they want to use af_xdp pmd?
>>>>>>
>>>>>> Something like:
>>>>>>
>>>>>> dependencies:
>>>>>> - kernel source code (>= v5.1-rc1)
>>>>>> - build libbpf and install
>>>>>>
>>>>>> Thanks,
>>>>>> Xiaolong
>>>>>
>>>>> asm/barrier.h is installed by the kernel headers packages so it
>>>>> would
>>>>> be fine (although not ideal) and not need the full source tree.
>>>>> xsk.h is a bit more worrying, as it looks like an internal
>>>>> header
>>>>> from
>>>>> here.
>>>>>
>>>>> Is it really necessary for external applications to use an
>>>>> internal-
>>>>> only header and a kernel header to be able to use libbpf?
>>>>
>>>> Actually, xsk.h is now installed by the library makefile:
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=379e2014c95b
>>>>
>>>>
>>>
>>> Good to have this one. But again it is in BPF tree and it won't be
>>> in
>>> 5.1.
>>
>> It looks like a small and required bug fix to me, and 5.1 is still in
>> RC state, so perhaps there's still time.
>>
>> Bjorn and Magnus, any chance the above makefile install fix could be
>> sent for inclusion in 5.1-rc4?
> 
> Actually the bpf tree was already merged in the net tree a couple of
> days ago. As far as I understand from the process, this should mean
> that this fix should be set for inclusion in Linus' tree in time for
> 5.1-rc4:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit?id=379e2014c95b7a454713da822b8ef4ec51ab8a75
> 

My bad, it seems the 'bpf' tree can be merged into the 'net' tree, and there is
also a separate 'bpf-next' tree for future releases.

So I believe it is OK to expect the fixes you pointed to in the 'bpf' tree to be
in 5.1.
  
Ferruh Yigit April 3, 2019, 1:09 p.m. UTC | #12
On 4/3/2019 12:35 PM, Luca Boccassi wrote:
> On Wed, 2019-04-03 at 12:18 +0100, Ferruh Yigit wrote:
>> On 4/3/2019 11:42 AM, Luca Boccassi wrote:
>>> On Wed, 2019-04-03 at 11:36 +0100, Luca Boccassi wrote:
>>>> On Wed, 2019-04-03 at 17:59 +0800, Ye Xiaolong wrote:
>>>>> Hi, Luca
>>>>>
>>>>> On 04/02, Luca Boccassi wrote:
>>>>>> On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
>>>>>>> diff --git a/drivers/net/af_xdp/Makefile
>>>>>>> b/drivers/net/af_xdp/Makefile
>>>>>>> new file mode 100644
>>>>>>> index 000000000..8343e3016
>>>>>>> --- /dev/null
>>>>>>> +++ b/drivers/net/af_xdp/Makefile
>>>>>>> @@ -0,0 +1,32 @@
>>>>>>> +# SPDX-License-Identifier: BSD-3-Clause
>>>>>>> +# Copyright(c) 2019 Intel Corporation
>>>>>>> +
>>>>>>> +include $(RTE_SDK)/mk/rte.vars.mk
>>>>>>> +
>>>>>>> +#
>>>>>>> +# library name
>>>>>>> +#
>>>>>>> +LIB = librte_pmd_af_xdp.a
>>>>>>> +
>>>>>>> +EXPORT_MAP := rte_pmd_af_xdp_version.map
>>>>>>> +
>>>>>>> +LIBABIVER := 1
>>>>>>> +
>>>>>>> +CFLAGS += -O3
>>>>>>> +
>>>>>>> +# require kernel version >= v5.1-rc1
>>>>>>> +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
>>>>>>> +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
>>>>>>
>>>>>> Sorry for not noticing this before, but doesn't this require
>>>>>> the
>>>>>> full
>>>>>> kernel tree rather than just the typical headers package?
>>>>>> Requiring
>>>>>> the
>>>>>> full kernel tree to be available at build time will make this
>>>>>> unbuildable on distros that still use makefiles, like RHEL
>>>>>> and
>>>>>> SUSE. At
>>>>>> least on Debian and Ubuntu, the kernel headers packages
>>>>>> distributed
>>>>>> do
>>>>>> not include the full kernel tree, only the headers, so
>>>>>> there's no
>>>>>> tools/lib or tools/include.
>>>>>
>>>>> Currently we do have dependencies on the kernel src tree, as
>>>>> xsk.h
>>>>> and
>>>>> asm/barrier wouldn't be installed by libbpf, so before libbpf
>>>>> handles
>>>>> these
>>>>> properly, can we keep the current RTE_KERNELDIR in Makefile for
>>>>> now,
>>>>> and mention
>>>>> the dependencies in document, also suggest users to config
>>>>> RTE_KERNELDIR to correct
>>>>> kernel src tree if they want to use af_xdp pmd?
>>>>>
>>>>> Something like:
>>>>>
>>>>> dependencies:
>>>>> - kernel source code (>= v5.1-rc1)
>>>>> - build libbpf and install
>>>>>
>>>>> Thanks,
>>>>> Xiaolong
>>>>
>>>> asm/barrier.h is installed by the kernel headers packages so it
>>>> would
>>>> be fine (although not ideal) and not need the full source tree.
>>>> xsk.h is a bit more worrying, as it looks like an internal header
>>>> from
>>>> here.
>>>>
>>>> Is it really necessary for external applications to use an
>>>> internal-
>>>> only header and a kernel header to be able to use libbpf?
>>>
>>> Actually, xsk.h is now installed by the library makefile:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=379e2014c95b
>>>
>>
>> Good to have this one. But again it is in BPF tree and it won't be in
>> 5.1.
> 
> It looks like a small and required bug fix to me, and 5.1 is still in
> RC state, so perhaps there's still time.
> 
> Bjorn and Magnus, any chance the above makefile install fix could be
> sent for inclusion in 5.1-rc4?
> 
>> I suggested changing code as following for now, it would help to keep
>> changes
>> small when above patch merged into kernel:
>>  CFLAGS += -I$(RTE_KERNELDIR)/tools/lib [in makefile]
>>  #include <bpf/xsk.h>                   [in .c file]
>>
>>> So the full kernel source tree is no longer required.
>>>
>>> Is asm/barrier.h really required? Isn't there an userspace
>>> alternative?
>>
>> The 'asm/barrier.h' in the kernel headers and the
>> 'tools/include/asm/barrier.h'
>> looks different, the one in the kernel source has dependency to other
>> kernel
>> headers.
>>
>> I wonder same thing, what is used from 'tools/include/asm/barrier.h'
>> and if it
>> can be avoided.

It seems 'tools/include/asm/barrier.h' is required for 'smp_wmb()' and
'smp_rmb()' in 'xsk.h'.
We have equivalents of these in DPDK [1], so it may be possible to use
them and not include this header at all.

In 'rte_eth_af_xdp.c', before including 'xsk.h', we can include a local
compatibility header. Something like the following should work:
#define smp_rmb() rte_rmb()
#define smp_wmb() rte_wmb()

@Xiaolong, what do you think?

[1]
https://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/arch/x86/rte_atomic.h?h=v19.02#n30
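
To make the idea above concrete, here is a minimal sketch, not the actual
patch. It shows the usual producer/consumer ring pattern in which a write
barrier precedes the producer index update and a read barrier follows the
index check, and how two defines can supply the smp_*() names. demo_rmb(),
demo_wmb() and struct demo_ring are hypothetical stand-ins built on plain C11
fences so the sketch compiles without DPDK; in the PMD the defines would map
to rte_rmb()/rte_wmb() instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins so the sketch builds without DPDK.
 * In the PMD these would be rte_rmb() and rte_wmb(). */
static inline void demo_rmb(void) { atomic_thread_fence(memory_order_acquire); }
static inline void demo_wmb(void) { atomic_thread_fence(memory_order_release); }

/* The shim: the names the ring code expects, mapped onto what we provide. */
#define smp_rmb() demo_rmb()
#define smp_wmb() demo_wmb()

#define DEMO_RING_SIZE 8u
static_assert((DEMO_RING_SIZE & (DEMO_RING_SIZE - 1u)) == 0,
	      "ring size must be a power of two");

/* Hypothetical single-producer/single-consumer ring, for illustration only. */
struct demo_ring {
	uint32_t producer;
	uint32_t consumer;
	uint64_t desc[DEMO_RING_SIZE];
};

/* Returns 0 on success, -1 if the ring is full. */
static inline int demo_ring_push(struct demo_ring *r, uint64_t v)
{
	if (r->producer - r->consumer == DEMO_RING_SIZE)
		return -1;
	r->desc[r->producer & (DEMO_RING_SIZE - 1u)] = v;
	smp_wmb(); /* descriptor must be written before the index moves */
	r->producer++;
	return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
static inline int demo_ring_pop(struct demo_ring *r, uint64_t *v)
{
	if (r->producer == r->consumer)
		return -1;
	smp_rmb(); /* index must be read before the descriptor */
	*v = r->desc[r->consumer & (DEMO_RING_SIZE - 1u)];
	r->consumer++;
	return 0;
}
```

Pushing one value and popping it back from a single thread only exercises the
code paths and shows where the barriers sit; it does not prove anything about
ordering between cores.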

> 
> The one in tools/include also is GPL-2.0 only so it cannot be included
> from the PMD, which is BSD-3-clause only (and it recursively includes
> the other arch-specific kernel headers)
> 
>> Anyway, as Xiaolong mentioned, following is working, can it work from
>> a distro
>> point of view:
>> - get kernel source code (>= v5.1-rc1)
>> - build libbpf and install
>> - set 'RTE_KERNELDIR' to point kernel source path
>> - build dpdk with af_xdp enabled
> 
> As long as the full kernel tree is required, we cannot enable it in
> Debian and Ubuntu - we can't have it at build time on the build
> workers, and also there's the licensing problem.

Got it.

In the above steps, 'libbpf' is also built from the kernel source tree. Would it
be a problem for your builds to not have it built from source?

If not, and taking into account that xsk.h will also be fixed, only
'tools/include/asm/barrier.h' remains a problem, and it looks like it can be
solved; please see above.


> 
>>> Also, the license in asm/barrier.h is GPL-2.0 only. It is not a
>>> userspace header so it is not covered by the userspace exception,
>>> which
>>> means at the very least the af_xdp PMD shared object is also
>>> licensed
>>> under GPL-2.0 only, isn't it?
>>>
>>
>>
  
Bruce Richardson April 3, 2019, 1:22 p.m. UTC | #13
On Tue, Apr 02, 2019 at 08:43:48PM +0100, Ferruh Yigit wrote:
> On 4/2/2019 4:46 PM, Xiaolong Ye wrote:
> > Add a new PMD driver for AF_XDP which is a proposed faster version of
> > AF_PACKET interface in Linux. More info about AF_XDP, please refer to [1]
> > [2].
> > 
> > This is the vanilla version PMD which just uses a raw buffer registered as
> > the umem.
> > 
> > [1] https://fosdem.org/2018/schedule/event/af_xdp/
> > [2] https://lwn.net/Articles/745934/
> > 
> > Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
> 
> <...>
> 
> > @@ -0,0 +1,956 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2019 Intel Corporation.
> > + */
> 
> > +#include <bpf/bpf.h>
> > +#include <xsk.h>
> 
> Under linux, both headers are in same 'bpf' folder, why one included as
> 'bpf/bpf.h' but other 'xsk.h'?
> 
> Perhaps this is not problem when headers are installed into system folders, but
> I am compiling using RTE_KERNELDIR, which used in Makefile as:
>  -I$(RTE_KERNELDIR)/tools/lib/bpf

When installed in system folders they will still need the "bpf" prefix. On
my system after running "make headers_install" in libbpf folder, the
headers are placed in "/usr/local/include/bpf/"

> 
> This fails to find 'bpf/bpf.h'
> 
> Also for '-lbpf', shouldn't need to add '-L$(RTE_KERNELDIR)/tools/lib/bpf', to
> new added line in 'rte.app.mk', so that it can find the library?
> 
> I assume you are building in a system with new kernel, I think you need this for
> functionality, where 'xsk.h' is located in that case? Because I was thinking
> building and installing libbpf can solve the issue but it is not installing
> 'xsk.h', not sure why, so not exactly solving.
> 
> if you still need "CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf" for your case,
> does it make sense update as following:
>  CFLAGS += -I$(RTE_KERNELDIR)/tools/lib
>  #include <bpf/xsk.h>

We should not include in any driver a cflag or ldflag that points to the
kernel dir. We should expect the headers for libbpf in a regular include
folder and the library itself in /usr/lib or /usr/local/lib.

/Bruce
  
Luca Boccassi April 3, 2019, 1:29 p.m. UTC | #14
On Wed, 2019-04-03 at 14:09 +0100, Ferruh Yigit wrote:
> On 4/3/2019 12:35 PM, Luca Boccassi wrote:
> > On Wed, 2019-04-03 at 12:18 +0100, Ferruh Yigit wrote:
> > > On 4/3/2019 11:42 AM, Luca Boccassi wrote:
> > > > On Wed, 2019-04-03 at 11:36 +0100, Luca Boccassi wrote:
> > > > > On Wed, 2019-04-03 at 17:59 +0800, Ye Xiaolong wrote:
> > > > > > Hi, Luca
> > > > > > 
> > > > > > On 04/02, Luca Boccassi wrote:
> > > > > > > On Tue, 2019-04-02 at 23:46 +0800, Xiaolong Ye wrote:
> > > > > > > > diff --git a/drivers/net/af_xdp/Makefile
> > > > > > > > b/drivers/net/af_xdp/Makefile
> > > > > > > > new file mode 100644
> > > > > > > > index 000000000..8343e3016
> > > > > > > > --- /dev/null
> > > > > > > > +++ b/drivers/net/af_xdp/Makefile
> > > > > > > > @@ -0,0 +1,32 @@
> > > > > > > > +# SPDX-License-Identifier: BSD-3-Clause
> > > > > > > > +# Copyright(c) 2019 Intel Corporation
> > > > > > > > +
> > > > > > > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > +
> > > > > > > > +#
> > > > > > > > +# library name
> > > > > > > > +#
> > > > > > > > +LIB = librte_pmd_af_xdp.a
> > > > > > > > +
> > > > > > > > +EXPORT_MAP := rte_pmd_af_xdp_version.map
> > > > > > > > +
> > > > > > > > +LIBABIVER := 1
> > > > > > > > +
> > > > > > > > +CFLAGS += -O3
> > > > > > > > +
> > > > > > > > +# require kernel version >= v5.1-rc1
> > > > > > > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/include
> > > > > > > > +CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
> > > > > > > 
> > > > > > > Sorry for not noticing this before, but doesn't this
> > > > > > > require
> > > > > > > the
> > > > > > > full
> > > > > > > kernel tree rather than just the typical headers package?
> > > > > > > Requiring
> > > > > > > the
> > > > > > > full kernel tree to be available at build time will make
> > > > > > > this
> > > > > > > unbuildable on distros that still use makefiles, like
> > > > > > > RHEL
> > > > > > > and
> > > > > > > SUSE. At
> > > > > > > least on Debian and Ubuntu, the kernel headers packages
> > > > > > > distributed
> > > > > > > do
> > > > > > > not include the full kernel tree, only the headers, so
> > > > > > > there's no
> > > > > > > tools/lib or tools/include.
> > > > > > 
> > > > > > Currently we do have dependencies on the kernel src tree,
> > > > > > as
> > > > > > xsk.h
> > > > > > and
> > > > > > asm/barrier wouldn't be installed by libbpf, so before
> > > > > > libbpf
> > > > > > handles
> > > > > > these
> > > > > > properly, can we keep the current RTE_KERNELDIR in Makefile
> > > > > > for
> > > > > > now,
> > > > > > and mention
> > > > > > the dependencies in document, also suggest users to config
> > > > > > RTE_KERNELDIR to correct
> > > > > > kernel src tree if they want to use af_xdp pmd?
> > > > > > 
> > > > > > Something like:
> > > > > > 
> > > > > > dependencies:
> > > > > > - kernel source code (>= v5.1-rc1)
> > > > > > - build libbpf and install
> > > > > > 
> > > > > > Thanks,
> > > > > > Xiaolong
> > > > > 
> > > > > asm/barrier.h is installed by the kernel headers packages so
> > > > > it
> > > > > would
> > > > > be fine (although not ideal) and not need the full source
> > > > > tree.
> > > > > xsk.h is a bit more worrying, as it looks like an internal
> > > > > header
> > > > > from
> > > > > here.
> > > > > 
> > > > > Is it really necessary for external applications to use an
> > > > > internal-
> > > > > only header and a kernel header to be able to use libbpf?
> > > > 
> > > > Actually, xsk.h is now installed by the library makefile:
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=379e2014c95b
> > > > 
> > > > 
> > > 
> > > Good to have this one. But again it is in BPF tree and it won't
> > > be in
> > > 5.1.
> > 
> > It looks like a small and required bug fix to me, and 5.1 is still
> > in
> > RC state, so perhaps there's still time.
> > 
> > Bjorn and Magnus, any chance the above makefile install fix could
> > be
> > sent for inclusion in 5.1-rc4?
> > 
> > > I suggested changing code as following for now, it would help to
> > > keep
> > > changes
> > > small when above patch merged into kernel:
> > >  CFLAGS += -I$(RTE_KERNELDIR)/tools/lib [in makefile]
> > >  #include <bpf/xsk.h>                   [in .c file]
> > > 
> > > > So the full kernel source tree is no longer required.
> > > > 
> > > > Is asm/barrier.h really required? Isn't there an userspace
> > > > alternative?
> > > 
> > > The 'asm/barrier.h' in the kernel headers and the
> > > 'tools/include/asm/barrier.h'
> > > looks different, the one in the kernel source has dependency to
> > > other
> > > kernel
> > > headers.
> > > 
> > > I wonder same thing, what is used from
> > > 'tools/include/asm/barrier.h'
> > > and if it
> > > can be avoided.
> 
> It seems, 'tools/include/asm/barrier.h' is required for 'smp_wmb()' &
> 'smp_rmb()' in 'xsk.h'.
> We have equivalents of these in DPDK [1], and perhaps it can be
> possible to use
> them and not include this header at all.
> 
> in 'rte_eth_af_xdp.c', before including 'xsk.h', we can include an
> local
> compatibility header which does following should work:
> #define smp_rmb() rte_rmb()
> #define smp_wmb() rte_wmb()
> 
> @Xiaolong, what do you think?
> 
> [1]
> https://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/arch/x86/rte_atomic.h?h=v19.02#n30

Perfect, that looks like a great solution for the PMD.

For the broader picture, now that xsk.h is a public userspace header,
it should at some point in the future be fixed to avoid depending on an
internal kernel definition for the barriers, and either ship its own
or depend on another public header that provides them. Otherwise every
application that wants to use bpf with xdp needs to provide its own
implementation - but this is not related to this patchset and we can
live without it for the moment in DPDK.

> > The one in tools/include also is GPL-2.0 only so it cannot be
> > included
> > from the PMD, which is BSD-3-clause only (and it recursively
> > includes
> > the other arch-specific kernel headers)
> > 
> > > Anyway, as Xiaolong mentioned, following is working, can it work
> > > from
> > > a distro
> > > point of view:
> > > - get kernel source code (>= v5.1-rc1)
> > > > - build libbpf and install
> > > - set 'RTE_KERNELDIR' to point kernel source path
> > > - build dpdk with af_xdp enabled
> > 
> > As long as the full kernel tree is required, we cannot enable it in
> > Debian and Ubuntu - we can't have it at build time on the build
> > workers, and also there's the licensing problem.
> 
> Got it.
> 
> In above steps, 'libbpf' also build from kernel source tree, will it
> be problem
> in you builds to not have it build from source?
> 
> If not, taking into account that xsk.h also will be fixed, only
> 'tools/include/asm/barrier.h' remains the problem, and it looks like
> it can be
> solved, please check above.

libbpf is already packaged separately in Debian and I think other
distros will follow soon, so it's all good for me once the barrier
issue is solved.

https://packages.debian.org/buster/libbpf-dev

From the makefile's perspective it should not matter where it comes
from - the headers should be expected to be in /usr/include and the
library in /usr/lib* - and pkg-config can help with that if available.
And if a user wants to use a custom path, then it's no different than
any of the other dependencies on other external libraries
  
Ferruh Yigit April 3, 2019, 1:34 p.m. UTC | #15
On 4/3/2019 2:22 PM, Bruce Richardson wrote:
> On Tue, Apr 02, 2019 at 08:43:48PM +0100, Ferruh Yigit wrote:
>> On 4/2/2019 4:46 PM, Xiaolong Ye wrote:
>>> Add a new PMD driver for AF_XDP which is a proposed faster version of
>>> AF_PACKET interface in Linux. More info about AF_XDP, please refer to [1]
>>> [2].
>>>
>>> This is the vanilla version PMD which just uses a raw buffer registered as
>>> the umem.
>>>
>>> [1] https://fosdem.org/2018/schedule/event/af_xdp/
>>> [2] https://lwn.net/Articles/745934/
>>>
>>> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
>>
>> <...>
>>
>>> @@ -0,0 +1,956 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright(c) 2019 Intel Corporation.
>>> + */
>>
>>> +#include <bpf/bpf.h>
>>> +#include <xsk.h>
>>
>> Under linux, both headers are in same 'bpf' folder, why one included as
>> 'bpf/bpf.h' but other 'xsk.h'?
>>
>> Perhaps this is not problem when headers are installed into system folders, but
>> I am compiling using RTE_KERNELDIR, which used in Makefile as:
>>  -I$(RTE_KERNELDIR)/tools/lib/bpf
> 
> When installed in system folders they will still need the "bpf" prefix. On
> my system after running "make headers_install" in libbpf folder, the
> headers are placed in "/usr/local/include/bpf/"

This is for 'xsk.h' which was not installed via "make headers_install", but as
Luca pointed out there is a patch to install 'xsk.h' too, so it should be OK to
remove that line.

> 
>>
>> This fails to find 'bpf/bpf.h'
>>
>> Also for '-lbpf', shouldn't need to add '-L$(RTE_KERNELDIR)/tools/lib/bpf', to
>> new added line in 'rte.app.mk', so that it can find the library?
>>
>> I assume you are building in a system with new kernel, I think you need this for
>> functionality, where 'xsk.h' is located in that case? Because I was thinking
>> building and installing libbpf can solve the issue but it is not installing
>> 'xsk.h', not sure why, so not exactly solving.
>>
>> if you still need "CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf" for your case,
>> does it make sense update as following:
>>  CFLAGS += -I$(RTE_KERNELDIR)/tools/lib
>>  #include <bpf/xsk.h>
> 
> We should not include in any driver a cflag or ldflag that points to the
> kernel dir. We should expect the headers for libbpf in a regular include
> folder and the library itself in /usr/lib or /usr/local/lib.

Overall I agree, but 'xsk.h' depends on 'tools/include/asm/barrier.h', and this
header file is not installed into system folders; it can be found only in the
kernel source code.

Fortunately, it looks like there is a way to get rid of the
'tools/include/asm/barrier.h' dependency for DPDK, so we can remove that cflag too.
  
Xiaolong Ye April 3, 2019, 2:22 p.m. UTC | #16
On 04/03, Ferruh Yigit wrote:
[snip]
>
>It seems, 'tools/include/asm/barrier.h' is required for 'smp_wmb()' &
>'smp_rmb()' in 'xsk.h'.
>We have equivalents of these in DPDK [1], and perhaps it can be possible to use
>them and not include this header at all.
>
>in 'rte_eth_af_xdp.c', before including 'xsk.h', we can include an local
>compatibility header which does following should work:
>#define smp_rmb() rte_rmb()
>#define smp_wmb() rte_wmb()
>
>@Xiaolong, what do you think?

It sounds perfect to me, I'll take it in my next version.
One thing to confirm: assuming af_xdp pmd users will use a kernel (say v5.1-rc4)
that contains the fixes for xsk.h and libelf, I still need to make the following
changes:

1. Use <bpf/xsk.h>, as xsk.h should be installed in system folders.
2. `-lelf` is no longer needed in rte.app.mk.
3. Document the libbpf build and install steps in af_xdp.rst.
4. Add the above two defines before including xsk.h.

Thanks,
Xiaolong


>
>[1]
>https://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/arch/x86/rte_atomic.h?h=v19.02#n30
>
>> 
>> The one in tools/include also is GPL-2.0 only so it cannot be included
>> from the PMD, which is BSD-3-clause only (and it recursively includes
>> the other arch-specific kernel headers)
>> 
>>> Anyway, as Xiaolong mentioned, following is working, can it work from
>>> a distro
>>> point of view:
>>> - get kernel source code (>= v5.1-rc1)
>>> - build libbpf and install
>>> - set 'RTE_KERNELDIR' to point kernel source path
>>> - build dpdk with af_xdp enabled
>> 
>> As long as the full kernel tree is required, we cannot enable it in
>> Debian and Ubuntu - we can't have it at build time on the build
>> workers, and also there's the licensing problem.
>
>Got it.
>
>In above steps, 'libbpf' also build from kernel source tree, will it be problem
>in you builds to not have it build from source?
>
>If not, taking into account that xsk.h also will be fixed, only
>'tools/include/asm/barrier.h' remains the problem, and it looks like it can be
>solved, please check above.
>
>
>> 
>>>> Also, the license in asm/barrier.h is GPL-2.0 only. It is not a
>>>> userspace header so it is not covered by the userspace exception,
>>>> which
>>>> means at the very least the af_xdp PMD shared object is also
>>>> licensed
>>>> under GPL-2.0 only, isn't it?
>>>>
>>>
>>>
>
  
Xiaolong Ye April 3, 2019, 2:43 p.m. UTC | #17
On 04/03, Luca Boccassi wrote:
[snip]
>> 
>> Got it.
>> 
>> In above steps, 'libbpf' also build from kernel source tree, will it
>> be problem
>> in you builds to not have it build from source?
>> 
>> If not, taking into account that xsk.h also will be fixed, only
>> 'tools/include/asm/barrier.h' remains the problem, and it looks like
>> it can be
>> solved, please check above.
>
>libbpf is already packaged separately in Debian and I think other
>distros will follow soon, so it's all good for me once the barrier
>issue is solved.
>
>https://packages.debian.org/buster/libbpf-dev
>
>From the makefile's perspective it should not matter where it comes
>from - the headers should be expected to be in /usr/include and the
>library in /usr/lib* - and pkg-config can help with that if available.
>And if a user wants to use a custom path, then it's no different than
>any of the other dependencies on other external libraries

According to tools/lib/bpf/Makefile, after `make install_lib` and `make install_headers`,
the headers and library are put in /usr/local/include/bpf and /usr/local/lib*.
Is that OK?

Thanks,
Xiaolong
>
>-- 
>Kind regards,
>Luca Boccassi
  
Luca Boccassi April 3, 2019, 2:51 p.m. UTC | #18
On Wed, 2019-04-03 at 22:43 +0800, Ye Xiaolong wrote:
> On 04/03, Luca Boccassi wrote:
> [snip]
> > > Got it.
> > > 
> > > In above steps, 'libbpf' also build from kernel source tree, will
> > > it
> > > be problem
> > > in you builds to not have it build from source?
> > > 
> > > If not, taking into account that xsk.h also will be fixed, only
> > > 'tools/include/asm/barrier.h' remains the problem, and it looks
> > > like
> > > it can be
> > > solved, please check above.
> > 
> > libbpf is already packaged separately in Debian and I think other
> > distros will follow soon, so it's all good for me once the barrier
> > issue is solved.
> > 
> > https://packages.debian.org/buster/libbpf-dev
> > 
> > 
> > From the makefile's perspective it should not matter where it comes
> > from - the headers should be expected to be in /usr/include and the
> > library in /usr/lib* - and pkg-config can help with that if
> > available.
> > And if a user wants to use a custom path, then it's no different
> > than
> > any of the other dependencies on other external libraries
> 
> From tools/lib/bpf/Makefile, after make install_lib and make
> install_headers,
> the headers and library would be put in /usr/local/include/bpf and
> /usr/local/lib*,
> Is it ok?

Yes certainly that's fine, that's expected for local installations, and
users can specify a prefix with the upstream's makefile if they want to
install somewhere else.
  
Xiaolong Ye April 3, 2019, 3:14 p.m. UTC | #19
On 04/03, Luca Boccassi wrote:
>On Wed, 2019-04-03 at 22:43 +0800, Ye Xiaolong wrote:
>> On 04/03, Luca Boccassi wrote:
>> [snip]
>> > > Got it.
>> > > 
>> > > In above steps, 'libbpf' also build from kernel source tree, will
>> > > it
>> > > be problem
>> > > in you builds to not have it build from source?
>> > > 
>> > > If not, taking into account that xsk.h also will be fixed, only
>> > > 'tools/include/asm/barrier.h' remains the problem, and it looks
>> > > like
>> > > it can be
>> > > solved, please check above.
>> > 
>> > libbpf is already packaged separately in Debian and I think other
>> > distros will follow soon, so it's all good for me once the barrier
>> > issue is solved.
>> > 
>> > https://packages.debian.org/buster/libbpf-dev
>> > 
>> > 
>> > From the makefile's perspective it should not matter where it comes
>> > from - the headers should be expected to be in /usr/include and the
>> > library in /usr/lib* - and pkg-config can help with that if
>> > available.
>> > And if a user wants to use a custom path, then it's no different
>> > than
>> > any of the other dependencies on other external libraries
>> 
>> From tools/lib/bpf/Makefile, after make install_lib and make
>> install_headers,
>> the headers and library would be put in /usr/local/include/bpf and
>> /usr/local/lib*,
>> Is it ok?
>
>Yes certainly that's fine, that's expected for local installations, and
>users can specify a prefix with the upstream's makefile if they want to
>install somewhere else.

In my local test, if I run `make install_lib` to install libbpf.so to
/usr/local/lib64, the `-lbpf` specified in the af_xdp pmd still fails to find the library,
and the build ends with a lot of undefined references to symbols defined in libbpf.
Does that mean the DPDK build doesn't search for libraries in /usr/local/lib*?

Installing libbpf to /usr/lib64 via `make install_lib prefix=/usr` doesn't have
this issue, so shall I just document that in af_xdp.rst, or is there a more proper
way to do it?

Thanks,
Xiaolong
>
>-- 
>Kind regards,
>Luca Boccassi
  
Bruce Richardson April 3, 2019, 3:23 p.m. UTC | #20
On Wed, Apr 03, 2019 at 11:14:58PM +0800, Ye Xiaolong wrote:
> On 04/03, Luca Boccassi wrote:
> >On Wed, 2019-04-03 at 22:43 +0800, Ye Xiaolong wrote:
> >> On 04/03, Luca Boccassi wrote:
> >> [snip]
> >> > > Got it.
> >> > > 
> >> > > In above steps, 'libbpf' also build from kernel source tree, will
> >> > > it
> >> > > be problem
> >> > > in you builds to not have it build from source?
> >> > > 
> >> > > If not, taking into account that xsk.h also will be fixed, only
> >> > > 'tools/include/asm/barrier.h' remains the problem, and it looks
> >> > > like
> >> > > it can be
> >> > > solved, please check above.
> >> > 
> >> > libbpf is already packaged separately in Debian and I think other
> >> > distros will follow soon, so it's all good for me once the barrier
> >> > issue is solved.
> >> > 
> >> > https://packages.debian.org/buster/libbpf-dev
> >> > 
> >> > 
> >> > From the makefile's perspective it should not matter where it comes
> >> > from - the headers should be expected to be in /usr/include and the
> >> > library in /usr/lib* - and pkg-config can help with that if
> >> > available.
> >> > And if a user wants to use a custom path, then it's no different
> >> > than
> >> > any of the other dependencies on other external libraries
> >> 
> >> From tools/lib/bpf/Makefile, after make install_lib and make
> >> install_headers,
> >> the headers and library would be put in /usr/local/include/bpf and
> >> /usr/local/lib*,
> >> Is it ok?
> >
> >Yes certainly that's fine, that's expected for local installations, and
> >users can specify a prefix with the upstream's makefile if they want to
> >install somewhere else.
> 
> In my local test, if I run `make install_lib` to install the libbpf.so to 
> /usr/local/lib64, `-lbpf` specified in af_xdp pmd still fails to find the library,
> the build would end up with a lot of undefined references which are defined in libbpf.
> It means during dpdk compilation, it won't search libraries in /usr/local/lib*, right?
> 
> Install the libbpf to /usr/lib64 via `make install_lib prefix=/usr` doesn't have 
> this issue, so shall I just document it in af_xdp.rst or there is other proper
> way to do it?
> 
At a guess I'd say you are using Fedora Linux, right? Fedora is unusual in
that it doesn't by default add /usr/local to the library and header search
paths so you need to explicitly add them to your environment. Other distros
should work fine for this.

/Bruce
  
Xiaolong Ye April 3, 2019, 3:34 p.m. UTC | #21
On 04/03, Bruce Richardson wrote:
>On Wed, Apr 03, 2019 at 11:14:58PM +0800, Ye Xiaolong wrote:
>> On 04/03, Luca Boccassi wrote:
>> >On Wed, 2019-04-03 at 22:43 +0800, Ye Xiaolong wrote:
>> >> On 04/03, Luca Boccassi wrote:
>> >> [snip]
>> >> > > Got it.
>> >> > > 
>> >> > > In above steps, 'libbpf' also build from kernel source tree, will
>> >> > > it
>> >> > > be problem
>> >> > > in you builds to not have it build from source?
>> >> > > 
>> >> > > If not, taking into account that xsk.h also will be fixed, only
>> >> > > 'tools/include/asm/barrier.h' remains the problem, and it looks
>> >> > > like
>> >> > > it can be
>> >> > > solved, please check above.
>> >> > 
>> >> > libbpf is already packaged separately in Debian and I think other
>> >> > distros will follow soon, so it's all good for me once the barrier
>> >> > issue is solved.
>> >> > 
>> >> > https://packages.debian.org/buster/libbpf-dev
>> >> > 
>> >> > 
>> >> > From the makefile's perspective it should not matter where it comes
>> >> > from - the headers should be expected to be in /usr/include and the
>> >> > library in /usr/lib* - and pkg-config can help with that if
>> >> > available.
>> >> > And if a user wants to use a custom path, then it's no different
>> >> > than
>> >> > any of the other dependencies on other external libraries
>> >> 
>> >> From tools/lib/bpf/Makefile, after make install_lib and make
>> >> install_headers,
>> >> the headers and library would be put in /usr/local/include/bpf and
>> >> /usr/local/lib*,
>> >> Is it ok?
>> >
>> >Yes certainly that's fine, that's expected for local installations, and
>> >users can specify a prefix with the upstream's makefile if they want to
>> >install somewhere else.
>> 
>> In my local test, if I run `make install_lib` to install the libbpf.so to 
>> /usr/local/lib64, `-lbpf` specified in af_xdp pmd still fails to find the library,
>> the build would end up with a lot of undefined references which are defined in libbpf.
>> It means during dpdk compilation, it won't search libraries in /usr/local/lib*, right?
>> 
>> Install the libbpf to /usr/lib64 via `make install_lib prefix=/usr` doesn't have 
>> this issue, so shall I just document it in af_xdp.rst or there is other proper
>> way to do it?
>> 
>At a guess I'd say you are using Fedora Linux, right? Fedora is unusual in
>that it doesn't by default add /usr/local to the library and header search
>paths so you need to explicitly add them to your environment. Other distros
>should work fine for this.

I am using CentOS 7.4; I guess it has the same issue as the one you mentioned
for Fedora above.

Thanks,
Xiaolong
>
>/Bruce
  
Ferruh Yigit April 3, 2019, 3:52 p.m. UTC | #22
On 4/3/2019 3:22 PM, Ye Xiaolong wrote:
> On 04/03, Ferruh Yigit wrote:
> [snip]
>>
>> It seems, 'tools/include/asm/barrier.h' is required for 'smp_wmb()' &
>> 'smp_rmb()' in 'xsk.h'.
>> We have equivalents of these in DPDK [1], and perhaps it can be possible to use
>> them and not include this header at all.
>>
>> in 'rte_eth_af_xdp.c', before including 'xsk.h', we can include an local
>> compatibility header which does following should work:
>> #define smp_rmb() rte_rmb()
>> #define smp_wmb() rte_wmb()
>>
>> @Xiaolong, what do you think?
> 
> It sounds perfect to me, I'll take it in my next version.
> Something to confirm, So we can now assume af_xdp pmd user would use kernel (say v5.1-rc4) 
> that contains fixes regarding to xsk.h and libelf, I still need to do following
> changes.
> 
> 1. I shall use <bpf/xsk.h> as xsk.h should be installed in system folders.
> 2. `-lelf` is not needed in rte.app.mk
> 3. I need to document the libbpf build and install steps in af_xdp.rst
> 4. add the above two defines before including xsk.h

Looks good to me.
Only for item 4): instead of putting those defines into the .c file directly, you
can create a private header in the driver folder, put those lines there (I assume
it will need a few includes for rte_rmb as well), and include that header before xsk.h.
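
A rough sketch of what such a private header could look like, under some
assumptions: the file name af_xdp_compat.h, the AF_XDP_COMPAT_* macros and the
demo_load_after_rmb() helper are all hypothetical, and the C11-fence fallback
exists only so the sketch compiles outside a DPDK build. In the driver the
header would simply include <rte_atomic.h>. The helper stands in for an inline
function from a header like xsk.h: because it uses smp_rmb() in its body, the
defines have to be visible before that header is included.

```c
/* af_xdp_compat.h (hypothetical name): private header in the driver folder */
#ifndef AF_XDP_COMPAT_H
#define AF_XDP_COMPAT_H

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Use DPDK's barriers when its headers are on the include path. */
#if defined(__has_include)
#if __has_include(<rte_atomic.h>)
#include <rte_atomic.h> /* rte_rmb(), rte_wmb() */
#define AF_XDP_COMPAT_HAVE_RTE 1
#endif
#endif

#ifdef AF_XDP_COMPAT_HAVE_RTE
#define smp_rmb() rte_rmb()
#define smp_wmb() rte_wmb()
#else
/* Fallback only so this sketch builds without DPDK (an assumption,
 * not something the driver would ship). */
#include <stdatomic.h>
#define smp_rmb() atomic_thread_fence(memory_order_acquire)
#define smp_wmb() atomic_thread_fence(memory_order_release)
#endif

#endif /* AF_XDP_COMPAT_H */

/* Stand-in for an inline helper from a header included afterwards.
 * It would not compile if the defines above came later. */
static inline uint32_t demo_load_after_rmb(const uint32_t *idx)
{
	assert(idx != NULL);
	uint32_t v = *idx; /* read the shared index */
	smp_rmb();         /* order later reads after the index read */
	return v;
}
```

In 'rte_eth_af_xdp.c' the include order would then be the private header
first, followed by <bpf/xsk.h>.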

> 
> Thanks,
> Xiaolong
> 
> 
>>
>> [1]
>> https://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/arch/x86/rte_atomic.h?h=v19.02#n30
>>
>>>
>>> The one in tools/include also is GPL-2.0 only so it cannot be included
>>> from the PMD, which is BSD-3-clause only (and it recursively includes
>>> the other arch-specific kernel headers)
>>>
>>>> Anyway, as Xiaolong mentioned, following is working, can it work from
>>>> a distro
>>>> point of view:
>>>> - get kernel source code (>= v5.1-rc1)
>>>> - build libbpf and install
>>>> - set 'RTE_KERNELDIR' to point kernel source path
>>>> - build dpdk with af_xdp enabled
>>>
>>> As long as the full kernel tree is required, we cannot enable it in
>>> Debian and Ubuntu - we can't have it at build time on the build
>>> workers, and also there's the licensing problem.
>>
>> Got it.
>>
>> In above steps, 'libbpf' also build from kernel source tree, will it be problem
>> in you builds to not have it build from source?
>>
>> If not, taking into account that xsk.h also will be fixed, only
>> 'tools/include/asm/barrier.h' remains the problem, and it looks like it can be
>> solved, please check above.
>>
>>
>>>
>>>>> Also, the license in asm/barrier.h is GPL-2.0 only. It is not a
>>>>> userspace header so it is not covered by the userspace exception,
>>>>> which
>>>>> means at the very least the af_xdp PMD shared object is also
>>>>> licensed
>>>>> under GPL-2.0 only, isn't it?
>>>>>
>>>>
>>>>
>>
  
Xiaolong Ye April 3, 2019, 3:57 p.m. UTC | #23
On 04/03, Ferruh Yigit wrote:
>On 4/3/2019 3:22 PM, Ye Xiaolong wrote:
>> On 04/03, Ferruh Yigit wrote:
>> [snip]
>>>
>>> It seems, 'tools/include/asm/barrier.h' is required for 'smp_wmb()' &
>>> 'smp_rmb()' in 'xsk.h'.
>>> We have equivalents of these in DPDK [1], and perhaps it can be possible to use
>>> them and not include this header at all.
>>>
>>> in 'rte_eth_af_xdp.c', before including 'xsk.h', we can include an local
>>> compatibility header which does following should work:
>>> #define smp_rmb() rte_rmb()
>>> #define smp_wmb() rte_wmb()
>>>
>>> @Xiaolong, what do you think?
>> 
>> It sounds perfect to me; I'll adopt it in my next version.
>> One thing to confirm: assuming af_xdp PMD users run a kernel (say v5.1-rc4)
>> that contains the fixes for xsk.h and libelf, I still need to make the following
>> changes:
>> 
>> 1. I shall use <bpf/xsk.h> as xsk.h should be installed in system folders.
>> 2. `-lelf` is not needed in rte.app.mk
>> 3. I need to document the libbpf build and install steps in af_xdp.rst
>> 4. add the above two defines before including xsk.h
>
>Looks good to me.
>Only for item 4): instead of putting those defines directly into the .c file, you can
>create a private header in the driver folder and put those lines there (I assume it will
>need a few includes for rte_rmb as well), then include that header before xsk.h.

Sounds better, will do.
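
A minimal sketch of what such a private compatibility header could look like.
The two smp_*() defines are the ones proposed above; everything else is an
assumption for illustration. In particular, HAVE_DPDK is a hypothetical guard,
and the C11-fence stand-ins exist only so the sketch compiles without DPDK
installed. They are not DPDK's actual rte_rmb()/rte_wmb() implementations.

```c
#include <assert.h>
#include <stdatomic.h>

#ifdef HAVE_DPDK                /* hypothetical guard, not a real DPDK macro */
#include <rte_atomic.h>         /* provides the real rte_rmb() / rte_wmb() */
#else
/* Stand-ins so this sketch builds without DPDK (assumption, see above). */
static inline void rte_rmb(void) { atomic_thread_fence(memory_order_acquire); }
static inline void rte_wmb(void) { atomic_thread_fence(memory_order_release); }
#endif

/* The two defines proposed in the thread: map the kernel-style barrier
 * names used by xsk.h onto the DPDK equivalents. */
#define smp_rmb() rte_rmb()
#define smp_wmb() rte_wmb()

/* Tiny smoke check: both macros expand to something callable. */
int compat_barriers_smoke(void)
{
	smp_wmb();
	smp_rmb();
	return 0;
}
```

The header would have to be included before xsk.h so that the macros are
already defined when xsk.h uses them.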

Thanks,
Xiaolong

>
>> 
>> Thanks,
>> Xiaolong
>> 
>> 
>>>
>>> [1]
>>> https://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/arch/x86/rte_atomic.h?h=v19.02#n30
>>>
>>>>
>>>> The one in tools/include also is GPL-2.0 only so it cannot be included
>>>> from the PMD, which is BSD-3-clause only (and it recursively includes
>>>> the other arch-specific kernel headers)
>>>>
>>>>> Anyway, as Xiaolong mentioned, following is working, can it work from
>>>>> a distro
>>>>> point of view:
>>>>> - get kernel source code (>= v5.1-rc1)
>>>>> - build libbpf and install
>>>>> - set 'RTE_KERNELDIR' to point kernel source path
>>>>> - build dpdk with af_xdp enabled
>>>>
>>>> As long as the full kernel tree is required, we cannot enable it in
>>>> Debian and Ubuntu - we can't have it at build time on the build
>>>> workers, and also there's the licensing problem.
>>>
>>> Got it.
>>>
>>> In the above steps, 'libbpf' is also built from the kernel source tree; will it be a problem
>>> in your builds to not have it built from source?
>>>
>>> If not, taking into account that xsk.h also will be fixed, only
>>> 'tools/include/asm/barrier.h' remains the problem, and it looks like it can be
>>> solved, please check above.
>>>
>>>
>>>>
>>>>>> Also, the license in asm/barrier.h is GPL-2.0 only. It is not a
>>>>>> userspace header so it is not covered by the userspace exception,
>>>>>> which
>>>>>> means at the very least the af_xdp PMD shared object is also
>>>>>> licensed
>>>>>> under GPL-2.0 only, isn't it?
>>>>>>
>>>>>
>>>>>
>>>
>
  
Markus Theil April 17, 2019, 12:30 p.m. UTC | #24
I tested the new af_xdp based device on the current master branch and
noticed that the use of static mempool names allows the creation of only
a single af_xdp vdev. If a second vdev of the same type gets created, the
mempool allocation fails.
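
To illustrate the kind of change that avoids such a name clash, here is a
minimal sketch that derives a per-device name from the interface name and
queue index. The helper name, the prefix and the name format are hypothetical
and are not taken from the driver or from any fix that was merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<prefix>_<ifname>_<queue_idx>" into buf.
 * Returns 0 on success, -1 if the name does not fit (DPDK object names
 * have a fixed maximum length, so truncation must be treated as an error
 * rather than silently producing a possibly duplicated name).
 */
int make_unique_name(char *buf, size_t len, const char *prefix,
		     const char *ifname, unsigned int queue_idx)
{
	int n;

	assert(buf != NULL && prefix != NULL && ifname != NULL);

	n = snprintf(buf, len, "%s_%s_%u", prefix, ifname, queue_idx);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

With such a helper, two vdevs on interfaces p1 and p2 would ask for
differently named objects instead of both requesting the same static name.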

Best regards,
Markus Theil
  
Xiaolong Ye April 18, 2019, 1:05 a.m. UTC | #25
Hi, Markus

On 04/17, Markus Theil wrote:
>I tested the new af_xdp based device on the current master branch and
>noticed, that the usage of static mempool names allows only for the
>creation of a single af_xdp vdev. If a second vdev of the same type gets
>created, the mempool allocation fails.

Thanks for reporting, could you paste the cmdline you used and the error log?
Are you referring to ring creation or mempool creation?


Thanks,
Xiaolong
>
>Best regards,
>Markus Theil
  
Markus Theil April 23, 2019, 4:23 p.m. UTC | #26
Hi Xiaolong,

I tested your commit "net/af_xdp: fix creating multiple instance" on the
current master branch. It does not work for me in the following minimal
test setting:

1) allocate 2x 1GB huge pages for DPDK

2) ip link add p1 type veth peer name p2

3) ./dpdk-testpmd --vdev=net_af_xdp0,iface=p1
--vdev=net_af_xdp1,iface=p2 (I also tested this with two igb devices,
with the same errors)

I'm using Linux 5.1-rc6 and an up to date libbpf. The setup works for
the first device and fails for the second device when creating bpf maps
in libbpf ("qidconf_map" or "xsks_map"). It seems, that these maps also
need unique names and cannot exist twice under the same name.
Furthermore if running step 3 again after it failed for the first time,
xdp vdev allocation already fails for the first xdp vdev and does not
reach the second one. Please let me know if you need some program output
or more information from me.

Best regards,
Markus


On 4/18/19 3:05 AM, Ye Xiaolong wrote:
> Hi, Markus
>
> On 04/17, Markus Theil wrote:
>> I tested the new af_xdp based device on the current master branch and
>> noticed, that the usage of static mempool names allows only for the
>> creation of a single af_xdp vdev. If a second vdev of the same type gets
>> created, the mempool allocation fails.
> Thanks for reporting, could you paste the cmdline you used and the error log?
> Are you referring to ring creation or mempool creation?
>
>
> Thanks,
> Xiaolong
>> Best regards,
>> Markus Theil
  
Xiaolong Ye April 24, 2019, 6:35 a.m. UTC | #27
Hi, Markus

On 04/23, Markus Theil wrote:
>Hi Xiaolong,
>
>I tested your commit "net/af_xdp: fix creating multiple instance" on the
>current master branch. It does not work for me in the following minimal
>test setting:
>
>1) allocate 2x 1GB huge pages for DPDK
>
>2) ip link add p1 type veth peer name p2
>
>3) ./dpdk-testpmd --vdev=net_af_xdp0,iface=p1
>--vdev=net_af_xdp1,iface=p2 (I also tested this with two igb devices,
>with the same errors)

I've tested 19.05-rc2, started testpmd with 2 af_xdp vdev (with two i40e devices),
and it works for me.

$ ./x86_64-native-linuxapp-gcc/app/testpmd -l 5,6 -n 4 --log-level=pmd.net.af_xdp:info -b 82:00.1 --no-pci --vdev net_af_xdp0,iface=ens786f1 --vdev net_af_xdp1,iface=ens786f0
EAL: Detected 88 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Probing VFIO support...
rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 3C:FD:FE:C5:E2:41
Configuring Port 1 (socket 0)
Port 1: 3C:FD:FE:C5:E2:40
Checking link statuses...
Done
No commandline core given, start packet forwarding
io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support enabled, MP allocation mode: native
Logical Core 6 (socket 0) forwards packets on 2 streams:
  RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
  RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00

  io packet forwarding packets/burst=32
  nb forwarding cores=1 - nb forwarding ports=2
  port 0: RX queue number: 1 Tx queue number: 1
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=0 - RX free threshold=0
      RX threshold registers: pthresh=0 hthresh=0  wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=0 - TX free threshold=0
      TX threshold registers: pthresh=0 hthresh=0  wthresh=0
      TX offloads=0x0 - TX RS bit threshold=0
  port 1: RX queue number: 1 Tx queue number: 1
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=0 - RX free threshold=0
      RX threshold registers: pthresh=0 hthresh=0  wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=0 - TX free threshold=0
      TX threshold registers: pthresh=0 hthresh=0  wthresh=0
      TX offloads=0x0 - TX RS bit threshold=0
Press enter to exit

Could you paste your whole failure log here?
>
>I'm using Linux 5.1-rc6 and an up to date libbpf. The setup works for
>the first device and fails for the second device when creating bpf maps
>in libbpf ("qidconf_map" or "xsks_map"). It seems, that these maps also
>need unique names and cannot exist twice under the same name.

As far as I know, there should be no such constraint; the bpf map creation
is wrapped in libbpf.

>Furthermore if running step 3 again after it failed for the first time,
>xdp vdev allocation already fails for the first xdp vdev and does not
>reach the second one. Please let me know if you need some program output
>or more information from me.
>
>Best regards,
>Markus
>

Thanks,
Xiaolong

>
>On 4/18/19 3:05 AM, Ye Xiaolong wrote:
>> Hi, Markus
>>
>> On 04/17, Markus Theil wrote:
>>> I tested the new af_xdp based device on the current master branch and
>>> noticed, that the usage of static mempool names allows only for the
>>> creation of a single af_xdp vdev. If a second vdev of the same type gets
>>> created, the mempool allocation fails.
>> Thanks for reporting, could you paste the cmdline you used and the error log?
>> Are you referring to ring creation or mempool creation?
>>
>>
>> Thanks,
>> Xiaolong
>>> Best regards,
>>> Markus Theil
>
  
Markus Theil April 24, 2019, 9:21 a.m. UTC | #28
Hi Xiaolong,

I also tested with i40e devices, with the same result.

./dpdk-testpmd -n 4 --log-level=pmd.net.af_xdp:debug --no-pci --vdev
net_af_xdp0,iface=enp36s0f0 --vdev net_af_xdp1,iface=enp36s0f1
EAL: Detected 16 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-2048kB
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456,
size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 3C:FD:FE:A3:E7:30
Configuring Port 1 (socket 0)
xsk_configure(): Failed to create xsk socket. (-1)
eth_rx_queue_setup(): Failed to configure xdp socket
Fail to configure port 1 rx queues
EAL: Error - exiting with code: 1
  Cause: Start ports failed

If I execute the same call again, I get error -16 already on the first port:

./dpdk-testpmd -n 4 --log-level=pmd.net.af_xdp:debug --no-pci --vdev
net_af_xdp0,iface=enp36s0f0 --vdev net_af_xdp1,iface=enp36s0f1
EAL: Detected 16 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-2048kB
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456,
size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
xsk_configure(): Failed to create xsk socket. (-16)
eth_rx_queue_setup(): Failed to configure xdp socket
Fail to configure port 0 rx queues
EAL: Error - exiting with code: 1
  Cause: Start ports failed

Software versions/commits/infos:

- Linux 5.1-rc6
- DPDK 7f251bcf22c5729792f9243480af1b3c072876a5 (19.05-rc2)
- libbpf from https://github.com/libbpf/libbpf
(910c475f09e5c269f441d7496c27dace30dc2335)
- DPDK and libbpf build with meson

Best regards,
Markus

On 4/24/19 8:35 AM, Ye Xiaolong wrote:
> Hi, Markus
>
> On 04/23, Markus Theil wrote:
>> Hi Xiaolong,
>>
>> I tested your commit "net/af_xdp: fix creating multiple instance" on the
>> current master branch. It does not work for me in the following minimal
>> test setting:
>>
>> 1) allocate 2x 1GB huge pages for DPDK
>>
>> 2) ip link add p1 type veth peer name p2
>>
>> 3) ./dpdk-testpmd --vdev=net_af_xdp0,iface=p1
>> --vdev=net_af_xdp1,iface=p2 (I also tested this with two igb devices,
>> with the same errors)
> I've tested 19.05-rc2, started testpmd with 2 af_xdp vdev (with two i40e devices),
> and it works for me.
>
> $ ./x86_64-native-linuxapp-gcc/app/testpmd -l 5,6 -n 4 --log-level=pmd.net.af_xdp:info -b 82:00.1 --no-pci --vdev net_af_xdp0,iface=ens786f1 --vdev net_af_xdp1,iface=ens786f0
> EAL: Detected 88 lcore(s)
> EAL: Detected 2 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Probing VFIO support...
> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456, size=2176, socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
> Configuring Port 0 (socket 0)
> Port 0: 3C:FD:FE:C5:E2:41
> Configuring Port 1 (socket 0)
> Port 1: 3C:FD:FE:C5:E2:40
> Checking link statuses...
> Done
> No commandline core given, start packet forwarding
> io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support enabled, MP allocation mode: native
> Logical Core 6 (socket 0) forwards packets on 2 streams:
>   RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
>   RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
>
>   io packet forwarding packets/burst=32
>   nb forwarding cores=1 - nb forwarding ports=2
>   port 0: RX queue number: 1 Tx queue number: 1
>     Rx offloads=0x0 Tx offloads=0x0
>     RX queue: 0
>       RX desc=0 - RX free threshold=0
>       RX threshold registers: pthresh=0 hthresh=0  wthresh=0
>       RX Offloads=0x0
>     TX queue: 0
>       TX desc=0 - TX free threshold=0
>       TX threshold registers: pthresh=0 hthresh=0  wthresh=0
>       TX offloads=0x0 - TX RS bit threshold=0
>   port 1: RX queue number: 1 Tx queue number: 1
>     Rx offloads=0x0 Tx offloads=0x0
>     RX queue: 0
>       RX desc=0 - RX free threshold=0
>       RX threshold registers: pthresh=0 hthresh=0  wthresh=0
>       RX Offloads=0x0
>     TX queue: 0
>       TX desc=0 - TX free threshold=0
>       TX threshold registers: pthresh=0 hthresh=0  wthresh=0
>       TX offloads=0x0 - TX RS bit threshold=0
> Press enter to exit
>
> Could you paste your whole failure log here?
>> I'm using Linux 5.1-rc6 and an up to date libbpf. The setup works for
>> the first device and fails for the second device when creating bpf maps
>> in libbpf ("qidconf_map" or "xsks_map"). It seems, that these maps also
>> need unique names and cannot exist twice under the same name.
> As far as I know, there should be no such constraint; the bpf map creation
> is wrapped in libbpf.
>
>> Furthermore if running step 3 again after it failed for the first time,
>> xdp vdev allocation already fails for the first xdp vdev and does not
>> reach the second one. Please let me know if you need some program output
>> or more information from me.
>>
>> Best regards,
>> Markus
>>
> Thanks,
> Xiaolong
>
>> On 4/18/19 3:05 AM, Ye Xiaolong wrote:
>>> Hi, Markus
>>>
>>> On 04/17, Markus Theil wrote:
>>>> I tested the new af_xdp based device on the current master branch and
>>>> noticed, that the usage of static mempool names allows only for the
>>>> creation of a single af_xdp vdev. If a second vdev of the same type gets
>>>> created, the mempool allocation fails.
>>> Thanks for reporting, could you paste the cmdline you used and the error log?
>>> Are you referring to ring creation or mempool creation?
>>>
>>>
>>> Thanks,
>>> Xiaolong
>>>> Best regards,
>>>> Markus Theil
  
Xiaolong Ye April 24, 2019, 2:47 p.m. UTC | #29
Hi, Markus

On 04/24, Markus Theil wrote:
>Hi Xiaolong,
>
>I also tested with i40e devices, with the same result.
>
>./dpdk-testpmd -n 4 --log-level=pmd.net.af_xdp:debug --no-pci --vdev
>net_af_xdp0,iface=enp36s0f0 --vdev net_af_xdp1,iface=enp36s0f1
>EAL: Detected 16 lcore(s)
>EAL: Detected 1 NUMA nodes
>EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>EAL: No free hugepages reported in hugepages-2048kB
>EAL: No available hugepages reported in hugepages-2048kB
>EAL: Probing VFIO support...
>rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
>rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
>testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456,
>size=2176, socket=0
>testpmd: preferred mempool ops selected: ring_mp_mc
>Configuring Port 0 (socket 0)
>Port 0: 3C:FD:FE:A3:E7:30
>Configuring Port 1 (socket 0)
>xsk_configure(): Failed to create xsk socket. (-1)
>eth_rx_queue_setup(): Failed to configure xdp socket
>Fail to configure port 1 rx queues
>EAL: Error - exiting with code: 1
>  Cause: Start ports failed
>

Does a single vdev instance work on your side? And have you brought up the interfaces?
xsk_configure requires the interface to be in the up state.
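
For reference, a minimal sketch of checking from C whether an interface is
administratively up, using the standard Linux SIOCGIFFLAGS ioctl. The helper
name is hypothetical and this check is not part of the PMD; it only shows one
way to verify the precondition before starting testpmd.

```c
#define _DEFAULT_SOURCE         /* expose struct ifreq under strict -std= modes */
#include <assert.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the interface is up, 0 if it is down, -1 on any error
 * (bad name, no such interface, socket failure). */
int iface_is_up(const char *ifname)
{
	struct ifreq ifr;
	int fd, ret;

	if (ifname == NULL || strlen(ifname) >= IFNAMSIZ)
		return -1;

	/* Any socket will do as a handle for the interface ioctl. */
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

	ret = ioctl(fd, SIOCGIFFLAGS, &ifr);
	close(fd);
	if (ret < 0)
		return -1;

	return (ifr.ifr_flags & IFF_UP) ? 1 : 0;
}
```

IFF_UP only reflects the administrative state (what "ip link set <iface> up"
changes); it does not say whether a carrier is present.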

Thanks,
Xiaolong


>If I execute the same call again, I get error -16 already on the first port:
>
>./dpdk-testpmd -n 4 --log-level=pmd.net.af_xdp:debug --no-pci --vdev
>net_af_xdp0,iface=enp36s0f0 --vdev net_af_xdp1,iface=enp36s0f1
>EAL: Detected 16 lcore(s)
>EAL: Detected 1 NUMA nodes
>EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>EAL: No free hugepages reported in hugepages-2048kB
>EAL: No available hugepages reported in hugepages-2048kB
>EAL: Probing VFIO support...
>rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
>rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
>testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456,
>size=2176, socket=0
>testpmd: preferred mempool ops selected: ring_mp_mc
>Configuring Port 0 (socket 0)
>xsk_configure(): Failed to create xsk socket. (-16)
>eth_rx_queue_setup(): Failed to configure xdp socket
>Fail to configure port 0 rx queues
>EAL: Error - exiting with code: 1
>  Cause: Start ports failed
>
>Software versions/commits/infos:
>
>- Linux 5.1-rc6
>- DPDK 7f251bcf22c5729792f9243480af1b3c072876a5 (19.05-rc2)
>- libbpf from https://github.com/libbpf/libbpf
>(910c475f09e5c269f441d7496c27dace30dc2335)
>- DPDK and libbpf build with meson
>
>Best regards,
>Markus
>
>On 4/24/19 8:35 AM, Ye Xiaolong wrote:
>> Hi, Markus
>>
>> On 04/23, Markus Theil wrote:
>>> Hi Xiaolong,
>>>
>>> I tested your commit "net/af_xdp: fix creating multiple instance" on the
>>> current master branch. It does not work for me in the following minimal
>>> test setting:
>>>
>>> 1) allocate 2x 1GB huge pages for DPDK
>>>
>>> 2) ip link add p1 type veth peer name p2
>>>
>>> 3) ./dpdk-testpmd --vdev=net_af_xdp0,iface=p1
>>> --vdev=net_af_xdp1,iface=p2 (I also tested this with two igb devices,
>>> with the same errors)
>> I've tested 19.05-rc2, started testpmd with 2 af_xdp vdev (with two i40e devices),
>> and it works for me.
>>
>> $ ./x86_64-native-linuxapp-gcc/app/testpmd -l 5,6 -n 4 --log-level=pmd.net.af_xdp:info -b 82:00.1 --no-pci --vdev net_af_xdp0,iface=ens786f1 --vdev net_af_xdp1,iface=ens786f0
>> EAL: Detected 88 lcore(s)
>> EAL: Detected 2 NUMA nodes
>> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>> EAL: Probing VFIO support...
>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
>> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456, size=2176, socket=0
>> testpmd: preferred mempool ops selected: ring_mp_mc
>> Configuring Port 0 (socket 0)
>> Port 0: 3C:FD:FE:C5:E2:41
>> Configuring Port 1 (socket 0)
>> Port 1: 3C:FD:FE:C5:E2:40
>> Checking link statuses...
>> Done
>> No commandline core given, start packet forwarding
>> io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support enabled, MP allocation mode: native
>> Logical Core 6 (socket 0) forwards packets on 2 streams:
>>   RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
>>   RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
>>
>>   io packet forwarding packets/burst=32
>>   nb forwarding cores=1 - nb forwarding ports=2
>>   port 0: RX queue number: 1 Tx queue number: 1
>>     Rx offloads=0x0 Tx offloads=0x0
>>     RX queue: 0
>>       RX desc=0 - RX free threshold=0
>>       RX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>       RX Offloads=0x0
>>     TX queue: 0
>>       TX desc=0 - TX free threshold=0
>>       TX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>       TX offloads=0x0 - TX RS bit threshold=0
>>   port 1: RX queue number: 1 Tx queue number: 1
>>     Rx offloads=0x0 Tx offloads=0x0
>>     RX queue: 0
>>       RX desc=0 - RX free threshold=0
>>       RX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>       RX Offloads=0x0
>>     TX queue: 0
>>       TX desc=0 - TX free threshold=0
>>       TX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>       TX offloads=0x0 - TX RS bit threshold=0
>> Press enter to exit
>>
>> Could you paste your whole failure log here?
>>> I'm using Linux 5.1-rc6 and an up to date libbpf. The setup works for
>>> the first device and fails for the second device when creating bpf maps
>>> in libbpf ("qidconf_map" or "xsks_map"). It seems, that these maps also
>>> need unique names and cannot exist twice under the same name.
>> As far as I know, there should be no such constraint; the bpf map creation
>> is wrapped in libbpf.
>>
>>> Furthermore if running step 3 again after it failed for the first time,
>>> xdp vdev allocation already fails for the first xdp vdev and does not
>>> reach the second one. Please let me know if you need some program output
>>> or more information from me.
>>>
>>> Best regards,
>>> Markus
>>>
>> Thanks,
>> Xiaolong
>>
>>> On 4/18/19 3:05 AM, Ye Xiaolong wrote:
>>>> Hi, Markus
>>>>
>>>> On 04/17, Markus Theil wrote:
>>>>> I tested the new af_xdp based device on the current master branch and
>>>>> noticed, that the usage of static mempool names allows only for the
>>>>> creation of a single af_xdp vdev. If a second vdev of the same type gets
>>>>> created, the mempool allocation fails.
>>>> Thanks for reporting, could you paste the cmdline you used and the error log?
>>>> Are you referring to ring creation or mempool creation?
>>>>
>>>>
>>>> Thanks,
>>>> Xiaolong
>>>>> Best regards,
>>>>> Markus Theil
  
Markus Theil April 24, 2019, 8:33 p.m. UTC | #30
Hi Xiaolong,

with only one vdev everything works. It stops working if I use two
vdevs. Both interfaces were brought up before testing.

Best regards,
Markus

On 24.04.19 16:47, Ye Xiaolong wrote:
> Hi, Markus
>
> On 04/24, Markus Theil wrote:
>> Hi Xiaolong,
>>
>> I also tested with i40e devices, with the same result.
>>
>> ./dpdk-testpmd -n 4 --log-level=pmd.net.af_xdp:debug --no-pci --vdev
>> net_af_xdp0,iface=enp36s0f0 --vdev net_af_xdp1,iface=enp36s0f1
>> EAL: Detected 16 lcore(s)
>> EAL: Detected 1 NUMA nodes
>> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>> EAL: No free hugepages reported in hugepages-2048kB
>> EAL: No available hugepages reported in hugepages-2048kB
>> EAL: Probing VFIO support...
>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
>> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456,
>> size=2176, socket=0
>> testpmd: preferred mempool ops selected: ring_mp_mc
>> Configuring Port 0 (socket 0)
>> Port 0: 3C:FD:FE:A3:E7:30
>> Configuring Port 1 (socket 0)
>> xsk_configure(): Failed to create xsk socket. (-1)
>> eth_rx_queue_setup(): Failed to configure xdp socket
>> Fail to configure port 1 rx queues
>> EAL: Error - exiting with code: 1
>>   Cause: Start ports failed
>>
> Does a single vdev instance work on your side? And have you brought up the interfaces?
> xsk_configure requires the interface to be in the up state.
>
> Thanks,
> Xiaolong
>
>
>> If I execute the same call again, I get error -16 already on the first port:
>>
>> ./dpdk-testpmd -n 4 --log-level=pmd.net.af_xdp:debug --no-pci --vdev
>> net_af_xdp0,iface=enp36s0f0 --vdev net_af_xdp1,iface=enp36s0f1
>> EAL: Detected 16 lcore(s)
>> EAL: Detected 1 NUMA nodes
>> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>> EAL: No free hugepages reported in hugepages-2048kB
>> EAL: No available hugepages reported in hugepages-2048kB
>> EAL: Probing VFIO support...
>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
>> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456,
>> size=2176, socket=0
>> testpmd: preferred mempool ops selected: ring_mp_mc
>> Configuring Port 0 (socket 0)
>> xsk_configure(): Failed to create xsk socket. (-16)
>> eth_rx_queue_setup(): Failed to configure xdp socket
>> Fail to configure port 0 rx queues
>> EAL: Error - exiting with code: 1
>>   Cause: Start ports failed
>>
>> Software versions/commits/infos:
>>
>> - Linux 5.1-rc6
>> - DPDK 7f251bcf22c5729792f9243480af1b3c072876a5 (19.05-rc2)
>> - libbpf from https://github.com/libbpf/libbpf
>> (910c475f09e5c269f441d7496c27dace30dc2335)
>> - DPDK and libbpf build with meson
>>
>> Best regards,
>> Markus
>>
>> On 4/24/19 8:35 AM, Ye Xiaolong wrote:
>>> Hi, Markus
>>>
>>> On 04/23, Markus Theil wrote:
>>>> Hi Xiaolong,
>>>>
>>>> I tested your commit "net/af_xdp: fix creating multiple instance" on the
>>>> current master branch. It does not work for me in the following minimal
>>>> test setting:
>>>>
>>>> 1) allocate 2x 1GB huge pages for DPDK
>>>>
>>>> 2) ip link add p1 type veth peer name p2
>>>>
>>>> 3) ./dpdk-testpmd --vdev=net_af_xdp0,iface=p1
>>>> --vdev=net_af_xdp1,iface=p2 (I also tested this with two igb devices,
>>>> with the same errors)
>>> I've tested 19.05-rc2, started testpmd with 2 af_xdp vdev (with two i40e devices),
>>> and it works for me.
>>>
>>> $ ./x86_64-native-linuxapp-gcc/app/testpmd -l 5,6 -n 4 --log-level=pmd.net.af_xdp:info -b 82:00.1 --no-pci --vdev net_af_xdp0,iface=ens786f1 --vdev net_af_xdp1,iface=ens786f0
>>> EAL: Detected 88 lcore(s)
>>> EAL: Detected 2 NUMA nodes
>>> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>>> EAL: Probing VFIO support...
>>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
>>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
>>> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456, size=2176, socket=0
>>> testpmd: preferred mempool ops selected: ring_mp_mc
>>> Configuring Port 0 (socket 0)
>>> Port 0: 3C:FD:FE:C5:E2:41
>>> Configuring Port 1 (socket 0)
>>> Port 1: 3C:FD:FE:C5:E2:40
>>> Checking link statuses...
>>> Done
>>> No commandline core given, start packet forwarding
>>> io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support enabled, MP allocation mode: native
>>> Logical Core 6 (socket 0) forwards packets on 2 streams:
>>>   RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
>>>   RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
>>>
>>>   io packet forwarding packets/burst=32
>>>   nb forwarding cores=1 - nb forwarding ports=2
>>>   port 0: RX queue number: 1 Tx queue number: 1
>>>     Rx offloads=0x0 Tx offloads=0x0
>>>     RX queue: 0
>>>       RX desc=0 - RX free threshold=0
>>>       RX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>>       RX Offloads=0x0
>>>     TX queue: 0
>>>       TX desc=0 - TX free threshold=0
>>>       TX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>>       TX offloads=0x0 - TX RS bit threshold=0
>>>   port 1: RX queue number: 1 Tx queue number: 1
>>>     Rx offloads=0x0 Tx offloads=0x0
>>>     RX queue: 0
>>>       RX desc=0 - RX free threshold=0
>>>       RX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>>       RX Offloads=0x0
>>>     TX queue: 0
>>>       TX desc=0 - TX free threshold=0
>>>       TX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>>       TX offloads=0x0 - TX RS bit threshold=0
>>> Press enter to exit
>>>
>>> Could you paste your whole failure log here?
>>>> I'm using Linux 5.1-rc6 and an up to date libbpf. The setup works for
>>>> the first device and fails for the second device when creating bpf maps
>>>> in libbpf ("qidconf_map" or "xsks_map"). It seems, that these maps also
>>>> need unique names and cannot exist twice under the same name.
>>> As far as I know, there should be no such constraint; the bpf map creation
>>> is wrapped in libbpf.
>>>
>>>> Furthermore if running step 3 again after it failed for the first time,
>>>> xdp vdev allocation already fails for the first xdp vdev and does not
>>>> reach the second one. Please let me know if you need some program output
>>>> or more information from me.
>>>>
>>>> Best regards,
>>>> Markus
>>>>
>>> Thanks,
>>> Xiaolong
>>>
>>>> On 4/18/19 3:05 AM, Ye Xiaolong wrote:
>>>>> Hi, Markus
>>>>>
>>>>> On 04/17, Markus Theil wrote:
>>>>>> I tested the new af_xdp based device on the current master branch and
>>>>>> noticed, that the usage of static mempool names allows only for the
>>>>>> creation of a single af_xdp vdev. If a second vdev of the same type gets
>>>>>> created, the mempool allocation fails.
>>>>> Thanks for reporting, could you paste the cmdline you used and the error log?
>>>>> Are you referring to ring creation or mempool creation?
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Xiaolong
>>>>>> Best regards,
>>>>>> Markus Theil
  
Xiaolong Ye April 25, 2019, 5:43 a.m. UTC | #31
Hi, Markus

On 04/24, Markus Theil wrote:
>Hi Xiaolong,
>
>I also tested with i40e devices, with the same result.
>
>./dpdk-testpmd -n 4 --log-level=pmd.net.af_xdp:debug --no-pci --vdev
>net_af_xdp0,iface=enp36s0f0 --vdev net_af_xdp1,iface=enp36s0f1
>EAL: Detected 16 lcore(s)
>EAL: Detected 1 NUMA nodes
>EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>EAL: No free hugepages reported in hugepages-2048kB
>EAL: No available hugepages reported in hugepages-2048kB
>EAL: Probing VFIO support...
>rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
>rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
>testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456,
>size=2176, socket=0
>testpmd: preferred mempool ops selected: ring_mp_mc
>Configuring Port 0 (socket 0)
>Port 0: 3C:FD:FE:A3:E7:30
>Configuring Port 1 (socket 0)
>xsk_configure(): Failed to create xsk socket. (-1)
>eth_rx_queue_setup(): Failed to configure xdp socket
>Fail to configure port 1 rx queues
>EAL: Error - exiting with code: 1
>  Cause: Start ports failed

A (-1) error typically means "Operation not permitted". Is there any special
configuration on your interfaces, and were you running with root privileges?
Also, out of curiosity, how did you get (-1) in your log? Did you add a private
patch to print the errno?
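
As a quick way to decode such values, here is a minimal sketch that maps a
negative-errno style return code to its message. The helper name is
hypothetical. It assumes the printed number really is a negated errno; a bare
-1 can also be a generic failure return. On Linux, 1 is EPERM and 16 is EBUSY,
so the (-16) seen on the second run would read as "Device or resource busy".

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* The numbering discussed in the thread; fail the build if the platform
 * uses different values, since the mapping below would then mislead. */
static_assert(EPERM == 1 && EBUSY == 16, "unexpected errno numbering");

/* Translate a negative-errno return value into a human-readable string. */
const char *neg_errno_str(int ret)
{
	if (ret >= 0)
		return "success";
	return strerror(-ret);
}
```

Logging neg_errno_str(ret) next to the raw number makes failures such as
(-1) and (-16) easier to tell apart at a glance.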

Thanks,
Xiaolong

>
>If I execute the same call again, I get error -16 already on the first port:
>
>./dpdk-testpmd -n 4 --log-level=pmd.net.af_xdp:debug --no-pci --vdev
>net_af_xdp0,iface=enp36s0f0 --vdev net_af_xdp1,iface=enp36s0f1
>EAL: Detected 16 lcore(s)
>EAL: Detected 1 NUMA nodes
>EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>EAL: No free hugepages reported in hugepages-2048kB
>EAL: No available hugepages reported in hugepages-2048kB
>EAL: Probing VFIO support...
>rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
>rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
>testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=267456,
>size=2176, socket=0
>testpmd: preferred mempool ops selected: ring_mp_mc
>Configuring Port 0 (socket 0)
>xsk_configure(): Failed to create xsk socket. (-16)
>eth_rx_queue_setup(): Failed to configure xdp socket
>Fail to configure port 0 rx queues
>EAL: Error - exiting with code: 1
>  Cause: Start ports failed
>
>Software versions/commits/infos:
>
>- Linux 5.1-rc6
>- DPDK 7f251bcf22c5729792f9243480af1b3c072876a5 (19.05-rc2)
>- libbpf from https://github.com/libbpf/libbpf
>(910c475f09e5c269f441d7496c27dace30dc2335)
>- DPDK and libbpf build with meson
>
>Best regards,
>Markus
>
>On 4/24/19 8:35 AM, Ye Xiaolong wrote:
>> Hi, Markus
>>
>> On 04/23, Markus Theil wrote:
>>> Hi Xiaolong,
>>>
>>> I tested your commit "net/af_xdp: fix creating multiple instance" on the
>>> current master branch. It does not work for me in the following minimal
>>> test setting:
>>>
>>> 1) allocate 2x 1GB huge pages for DPDK
>>>
>>> 2) ip link add p1 type veth peer name p2
>>>
>>> 3) ./dpdk-testpmd --vdev=net_af_xdp0,iface=p1
>>> --vdev=net_af_xdp1,iface=p2 (I also tested this with two igb devices,
>>> with the same errors)
>> I've tested 19.05-rc2, started testpmd with 2 af_xdp vdev (with two i40e devices),
>> and it works for me.
>>
>> $ ./x86_64-native-linuxapp-gcc/app/testpmd -l 5,6 -n 4 --log-level=pmd.net.af_xdp:info -b 82:00.1 --no-pci --vdev net_af_xdp0,iface=ens786f1 --vdev net_af_xdp1,iface=ens786f0
>> EAL: Detected 88 lcore(s)
>> EAL: Detected 2 NUMA nodes
>> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>> EAL: Probing VFIO support...
>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp0
>> rte_pmd_af_xdp_probe(): Initializing pmd_af_xdp for net_af_xdp1
>> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=155456, size=2176, socket=0
>> testpmd: preferred mempool ops selected: ring_mp_mc
>> Configuring Port 0 (socket 0)
>> Port 0: 3C:FD:FE:C5:E2:41
>> Configuring Port 1 (socket 0)
>> Port 1: 3C:FD:FE:C5:E2:40
>> Checking link statuses...
>> Done
>> No commandline core given, start packet forwarding
>> io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support enabled, MP allocation mode: native
>> Logical Core 6 (socket 0) forwards packets on 2 streams:
>>   RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
>>   RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
>>
>>   io packet forwarding packets/burst=32
>>   nb forwarding cores=1 - nb forwarding ports=2
>>   port 0: RX queue number: 1 Tx queue number: 1
>>     Rx offloads=0x0 Tx offloads=0x0
>>     RX queue: 0
>>       RX desc=0 - RX free threshold=0
>>       RX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>       RX Offloads=0x0
>>     TX queue: 0
>>       TX desc=0 - TX free threshold=0
>>       TX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>       TX offloads=0x0 - TX RS bit threshold=0
>>   port 1: RX queue number: 1 Tx queue number: 1
>>     Rx offloads=0x0 Tx offloads=0x0
>>     RX queue: 0
>>       RX desc=0 - RX free threshold=0
>>       RX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>       RX Offloads=0x0
>>     TX queue: 0
>>       TX desc=0 - TX free threshold=0
>>       TX threshold registers: pthresh=0 hthresh=0  wthresh=0
>>       TX offloads=0x0 - TX RS bit threshold=0
>> Press enter to exit
>>
>> Could you paste your whole failure log here?
>>> I'm using Linux 5.1-rc6 and an up to date libbpf. The setup works for
>>> the first device and fails for the second device when creating bpf maps
>>> in libbpf ("qidconf_map" or "xsks_map"). It seems, that these maps also
>>> need unique names and cannot exist twice under the same name.
>> As far as I know, there should be no such constraint; the bpf map creation
>> is wrapped in libbpf.
>>
>>> Furthermore if running step 3 again after it failed for the first time,
>>> xdp vdev allocation already fails for the first xdp vdev and does not
>>> reach the second one. Please let me know if you need some program output
>>> or more information from me.
>>>
>>> Best regards,
>>> Markus
>>>
>> Thanks,
>> Xiaolong
>>
>>> On 4/18/19 3:05 AM, Ye Xiaolong wrote:
>>>> Hi, Markus
>>>>
>>>> On 04/17, Markus Theil wrote:
>>>>> I tested the new af_xdp based device on the current master branch and
>>>>> noticed, that the usage of static mempool names allows only for the
>>>>> creation of a single af_xdp vdev. If a second vdev of the same type gets
>>>>> created, the mempool allocation fails.
>>>> Thanks for reporting, could you paste the cmdline you used and the error log?
>>>> Are you referring to ring creation or mempool creation?
>>>>
>>>>
>>>> Thanks,
>>>> Xiaolong
>>>>> Best regards,
>>>>> Markus Theil
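The single-instance limitation reported at the start of this thread follows from the fixed object names in the patch below: `rte_ring_create("af_xdp_ring", ...)` and `rte_memzone_reserve_aligned("af_xdp uemem", ...)` use the same name for every vdev, while DPDK requires ring and memzone names to be unique. A minimal sketch of one way to derive per-device names, assuming the interface name plus queue index is enough to tell instances apart. The helper `af_xdp_make_name` and the `NAME_MAX_LEN` bound are hypothetical and not part of the patch or of DPDK:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical bound for this sketch; DPDK has its own name-size limits. */
#define NAME_MAX_LEN 32

/*
 * Build "<prefix>_<ifname>_<queue>" into buf.
 * Returns 0 on success, -1 if the result would not fit. The helper refuses
 * to truncate, because a truncated name could collide with another device.
 */
static int
af_xdp_make_name(char *buf, size_t len, const char *prefix,
		 const char *if_name, unsigned int queue_idx)
{
	int n = snprintf(buf, len, "%s_%s_%u", prefix, if_name, queue_idx);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

With the veth pair from the report, `p1` and `p2` would yield `af_xdp_ring_p1_0` and `af_xdp_ring_p2_0`, so the second vdev would no longer reuse the first one's ring name. This only covers the name clash; it says nothing about the `-16` (`EBUSY`) failure from `xsk_socket__create()` in the later log.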
  

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index e9ff2b4c2..c13ae8215 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -479,6 +479,13 @@  M: John W. Linville <linville@tuxdriver.com>
 F: drivers/net/af_packet/
 F: doc/guides/nics/features/afpacket.ini
 
+Linux AF_XDP
+M: Xiaolong Ye <xiaolong.ye@intel.com>
+M: Qi Zhang <qi.z.zhang@intel.com>
+F: drivers/net/af_xdp/
+F: doc/guides/nics/af_xdp.rst
+F: doc/guides/nics/features/af_xdp.ini
+
 Amazon ENA
 M: Marcin Wojtas <mw@semihalf.com>
 M: Michal Krawczyk <mk@semihalf.com>
diff --git a/config/common_base b/config/common_base
index 6292bc4af..b95ee03d7 100644
--- a/config/common_base
+++ b/config/common_base
@@ -430,6 +430,11 @@  CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 #
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=n
 
+#
+# Compile software PMD backed by AF_XDP sockets (Linux only)
+#
+CONFIG_RTE_LIBRTE_PMD_AF_XDP=n
+
 #
 # Compile link bonding PMD library
 #
diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
new file mode 100644
index 000000000..af675d910
--- /dev/null
+++ b/doc/guides/nics/af_xdp.rst
@@ -0,0 +1,48 @@ 
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2019 Intel Corporation.
+
+AF_XDP Poll Mode Driver
+==========================
+
+AF_XDP is an address family that is optimized for high performance
+packet processing. AF_XDP sockets make it possible for an XDP program to
+redirect packets to a memory buffer in userspace.
+
+For the full details behind AF_XDP socket, you can refer to
+`AF_XDP documentation in the Kernel
+<https://www.kernel.org/doc/Documentation/networking/af_xdp.rst>`_.
+
+This Linux-specific PMD creates an AF_XDP socket and binds it to a
+specific netdev queue. It allows a DPDK application to send and receive raw
+packets through the socket, bypassing the kernel network stack.
+The current implementation supports only a single queue; multi-queue support
+will be added later.
+
+Note that the MTU of the AF_XDP PMD is limited because XDP lacks support for
+fragmentation.
+
+Options
+-------
+
+The following options can be provided to set up an af_xdp port in DPDK.
+
+*   ``iface`` - name of the Kernel interface to attach to (required);
+*   ``queue`` - netdev queue id (optional, default 0);
+
+Prerequisites
+-------------
+
+This is a Linux-specific PMD, thus the following prerequisites apply:
+
+*  A Linux Kernel (version > 4.18) with XDP sockets configuration enabled;
+*  libbpf (within kernel version > 5.1) with latest af_xdp support installed;
+*  A Kernel bound interface to attach to.
+
+Set up an af_xdp interface
+-----------------------------
+
+The following example will set up an af_xdp interface in DPDK:
+
+.. code-block:: console
+
+    --vdev net_af_xdp,iface=ens786f1,queue=0
diff --git a/doc/guides/nics/features/af_xdp.ini b/doc/guides/nics/features/af_xdp.ini
new file mode 100644
index 000000000..36953c2de
--- /dev/null
+++ b/doc/guides/nics/features/af_xdp.ini
@@ -0,0 +1,11 @@ 
+;
+; Supported features of the 'af_xdp' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Link status          = Y
+MTU update           = Y
+Promiscuous mode     = Y
+Stats per queue      = Y
+x86-64               = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 5c80e3baa..a4b80a3d0 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -12,6 +12,7 @@  Network Interface Controller Drivers
     features
     build_and_test
     af_packet
+    af_xdp
     ark
     atlantic
     avp
diff --git a/doc/guides/rel_notes/release_19_05.rst b/doc/guides/rel_notes/release_19_05.rst
index bdad1ddbe..79e36739f 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -74,6 +74,13 @@  New Features
     process.
   * Added support for Rx packet types list in a secondary process.
 
+* **Added the AF_XDP PMD.**
+
+  Added a Linux-specific PMD for AF_XDP. It creates an AF_XDP socket and
+  binds it to a specific netdev queue, allowing a DPDK application to send
+  and receive raw packets through the socket, bypassing the kernel network
+  stack to achieve high performance packet processing.
+
 * **Updated Mellanox drivers.**
 
    New features and improvements were done in mlx4 and mlx5 PMDs:
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 502869a87..5d401b8c5 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -9,6 +9,7 @@  ifeq ($(CONFIG_RTE_LIBRTE_THUNDERX_NICVF_PMD),d)
 endif
 
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += af_packet
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_XDP) += af_xdp
 DIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark
 DIRS-$(CONFIG_RTE_LIBRTE_ATLANTIC_PMD) += atlantic
 DIRS-$(CONFIG_RTE_LIBRTE_AVP_PMD) += avp
diff --git a/drivers/net/af_xdp/Makefile b/drivers/net/af_xdp/Makefile
new file mode 100644
index 000000000..8343e3016
--- /dev/null
+++ b/drivers/net/af_xdp/Makefile
@@ -0,0 +1,32 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2019 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_af_xdp.a
+
+EXPORT_MAP := rte_pmd_af_xdp_version.map
+
+LIBABIVER := 1
+
+CFLAGS += -O3
+
+# require kernel version >= v5.1-rc1
+CFLAGS += -I$(RTE_KERNELDIR)/tools/include
+CFLAGS += -I$(RTE_KERNELDIR)/tools/lib/bpf
+
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
+LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
+LDLIBS += -lrte_bus_vdev
+LDLIBS += -lbpf
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_AF_XDP) += rte_eth_af_xdp.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/af_xdp/meson.build b/drivers/net/af_xdp/meson.build
new file mode 100644
index 000000000..d40aae190
--- /dev/null
+++ b/drivers/net/af_xdp/meson.build
@@ -0,0 +1,21 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2019 Intel Corporation
+
+if host_machine.system() != 'linux'
+	build = false
+endif
+
+bpf_dep = dependency('libbpf', required: false)
+if bpf_dep.found()
+	build = true
+else
+	bpf_dep = cc.find_library('libbpf', required: false)
+	if bpf_dep.found() and cc.has_header('xsk.h', dependencies: bpf_dep) and cc.has_header('linux/if_xdp.h')
+		build = true
+		pkgconfig_extra_libs += '-lbpf'
+	else
+		build = false
+	endif
+endif
+sources = files('rte_eth_af_xdp.c')
+ext_deps += bpf_dep
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
new file mode 100644
index 000000000..628b160a2
--- /dev/null
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -0,0 +1,956 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation.
+ */
+#include <unistd.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+#include <netinet/in.h>
+#include <net/if.h>
+#include <bpf/bpf.h>
+#include <sys/socket.h>
+#include <sys/ioctl.h>
+#include <linux/if_ether.h>
+#include <linux/if_xdp.h>
+#include <linux/if_link.h>
+#include <asm/barrier.h>
+#include <xsk.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_driver.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_kvargs.h>
+#include <rte_bus_vdev.h>
+#include <rte_string_fns.h>
+#include <rte_branch_prediction.h>
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_eal.h>
+#include <rte_ether.h>
+#include <rte_lcore.h>
+#include <rte_log.h>
+#include <rte_memory.h>
+#include <rte_memzone.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+#include <rte_ring.h>
+
+#ifndef SOL_XDP
+#define SOL_XDP 283
+#endif
+
+#ifndef AF_XDP
+#define AF_XDP 44
+#endif
+
+#ifndef PF_XDP
+#define PF_XDP AF_XDP
+#endif
+
+static int af_xdp_logtype;
+
+#define AF_XDP_LOG(level, fmt, args...)			\
+	rte_log(RTE_LOG_ ## level, af_xdp_logtype,	\
+		"%s(): " fmt, __func__, ##args)
+
+#define ETH_AF_XDP_FRAME_SIZE		XSK_UMEM__DEFAULT_FRAME_SIZE
+#define ETH_AF_XDP_NUM_BUFFERS		4096
+#define ETH_AF_XDP_DATA_HEADROOM	0
+#define ETH_AF_XDP_DFLT_NUM_DESCS	XSK_RING_CONS__DEFAULT_NUM_DESCS
+#define ETH_AF_XDP_DFLT_QUEUE_IDX	0
+
+#define ETH_AF_XDP_RX_BATCH_SIZE	32
+#define ETH_AF_XDP_TX_BATCH_SIZE	32
+
+#define ETH_AF_XDP_MAX_QUEUE_PAIRS     16
+
+struct xsk_umem_info {
+	struct xsk_ring_prod fq;
+	struct xsk_ring_cons cq;
+	struct xsk_umem *umem;
+	struct rte_ring *buf_ring;
+	const struct rte_memzone *mz;
+};
+
+struct rx_stats {
+	uint64_t rx_pkts;
+	uint64_t rx_bytes;
+	uint64_t rx_dropped;
+};
+
+struct pkt_rx_queue {
+	struct xsk_ring_cons rx;
+	struct xsk_umem_info *umem;
+	struct xsk_socket *xsk;
+	struct rte_mempool *mb_pool;
+
+	struct rx_stats stats;
+
+	struct pkt_tx_queue *pair;
+	uint16_t queue_idx;
+};
+
+struct tx_stats {
+	uint64_t tx_pkts;
+	uint64_t err_pkts;
+	uint64_t tx_bytes;
+};
+
+struct pkt_tx_queue {
+	struct xsk_ring_prod tx;
+
+	struct tx_stats stats;
+
+	struct pkt_rx_queue *pair;
+	uint16_t queue_idx;
+};
+
+struct pmd_internals {
+	int if_index;
+	char if_name[IFNAMSIZ];
+	uint16_t queue_idx;
+	struct ether_addr eth_addr;
+	struct xsk_umem_info *umem;
+	struct rte_mempool *mb_pool_share;
+
+	struct pkt_rx_queue rx_queues[ETH_AF_XDP_MAX_QUEUE_PAIRS];
+	struct pkt_tx_queue tx_queues[ETH_AF_XDP_MAX_QUEUE_PAIRS];
+};
+
+#define ETH_AF_XDP_IFACE_ARG			"iface"
+#define ETH_AF_XDP_QUEUE_IDX_ARG		"queue"
+
+static const char * const valid_arguments[] = {
+	ETH_AF_XDP_IFACE_ARG,
+	ETH_AF_XDP_QUEUE_IDX_ARG,
+	NULL
+};
+
+static const struct rte_eth_link pmd_link = {
+	.link_speed = ETH_SPEED_NUM_10G,
+	.link_duplex = ETH_LINK_FULL_DUPLEX,
+	.link_status = ETH_LINK_DOWN,
+	.link_autoneg = ETH_LINK_AUTONEG
+};
+
+static inline int
+reserve_fill_queue(struct xsk_umem_info *umem, int reserve_size)
+{
+	struct xsk_ring_prod *fq = &umem->fq;
+	uint32_t idx;
+	int i, ret;
+
+	ret = xsk_ring_prod__reserve(fq, reserve_size, &idx);
+	if (unlikely(!ret)) {
+		AF_XDP_LOG(ERR, "Failed to reserve enough fq descs.\n");
+		return -1;
+	}
+
+	for (i = 0; i < reserve_size; i++) {
+		__u64 *fq_addr;
+		void *addr = NULL;
+		if (rte_ring_dequeue(umem->buf_ring, &addr)) {
+			i--;
+			break;
+		}
+		fq_addr = xsk_ring_prod__fill_addr(fq, idx++);
+		*fq_addr = (uint64_t)addr;
+	}
+
+	xsk_ring_prod__submit(fq, i);
+
+	return 0;
+}
+
+static uint16_t
+eth_af_xdp_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct pkt_rx_queue *rxq = queue;
+	struct xsk_ring_cons *rx = &rxq->rx;
+	struct xsk_umem_info *umem = rxq->umem;
+	struct xsk_ring_prod *fq = &umem->fq;
+	uint32_t idx_rx;
+	uint32_t free_thresh = fq->size >> 1;
+	struct rte_mbuf *mbufs[ETH_AF_XDP_TX_BATCH_SIZE];
+	unsigned long dropped = 0;
+	unsigned long rx_bytes = 0;
+	uint16_t count = 0;
+	int rcvd, i;
+
+	nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_TX_BATCH_SIZE);
+
+	rcvd = xsk_ring_cons__peek(rx, nb_pkts, &idx_rx);
+	if (rcvd == 0)
+		return 0;
+
+	if (xsk_prod_nb_free(fq, free_thresh) >= free_thresh)
+		(void)reserve_fill_queue(umem, ETH_AF_XDP_RX_BATCH_SIZE);
+
+	if (unlikely(rte_pktmbuf_alloc_bulk(rxq->mb_pool, mbufs, rcvd) != 0))
+		return 0;
+
+	for (i = 0; i < rcvd; i++) {
+		const struct xdp_desc *desc;
+		uint64_t addr;
+		uint32_t len;
+		void *pkt;
+
+		desc = xsk_ring_cons__rx_desc(rx, idx_rx++);
+		addr = desc->addr;
+		len = desc->len;
+		pkt = xsk_umem__get_data(rxq->umem->mz->addr, addr);
+
+		rte_memcpy(rte_pktmbuf_mtod(mbufs[i], void *), pkt, len);
+		rte_pktmbuf_pkt_len(mbufs[i]) = len;
+		rte_pktmbuf_data_len(mbufs[i]) = len;
+		rx_bytes += len;
+		bufs[count++] = mbufs[i];
+
+		rte_ring_enqueue(umem->buf_ring, (void *)addr);
+	}
+
+	xsk_ring_cons__release(rx, rcvd);
+
+	/* statistics */
+	rxq->stats.rx_pkts += (rcvd - dropped);
+	rxq->stats.rx_bytes += rx_bytes;
+
+	return count;
+}
+
+static void
+pull_umem_cq(struct xsk_umem_info *umem, int size)
+{
+	struct xsk_ring_cons *cq = &umem->cq;
+	size_t i, n;
+	uint32_t idx_cq;
+
+	n = xsk_ring_cons__peek(cq, size, &idx_cq);
+
+	for (i = 0; i < n; i++) {
+		uint64_t addr;
+		addr = *xsk_ring_cons__comp_addr(cq, idx_cq++);
+		rte_ring_enqueue(umem->buf_ring, (void *)addr);
+	}
+
+	xsk_ring_cons__release(cq, n);
+}
+
+static void
+kick_tx(struct pkt_tx_queue *txq)
+{
+	struct xsk_umem_info *umem = txq->pair->umem;
+
+	while (send(xsk_socket__fd(txq->pair->xsk), NULL,
+		      0, MSG_DONTWAIT) < 0) {
+		/* something unexpected */
+		if (errno != EBUSY && errno != EAGAIN && errno != EINTR)
+			break;
+
+		/* pull from completion queue to leave more space */
+		if (errno == EAGAIN)
+			pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
+	}
+	pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
+}
+
+static uint16_t
+eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	struct pkt_tx_queue *txq = queue;
+	struct xsk_umem_info *umem = txq->pair->umem;
+	struct rte_mbuf *mbuf;
+	void *addrs[ETH_AF_XDP_TX_BATCH_SIZE];
+	unsigned long tx_bytes = 0;
+	int i, valid = 0;
+	uint32_t idx_tx;
+
+	nb_pkts = RTE_MIN(nb_pkts, ETH_AF_XDP_TX_BATCH_SIZE);
+
+	pull_umem_cq(umem, nb_pkts);
+
+	nb_pkts = rte_ring_dequeue_bulk(umem->buf_ring, addrs,
+					nb_pkts, NULL);
+	if (nb_pkts == 0)
+		return 0;
+
+	if (xsk_ring_prod__reserve(&txq->tx, nb_pkts, &idx_tx) != nb_pkts) {
+		kick_tx(txq);
+		return 0;
+	}
+
+	for (i = 0; i < nb_pkts; i++) {
+		struct xdp_desc *desc;
+		void *pkt;
+		uint32_t buf_len = ETH_AF_XDP_FRAME_SIZE
+					- ETH_AF_XDP_DATA_HEADROOM;
+		desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx + i);
+		mbuf = bufs[i];
+		if (mbuf->pkt_len <= buf_len) {
+			desc->addr = (uint64_t)addrs[valid];
+			desc->len = mbuf->pkt_len;
+			pkt = xsk_umem__get_data(umem->mz->addr,
+						 desc->addr);
+			rte_memcpy(pkt, rte_pktmbuf_mtod(mbuf, void *),
+			       desc->len);
+			valid++;
+			tx_bytes += mbuf->pkt_len;
+		}
+		rte_pktmbuf_free(mbuf);
+	}
+
+	xsk_ring_prod__submit(&txq->tx, nb_pkts);
+
+	kick_tx(txq);
+
+	if (valid < nb_pkts)
+		rte_ring_enqueue_bulk(umem->buf_ring, &addrs[valid],
+				 nb_pkts - valid, NULL);
+
+	txq->stats.err_pkts += nb_pkts - valid;
+	txq->stats.tx_pkts += valid;
+	txq->stats.tx_bytes += tx_bytes;
+
+	return nb_pkts;
+}
+
+static int
+eth_dev_start(struct rte_eth_dev *dev)
+{
+	dev->data->dev_link.link_status = ETH_LINK_UP;
+
+	return 0;
+}
+
+/* This function gets called when the current port gets stopped. */
+static void
+eth_dev_stop(struct rte_eth_dev *dev)
+{
+	dev->data->dev_link.link_status = ETH_LINK_DOWN;
+}
+
+static int
+eth_dev_configure(struct rte_eth_dev *dev)
+{
+	/* rx/tx must be paired */
+	if (dev->data->nb_rx_queues != dev->data->nb_tx_queues)
+		return -EINVAL;
+
+	return 0;
+}
+
+static void
+eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+
+	dev_info->if_index = internals->if_index;
+	dev_info->max_mac_addrs = 1;
+	dev_info->max_rx_pktlen = ETH_FRAME_LEN;
+	dev_info->max_rx_queues = 1;
+	dev_info->max_tx_queues = 1;
+
+	dev_info->default_rxportconf.nb_queues = 1;
+	dev_info->default_txportconf.nb_queues = 1;
+	dev_info->default_rxportconf.ring_size = ETH_AF_XDP_DFLT_NUM_DESCS;
+	dev_info->default_txportconf.ring_size = ETH_AF_XDP_DFLT_NUM_DESCS;
+}
+
+static int
+eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct xdp_statistics xdp_stats;
+	struct pkt_rx_queue *rxq;
+	socklen_t optlen;
+	int i, ret;
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		optlen = sizeof(struct xdp_statistics);
+		rxq = &internals->rx_queues[i];
+		stats->q_ipackets[i] = internals->rx_queues[i].stats.rx_pkts;
+		stats->q_ibytes[i] = internals->rx_queues[i].stats.rx_bytes;
+
+		stats->q_opackets[i] = internals->tx_queues[i].stats.tx_pkts;
+		stats->q_obytes[i] = internals->tx_queues[i].stats.tx_bytes;
+
+		stats->ipackets += stats->q_ipackets[i];
+		stats->ibytes += stats->q_ibytes[i];
+		stats->imissed += internals->rx_queues[i].stats.rx_dropped;
+		ret = getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP,
+				XDP_STATISTICS, &xdp_stats, &optlen);
+		if (ret != 0) {
+			AF_XDP_LOG(ERR, "getsockopt() failed for XDP_STATISTICS.\n");
+			return -1;
+		}
+		stats->imissed += xdp_stats.rx_dropped;
+
+		stats->opackets += stats->q_opackets[i];
+		stats->oerrors += internals->tx_queues[i].stats.err_pkts;
+		stats->obytes += stats->q_obytes[i];
+	}
+
+	return 0;
+}
+
+static void
+eth_stats_reset(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	int i;
+
+	for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) {
+		memset(&internals->rx_queues[i].stats, 0,
+					sizeof(struct rx_stats));
+		memset(&internals->tx_queues[i].stats, 0,
+					sizeof(struct tx_stats));
+	}
+}
+
+static void
+remove_xdp_program(struct pmd_internals *internals)
+{
+	uint32_t curr_prog_id = 0;
+
+	if (bpf_get_link_xdp_id(internals->if_index, &curr_prog_id,
+				XDP_FLAGS_UPDATE_IF_NOEXIST)) {
+		AF_XDP_LOG(ERR, "bpf_get_link_xdp_id failed\n");
+		return;
+	}
+	bpf_set_link_xdp_fd(internals->if_index, -1,
+			XDP_FLAGS_UPDATE_IF_NOEXIST);
+}
+
+static void
+eth_dev_close(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct pkt_rx_queue *rxq;
+	int i;
+
+	AF_XDP_LOG(INFO, "Closing AF_XDP ethdev on numa socket %u\n",
+		rte_socket_id());
+
+	for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) {
+		rxq = &internals->rx_queues[i];
+		if (rxq->umem == NULL)
+			break;
+		xsk_socket__delete(rxq->xsk);
+	}
+
+	(void)xsk_umem__delete(internals->umem->umem);
+	remove_xdp_program(internals);
+}
+
+static void
+eth_queue_release(void *q __rte_unused)
+{
+}
+
+static int
+eth_link_update(struct rte_eth_dev *dev __rte_unused,
+		int wait_to_complete __rte_unused)
+{
+	return 0;
+}
+
+static void
+xdp_umem_destroy(struct xsk_umem_info *umem)
+{
+	rte_memzone_free(umem->mz);
+	umem->mz = NULL;
+
+	rte_ring_free(umem->buf_ring);
+	umem->buf_ring = NULL;
+
+	rte_free(umem);
+	umem = NULL;
+}
+
+static struct
+xsk_umem_info *xdp_umem_configure(void)
+{
+	struct xsk_umem_info *umem;
+	const struct rte_memzone *mz;
+	struct xsk_umem_config usr_config = {
+		.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS,
+		.comp_size = ETH_AF_XDP_DFLT_NUM_DESCS,
+		.frame_size = ETH_AF_XDP_FRAME_SIZE,
+		.frame_headroom = ETH_AF_XDP_DATA_HEADROOM };
+	int ret;
+	uint64_t i;
+
+	umem = rte_zmalloc_socket("umem", sizeof(*umem), 0, rte_socket_id());
+	if (umem == NULL) {
+		AF_XDP_LOG(ERR, "Failed to allocate umem info\n");
+		return NULL;
+	}
+
+	umem->buf_ring = rte_ring_create("af_xdp_ring",
+					 ETH_AF_XDP_NUM_BUFFERS,
+					 rte_socket_id(),
+					 0x0);
+	if (umem->buf_ring == NULL) {
+		AF_XDP_LOG(ERR, "Failed to create rte_ring\n");
+		goto err;
+	}
+
+	for (i = 0; i < ETH_AF_XDP_NUM_BUFFERS; i++)
+		rte_ring_enqueue(umem->buf_ring,
+				 (void *)(i * ETH_AF_XDP_FRAME_SIZE +
+					  ETH_AF_XDP_DATA_HEADROOM));
+
+	mz = rte_memzone_reserve_aligned("af_xdp umem",
+			ETH_AF_XDP_NUM_BUFFERS * ETH_AF_XDP_FRAME_SIZE,
+			rte_socket_id(), RTE_MEMZONE_IOVA_CONTIG,
+			getpagesize());
+	if (mz == NULL) {
+		AF_XDP_LOG(ERR, "Failed to reserve memzone for af_xdp umem.\n");
+		goto err;
+	}
+
+	ret = xsk_umem__create(&umem->umem, mz->addr,
+			       ETH_AF_XDP_NUM_BUFFERS * ETH_AF_XDP_FRAME_SIZE,
+			       &umem->fq, &umem->cq,
+			       &usr_config);
+
+	if (ret) {
+		AF_XDP_LOG(ERR, "Failed to create umem\n");
+		goto err;
+	}
+	umem->mz = mz;
+
+	return umem;
+
+err:
+	xdp_umem_destroy(umem);
+	return NULL;
+}
+
+static int
+xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
+	      int ring_size)
+{
+	struct xsk_socket_config cfg;
+	struct pkt_tx_queue *txq = rxq->pair;
+	int ret = 0;
+	int reserve_size;
+
+	rxq->umem = xdp_umem_configure();
+	if (rxq->umem == NULL)
+		return -ENOMEM;
+
+	cfg.rx_size = ring_size;
+	cfg.tx_size = ring_size;
+	cfg.libbpf_flags = 0;
+	cfg.xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
+	cfg.bind_flags = 0;
+	ret = xsk_socket__create(&rxq->xsk, internals->if_name,
+			internals->queue_idx, rxq->umem->umem, &rxq->rx,
+			&txq->tx, &cfg);
+	if (ret) {
+		AF_XDP_LOG(ERR, "Failed to create xsk socket.\n");
+		goto err;
+	}
+
+	reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS / 2;
+	ret = reserve_fill_queue(rxq->umem, reserve_size);
+	if (ret) {
+		xsk_socket__delete(rxq->xsk);
+		AF_XDP_LOG(ERR, "Failed to reserve fill queue.\n");
+		goto err;
+	}
+
+	return 0;
+
+err:
+	xdp_umem_destroy(rxq->umem);
+
+	return ret;
+}
+
+static void
+queue_reset(struct pmd_internals *internals, uint16_t queue_idx)
+{
+	struct pkt_rx_queue *rxq = &internals->rx_queues[queue_idx];
+	struct pkt_tx_queue *txq = rxq->pair;
+
+	memset(rxq, 0, sizeof(*rxq));
+	memset(txq, 0, sizeof(*txq));
+	rxq->pair = txq;
+	txq->pair = rxq;
+	rxq->queue_idx = queue_idx;
+	txq->queue_idx = queue_idx;
+}
+
+static int
+eth_rx_queue_setup(struct rte_eth_dev *dev,
+		   uint16_t rx_queue_id,
+		   uint16_t nb_rx_desc,
+		   unsigned int socket_id __rte_unused,
+		   const struct rte_eth_rxconf *rx_conf __rte_unused,
+		   struct rte_mempool *mb_pool)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	uint32_t buf_size, data_size;
+	struct pkt_rx_queue *rxq;
+	int ret;
+
+	rxq = &internals->rx_queues[rx_queue_id];
+	queue_reset(internals, rx_queue_id);
+
+	/* Now get the space available for data in the mbuf */
+	buf_size = rte_pktmbuf_data_room_size(mb_pool) -
+		RTE_PKTMBUF_HEADROOM;
+	data_size = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM;
+
+	if (data_size > buf_size) {
+		AF_XDP_LOG(ERR, "%s: %d bytes will not fit in mbuf (%d bytes)\n",
+			dev->device->name, data_size, buf_size);
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	rxq->mb_pool = mb_pool;
+
+	if (xsk_configure(internals, rxq, nb_rx_desc)) {
+		AF_XDP_LOG(ERR, "Failed to configure xdp socket\n");
+		ret = -EINVAL;
+		goto err;
+	}
+
+	internals->umem = rxq->umem;
+
+	dev->data->rx_queues[rx_queue_id] = rxq;
+	return 0;
+
+err:
+	queue_reset(internals, rx_queue_id);
+	return ret;
+}
+
+static int
+eth_tx_queue_setup(struct rte_eth_dev *dev,
+		   uint16_t tx_queue_id,
+		   uint16_t nb_tx_desc __rte_unused,
+		   unsigned int socket_id __rte_unused,
+		   const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct pkt_tx_queue *txq;
+
+	txq = &internals->tx_queues[tx_queue_id];
+
+	dev->data->tx_queues[tx_queue_id] = txq;
+	return 0;
+}
+
+static int
+eth_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+	struct ifreq ifr = { .ifr_mtu = mtu };
+	int ret;
+	int s;
+
+	s = socket(PF_INET, SOCK_DGRAM, 0);
+	if (s < 0)
+		return -EINVAL;
+
+	strlcpy(ifr.ifr_name, internals->if_name, IFNAMSIZ);
+	ret = ioctl(s, SIOCSIFMTU, &ifr);
+	close(s);
+
+	return (ret < 0) ? -errno : 0;
+}
+
+static void
+eth_dev_change_flags(char *if_name, uint32_t flags, uint32_t mask)
+{
+	struct ifreq ifr;
+	int s;
+
+	s = socket(PF_INET, SOCK_DGRAM, 0);
+	if (s < 0)
+		return;
+
+	strlcpy(ifr.ifr_name, if_name, IFNAMSIZ);
+	if (ioctl(s, SIOCGIFFLAGS, &ifr) < 0)
+		goto out;
+	ifr.ifr_flags &= mask;
+	ifr.ifr_flags |= flags;
+	if (ioctl(s, SIOCSIFFLAGS, &ifr) < 0)
+		goto out;
+out:
+	close(s);
+}
+
+static void
+eth_dev_promiscuous_enable(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+
+	eth_dev_change_flags(internals->if_name, IFF_PROMISC, ~0);
+}
+
+static void
+eth_dev_promiscuous_disable(struct rte_eth_dev *dev)
+{
+	struct pmd_internals *internals = dev->data->dev_private;
+
+	eth_dev_change_flags(internals->if_name, 0, ~IFF_PROMISC);
+}
+
+static const struct eth_dev_ops ops = {
+	.dev_start = eth_dev_start,
+	.dev_stop = eth_dev_stop,
+	.dev_close = eth_dev_close,
+	.dev_configure = eth_dev_configure,
+	.dev_infos_get = eth_dev_info,
+	.mtu_set = eth_dev_mtu_set,
+	.promiscuous_enable = eth_dev_promiscuous_enable,
+	.promiscuous_disable = eth_dev_promiscuous_disable,
+	.rx_queue_setup = eth_rx_queue_setup,
+	.tx_queue_setup = eth_tx_queue_setup,
+	.rx_queue_release = eth_queue_release,
+	.tx_queue_release = eth_queue_release,
+	.link_update = eth_link_update,
+	.stats_get = eth_stats_get,
+	.stats_reset = eth_stats_reset,
+};
+
+/** parse integer from integer argument */
+static int
+parse_integer_arg(const char *key __rte_unused,
+		  const char *value, void *extra_args)
+{
+	int *i = (int *)extra_args;
+	char *end;
+
+	*i = strtol(value, &end, 10);
+	if (*i < 0) {
+		AF_XDP_LOG(ERR, "Argument has to be positive.\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/** parse name argument */
+static int
+parse_name_arg(const char *key __rte_unused,
+	       const char *value, void *extra_args)
+{
+	char *name = extra_args;
+
+	if (strnlen(value, IFNAMSIZ) > IFNAMSIZ - 1) {
+		AF_XDP_LOG(ERR, "Invalid name %s, should be less than %u bytes.\n",
+			   value, IFNAMSIZ);
+		return -EINVAL;
+	}
+
+	strlcpy(name, value, IFNAMSIZ);
+
+	return 0;
+}
+
+static int
+parse_parameters(struct rte_kvargs *kvlist,
+		 char *if_name,
+		 int *queue_idx)
+{
+	int ret;
+
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_IFACE_ARG,
+				 &parse_name_arg, if_name);
+	if (ret < 0)
+		goto free_kvlist;
+
+	ret = rte_kvargs_process(kvlist, ETH_AF_XDP_QUEUE_IDX_ARG,
+				 &parse_integer_arg, queue_idx);
+	if (ret < 0)
+		goto free_kvlist;
+
+free_kvlist:
+	rte_kvargs_free(kvlist);
+	return ret;
+}
+
+static int
+get_iface_info(const char *if_name,
+	       struct ether_addr *eth_addr,
+	       int *if_index)
+{
+	struct ifreq ifr;
+	int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_IP);
+
+	if (sock < 0)
+		return -1;
+
+	strlcpy(ifr.ifr_name, if_name, IFNAMSIZ);
+	if (ioctl(sock, SIOCGIFINDEX, &ifr))
+		goto error;
+
+	*if_index = ifr.ifr_ifindex;
+
+	if (ioctl(sock, SIOCGIFHWADDR, &ifr))
+		goto error;
+
+	rte_memcpy(eth_addr, ifr.ifr_hwaddr.sa_data, ETHER_ADDR_LEN);
+
+	close(sock);
+	return 0;
+
+error:
+	close(sock);
+	return -1;
+}
+
+static struct rte_eth_dev *
+init_internals(struct rte_vdev_device *dev,
+	       const char *if_name,
+	       int queue_idx)
+{
+	const char *name = rte_vdev_device_name(dev);
+	const unsigned int numa_node = dev->device.numa_node;
+	struct pmd_internals *internals;
+	struct rte_eth_dev *eth_dev;
+	int ret;
+	int i;
+
+	internals = rte_zmalloc_socket(name, sizeof(*internals), 0, numa_node);
+	if (internals == NULL)
+		return NULL;
+
+	internals->queue_idx = queue_idx;
+	strlcpy(internals->if_name, if_name, IFNAMSIZ);
+
+	for (i = 0; i < ETH_AF_XDP_MAX_QUEUE_PAIRS; i++) {
+		internals->tx_queues[i].pair = &internals->rx_queues[i];
+		internals->rx_queues[i].pair = &internals->tx_queues[i];
+	}
+
+	ret = get_iface_info(if_name, &internals->eth_addr,
+			     &internals->if_index);
+	if (ret)
+		goto err;
+
+	eth_dev = rte_eth_vdev_allocate(dev, 0);
+	if (eth_dev == NULL)
+		goto err;
+
+	eth_dev->data->dev_private = internals;
+	eth_dev->data->dev_link = pmd_link;
+	eth_dev->data->mac_addrs = &internals->eth_addr;
+	eth_dev->dev_ops = &ops;
+	eth_dev->rx_pkt_burst = eth_af_xdp_rx;
+	eth_dev->tx_pkt_burst = eth_af_xdp_tx;
+
+	return eth_dev;
+
+err:
+	rte_free(internals);
+	return NULL;
+}
+
+static int
+rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
+{
+	struct rte_kvargs *kvlist;
+	char if_name[IFNAMSIZ] = {'\0'};
+	int xsk_queue_idx = ETH_AF_XDP_DFLT_QUEUE_IDX;
+	struct rte_eth_dev *eth_dev = NULL;
+	const char *name;
+
+	AF_XDP_LOG(INFO, "Initializing pmd_af_xdp for %s\n",
+		rte_vdev_device_name(dev));
+
+	name = rte_vdev_device_name(dev);
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+		strlen(rte_vdev_device_args(dev)) == 0) {
+		eth_dev = rte_eth_dev_attach_secondary(name);
+		if (eth_dev == NULL) {
+			AF_XDP_LOG(ERR, "Failed to probe %s\n", name);
+			return -EINVAL;
+		}
+		eth_dev->dev_ops = &ops;
+		rte_eth_dev_probing_finish(eth_dev);
+		return 0;
+	}
+
+	kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_arguments);
+	if (kvlist == NULL) {
+		AF_XDP_LOG(ERR, "Invalid kvargs key\n");
+		return -EINVAL;
+	}
+
+	if (dev->device.numa_node == SOCKET_ID_ANY)
+		dev->device.numa_node = rte_socket_id();
+
+	if (parse_parameters(kvlist, if_name, &xsk_queue_idx) < 0) {
+		AF_XDP_LOG(ERR, "Invalid kvargs value\n");
+		return -EINVAL;
+	}
+
+	if (strlen(if_name) == 0) {
+		AF_XDP_LOG(ERR, "Network interface must be specified\n");
+		return -EINVAL;
+	}
+
+	eth_dev = init_internals(dev, if_name, xsk_queue_idx);
+	if (eth_dev == NULL) {
+		AF_XDP_LOG(ERR, "Failed to init internals\n");
+		return -1;
+	}
+
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
+}
+
+static int
+rte_pmd_af_xdp_remove(struct rte_vdev_device *dev)
+{
+	struct rte_eth_dev *eth_dev = NULL;
+	struct pmd_internals *internals;
+
+	AF_XDP_LOG(INFO, "Removing AF_XDP ethdev on numa socket %u\n",
+		rte_socket_id());
+
+	if (dev == NULL)
+		return -1;
+
+	/* find the ethdev entry */
+	eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+	if (eth_dev == NULL)
+		return -1;
+
+	internals = eth_dev->data->dev_private;
+
+	rte_ring_free(internals->umem->buf_ring);
+	rte_memzone_free(internals->umem->mz);
+	rte_free(internals->umem);
+
+	rte_eth_dev_release_port(eth_dev);
+
+
+	return 0;
+}
+
+static struct rte_vdev_driver pmd_af_xdp_drv = {
+	.probe = rte_pmd_af_xdp_probe,
+	.remove = rte_pmd_af_xdp_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_af_xdp, pmd_af_xdp_drv);
+RTE_PMD_REGISTER_PARAM_STRING(net_af_xdp,
+			      "iface=<string> "
+			      "queue=<int> ");
+
+RTE_INIT(af_xdp_init_log)
+{
+	af_xdp_logtype = rte_log_register("pmd.net.af_xdp");
+	if (af_xdp_logtype >= 0)
+		rte_log_set_level(af_xdp_logtype, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/af_xdp/rte_pmd_af_xdp_version.map b/drivers/net/af_xdp/rte_pmd_af_xdp_version.map
new file mode 100644
index 000000000..c6db030fe
--- /dev/null
+++ b/drivers/net/af_xdp/rte_pmd_af_xdp_version.map
@@ -0,0 +1,3 @@ 
+DPDK_19.05 {
+	local: *;
+};
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 3ecc78cee..1105e72d8 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -2,6 +2,7 @@ 
 # Copyright(c) 2017 Intel Corporation
 
 drivers = ['af_packet',
+	'af_xdp',
 	'ark',
 	'atlantic',
 	'avp',
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 262132fc6..f916bc9ef 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -143,6 +143,7 @@  _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL)  += -lrte_mempool_dpaa2
 endif
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET)  += -lrte_pmd_af_packet
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_XDP)     += -lrte_pmd_af_xdp -lbpf
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ARK_PMD)        += -lrte_pmd_ark
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ATLANTIC_PMD)   += -lrte_pmd_atlantic
 _LDLIBS-$(CONFIG_RTE_LIBRTE_AVP_PMD)        += -lrte_pmd_avp
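
The driver in this patch tracks umem frames by offset rather than by pointer: `xdp_umem_configure()` seeds `buf_ring` with `i * ETH_AF_XDP_FRAME_SIZE + ETH_AF_XDP_DATA_HEADROOM` for each of the `ETH_AF_XDP_NUM_BUFFERS` frames, and the Rx/Tx paths turn an offset back into a pointer with `xsk_umem__get_data(base, addr)`. A minimal standalone sketch of that arithmetic, with no libbpf or DPDK dependency. The names `FRAME_SIZE`, `NUM_FRAMES`, `HEADROOM`, `frame_offset`, `frame_data` and `umem_size` are hypothetical stand-ins, and the 2048-byte frame size is an assumption about the default rather than something stated in the patch:

```c
#include <stdint.h>
#include <stddef.h>

#define FRAME_SIZE 2048u  /* assumed frame size for this sketch */
#define NUM_FRAMES 4096u  /* mirrors ETH_AF_XDP_NUM_BUFFERS */
#define HEADROOM   0u     /* mirrors ETH_AF_XDP_DATA_HEADROOM */

/* Offset of frame i inside the umem area, as seeded into the buffer ring. */
static uint64_t
frame_offset(uint32_t i)
{
	return (uint64_t)i * FRAME_SIZE + HEADROOM;
}

/* Resolve an offset to a pointer inside the umem area (base + offset). */
static void *
frame_data(void *umem_base, uint64_t offset)
{
	return (uint8_t *)umem_base + offset;
}

/* Total bytes the umem area must provide for all frames. */
static size_t
umem_size(void)
{
	return (size_t)NUM_FRAMES * FRAME_SIZE;
}
```

Because only offsets travel through the fill, completion, Rx and Tx rings, the same 64-bit value can be pushed back onto the free ring once the packet has been copied, which is what the patch does with `rte_ring_enqueue(umem->buf_ring, (void *)addr)`.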