[dpdk-dev,v1,11/16] ethdev: refine TPID handling in flow API

Message ID 20180404150312.12304-12-adrien.mazarguil@6wind.com (mailing list archive)
State Superseded, archived
Headers

Checks

Context Check Description
ci/checkpatch warning coding style issues
ci/Intel-compilation fail apply issues

Commit Message

Adrien Mazarguil April 4, 2018, 3:56 p.m. UTC
  TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
consistent with the normal stacking order of pattern items, which is
confusing to applications.

Problem is that when followed by one of these layers, the EtherType field
of the preceding layer keeps its "inner" definition, and the "outer" TPID
is provided by the subsequent layer, the reverse of how a packet looks like
on the wire:

 Wire:     [ ETH TPID = A | VLAN EtherType = B | B DATA ]
 rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]

Worse, when QinQ is involved, the stacking order of VLAN layers is
unspecified. It is unclear whether it should be reversed (innermost to
outermost) as well given TPID applies to the previous layer:

 Wire:       [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
 rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
 rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]

While specifying EtherType/TPID is hopefully rarely necessary, the stacking
order in case of QinQ and the lack of documentation remain an issue.

This patch replaces TPID in the VLAN pattern item with an inner
EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
clarifies documentation and updates all relevant code.

Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
items:

- bnxt: EtherType matching is supported, and vlan->inner_type overrides
  eth->type if the latter has standard TPID value 0x8100, otherwise an
  error is triggered.

- e1000: EtherType matching is only supported with the ETHERTYPE filter,
  which does not support VLAN matching, therefore no impact.

- enic: same as bnxt.

- i40e: same as bnxt with a configurable TPID value for the FDIR filter,
  with existing limitations on allowed EtherType values. The remaining
  filter types (VXLAN, NVGRE, QINQ) do not support EtherType matching.

- ixgbe: same as e1000.

- mlx4: EtherType/TPID matching is not supported, no impact.

- mlx5: same as bnxt.

- mrvl: EtherType matching is supported but eth->type cannot be specified
  when a VLAN item is present. However vlan->inner_type is used if
  specified.

- sfc: same as bnxt with QinQ TPID value 0x88a8 additionally supported.

- tap: same as bnxt.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Cc: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: Thomas Monjalon <thomas@monjalon.net>
Cc: Wenzhuo Lu <wenzhuo.lu@intel.com>
Cc: Jingjing Wu <jingjing.wu@intel.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Cc: John Daley <johndale@cisco.com>
Cc: Hyong Youb Kim <hyonkim@cisco.com>
Cc: Beilei Xing <beilei.xing@intel.com>
Cc: Qi Zhang <qi.z.zhang@intel.com>
Cc: Konstantin Ananyev <konstantin.ananyev@intel.com>
Cc: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Cc: Yongseok Koh <yskoh@mellanox.com>
Cc: Jacek Siuda <jck@semihalf.com>
Cc: Tomasz Duszynski <tdu@semihalf.com>
Cc: Dmitri Epshtein <dima@marvell.com>
Cc: Natalie Samsonov <nsamsono@marvell.com>
Cc: Jianbo Liu <jianbo.liu@arm.com>
Cc: Andrew Rybchenko <arybchenko@solarflare.com>
Cc: Pascal Mazon <pascal.mazon@6wind.com>

---

Hi PMD maintainers, while I'm pretty confident in these changes, I could
not validate them with all devices.

It would be great if you could apply this patch, run testpmd, create VLAN
flow rules with/without inner EtherType as described and send matching
traffic while making sure nothing was broken in the process.

Thanks!
---
 app/test-pmd/cmdline_flow.c                 | 17 +++---
 doc/guides/nics/tap.rst                     |  2 +-
 doc/guides/prog_guide/rte_flow.rst          | 21 +++++--
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +-
 drivers/net/bnxt/bnxt_filter.c              | 38 ++++++++++--
 drivers/net/enic/enic_flow.c                | 21 ++++---
 drivers/net/i40e/i40e_flow.c                | 74 ++++++++++++++++++++----
 drivers/net/mlx5/mlx5_flow.c                | 14 ++++-
 drivers/net/mvpp2/mrvl_flow.c               | 27 +++++++--
 drivers/net/sfc/sfc_flow.c                  | 27 +++++++++
 drivers/net/tap/tap_flow.c                  | 16 +++--
 lib/librte_ether/rte_flow.h                 | 24 +++++---
 12 files changed, 227 insertions(+), 58 deletions(-)
  

Comments

Nélio Laranjeiro April 5, 2018, 9:55 a.m. UTC | #1
On Wed, Apr 04, 2018 at 05:56:49PM +0200, Adrien Mazarguil wrote:
> TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
> consistent with the normal stacking order of pattern items, which is
> confusing to applications.
> 
> Problem is that when followed by one of these layers, the EtherType field
> of the preceding layer keeps its "inner" definition, and the "outer" TPID
> is provided by the subsequent layer, the reverse of how a packet looks like
> on the wire:
> 
>  Wire:     [ ETH TPID = A | VLAN EtherType = B | B DATA ]
>  rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]
> 
> Worse, when QinQ is involved, the stacking order of VLAN layers is
> unspecified. It is unclear whether it should be reversed (innermost to
> outermost) as well given TPID applies to the previous layer:
> 
>  Wire:       [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
>  rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
>  rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]
> 
> While specifying EtherType/TPID is hopefully rarely necessary, the stacking
> order in case of QinQ and the lack of documentation remain an issue.
> 
> This patch replaces TPID in the VLAN pattern item with an inner
> EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
> clarifies documentation and updates all relevant code.
> 
> Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
> items:
> 
> - bnxt: EtherType matching is supported, and vlan->inner_type overrides
>   eth->type if the latter has standard TPID value 0x8100, otherwise an
>   error is triggered.
> 
> - e1000: EtherType matching is only supported with the ETHERTYPE filter,
>   which does not support VLAN matching, therefore no impact.
> 
> - enic: same as bnxt.
> 
> - i40e: same as bnxt with a configurable TPID value for the FDIR filter,
>   with existing limitations on allowed EtherType values. The remaining
>   filter types (VXLAN, NVGRE, QINQ) do not support EtherType matching.
> 
> - ixgbe: same as e1000.
> 
> - mlx4: EtherType/TPID matching is not supported, no impact.
> 
> - mlx5: same as bnxt.
> 
> - mrvl: EtherType matching is supported but eth->type cannot be specified
>   when a VLAN item is present. However vlan->inner_type is used if
>   specified.
> 
> - sfc: same as bnxt with QinQ TPID value 0x88a8 additionally supported.
> 
> - tap: same as bnxt.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Cc: Ferruh Yigit <ferruh.yigit@intel.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>
> Cc: Wenzhuo Lu <wenzhuo.lu@intel.com>
> Cc: Jingjing Wu <jingjing.wu@intel.com>
> Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Cc: Somnath Kotur <somnath.kotur@broadcom.com>
> Cc: John Daley <johndale@cisco.com>
> Cc: Hyong Youb Kim <hyonkim@cisco.com>
> Cc: Beilei Xing <beilei.xing@intel.com>
> Cc: Qi Zhang <qi.z.zhang@intel.com>
> Cc: Konstantin Ananyev <konstantin.ananyev@intel.com>
> Cc: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> Cc: Yongseok Koh <yskoh@mellanox.com>
> Cc: Jacek Siuda <jck@semihalf.com>
> Cc: Tomasz Duszynski <tdu@semihalf.com>
> Cc: Dmitri Epshtein <dima@marvell.com>
> Cc: Natalie Samsonov <nsamsono@marvell.com>
> Cc: Jianbo Liu <jianbo.liu@arm.com>
> Cc: Andrew Rybchenko <arybchenko@solarflare.com>
> Cc: Pascal Mazon <pascal.mazon@6wind.com>
> 
> ---
> 
> Hi PMD maintainers, while I'm pretty confident in these changes, I could
> not validate them with all devices.
> 
> It would be great if you could apply this patch, run testpmd, create VLAN
> flow rules with/without inner EtherType as described and send matching
> traffic while making sure nothing was broken in the process.
> 
> Thanks!
> ---
>  app/test-pmd/cmdline_flow.c                 | 17 +++---
>  doc/guides/nics/tap.rst                     |  2 +-
>  doc/guides/prog_guide/rte_flow.rst          | 21 +++++--
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +-
>  drivers/net/bnxt/bnxt_filter.c              | 38 ++++++++++--
>  drivers/net/enic/enic_flow.c                | 21 ++++---
>  drivers/net/i40e/i40e_flow.c                | 74 ++++++++++++++++++++----
>  drivers/net/mlx5/mlx5_flow.c                | 14 ++++-
>  drivers/net/mvpp2/mrvl_flow.c               | 27 +++++++--
>  drivers/net/sfc/sfc_flow.c                  | 27 +++++++++
>  drivers/net/tap/tap_flow.c                  | 16 +++--
>  lib/librte_ether/rte_flow.h                 | 24 +++++---
>  12 files changed, 227 insertions(+), 58 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 2fbd3d8ef..3a486032d 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -99,11 +99,11 @@ enum index {
>  	ITEM_ETH_SRC,
>  	ITEM_ETH_TYPE,
>  	ITEM_VLAN,
> -	ITEM_VLAN_TPID,
>  	ITEM_VLAN_TCI,
>  	ITEM_VLAN_PCP,
>  	ITEM_VLAN_DEI,
>  	ITEM_VLAN_VID,
> +	ITEM_VLAN_INNER_TYPE,
>  	ITEM_IPV4,
>  	ITEM_IPV4_TOS,
>  	ITEM_IPV4_TTL,
> @@ -505,11 +505,11 @@ static const enum index item_eth[] = {
>  };
>  
>  static const enum index item_vlan[] = {
> -	ITEM_VLAN_TPID,
>  	ITEM_VLAN_TCI,
>  	ITEM_VLAN_PCP,
>  	ITEM_VLAN_DEI,
>  	ITEM_VLAN_VID,
> +	ITEM_VLAN_INNER_TYPE,
>  	ITEM_NEXT,
>  	ZERO,
>  };
> @@ -1142,12 +1142,6 @@ static const struct token token_list[] = {
>  		.next = NEXT(item_vlan),
>  		.call = parse_vc,
>  	},
> -	[ITEM_VLAN_TPID] = {
> -		.name = "tpid",
> -		.help = "tag protocol identifier",
> -		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
> -		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tpid)),
> -	},
>  	[ITEM_VLAN_TCI] = {
>  		.name = "tci",
>  		.help = "tag control information",
> @@ -1175,6 +1169,13 @@ static const struct token token_list[] = {
>  		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_vlan,
>  						  tci, "\x0f\xff")),
>  	},
> +	[ITEM_VLAN_INNER_TYPE] = {
> +		.name = "inner_type",
> +		.help = "inner EtherType",
> +		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
> +		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan,
> +					     inner_type)),
> +	},
>  	[ITEM_IPV4] = {
>  		.name = "ipv4",
>  		.help = "match IPv4 header",
> diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst
> index 76eb0bde4..bcf3efe9e 100644
> --- a/doc/guides/nics/tap.rst
> +++ b/doc/guides/nics/tap.rst
> @@ -97,7 +97,7 @@ The kernel support can be checked with this command::
>  Supported items:
>  
>  - eth: src and dst (with variable masks), and eth_type (0xffff mask).
> -- vlan: vid, pcp, tpid, but not eid. (requires kernel 4.9)
> +- vlan: vid, pcp, but not eid. (requires kernel 4.9)
>  - ipv4/6: src and dst (with variable masks), and ip_proto (0xffff mask).
>  - udp/tcp: src and dst port (0xffff) mask.
>  
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index c893d737a..f1b9d8d76 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -784,9 +784,15 @@ Item: ``ETH``
>  
>  Matches an Ethernet header.
>  
> +The ``type`` field either stands for "EtherType" or "TPID" when followed by
> +so-called layer 2.5 pattern items such as ``RTE_FLOW_ITEM_TYPE_VLAN``. In
> +the latter case, ``type`` refers to that of the outer header, with the inner
> +EtherType/TPID provided by the subsequent pattern item. This is the same
> +order as on the wire.
> +
>  - ``dst``: destination MAC.
>  - ``src``: source MAC.
> -- ``type``: EtherType.
> +- ``type``: EtherType or TPID.
>  - Default ``mask`` matches destination and source addresses only.
>  
>  Item: ``VLAN``
> @@ -794,9 +800,13 @@ Item: ``VLAN``
>  
>  Matches an 802.1Q/ad VLAN tag.
>  
> -- ``tpid``: tag protocol identifier.
> +The corresponding standard outer EtherType (TPID) values are ``0x8100`` or
> +``0x88a8`` (in case of outer QinQ). It can be overridden by the preceding
> +pattern item.
> +
>  - ``tci``: tag control information.
> -- Default ``mask`` matches TCI only.
> +- ``inner_type``: inner EtherType or TPID.
> +- Default ``mask`` matches the VID part of TCI only (lower 12 bits).
>  
>  Item: ``IPV4``
>  ^^^^^^^^^^^^^^
> @@ -866,12 +876,15 @@ Item: ``E_TAG``
>  
>  Matches an IEEE 802.1BR E-Tag header.
>  
> -- ``tpid``: tag protocol identifier (0x893F)
> +The corresponding standard outer EtherType (TPID) value is 0x893f.
> +It can be overridden by the preceding pattern item.
> +
>  - ``epcp_edei_in_ecid_b``: E-Tag control information (E-TCI), E-PCP (3b),
>    E-DEI (1b), ingress E-CID base (12b).
>  - ``rsvd_grp_ecid_b``: reserved (2b), GRP (2b), E-CID base (12b).
>  - ``in_ecid_e``: ingress E-CID ext.
>  - ``ecid_e``: E-CID ext.
> +- ``inner_type``: inner EtherType or TPID.
>  - Default ``mask`` simultaneously matches GRP and E-CID base.
>  
>  Item: ``NVGRE``
> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> index 738461f44..25fac8430 100644
> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> @@ -3223,15 +3223,15 @@ This section lists supported pattern items and their attributes, if any.
>  
>    - ``dst {MAC-48}``: destination MAC.
>    - ``src {MAC-48}``: source MAC.
> -  - ``type {unsigned}``: EtherType.
> +  - ``type {unsigned}``: EtherType or TPID.
>  
>  - ``vlan``: match 802.1Q/ad VLAN tag.
>  
> -  - ``tpid {unsigned}``: tag protocol identifier.
>    - ``tci {unsigned}``: tag control information.
>    - ``pcp {unsigned}``: priority code point.
>    - ``dei {unsigned}``: drop eligible indicator.
>    - ``vid {unsigned}``: VLAN identifier.
> +  - ``inner_type {unsigned}``: inner EtherType or TPID.
>  
>  - ``ipv4``: match IPv4 header.
>  
> diff --git a/drivers/net/bnxt/bnxt_filter.c b/drivers/net/bnxt/bnxt_filter.c
> index 0d735edf6..a97a8dd46 100644
> --- a/drivers/net/bnxt/bnxt_filter.c
> +++ b/drivers/net/bnxt/bnxt_filter.c
> @@ -327,6 +327,7 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp,
>  	uint32_t vf = 0;
>  	int use_ntuple;
>  	uint32_t en = 0;
> +	uint32_t en_ethertype;
>  	int dflt_vnic;
>  
>  	use_ntuple = bnxt_filter_type_check(pattern, error);
> @@ -336,6 +337,9 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp,
>  
>  	filter->filter_type = use_ntuple ?
>  		HWRM_CFA_NTUPLE_FILTER : HWRM_CFA_EM_FILTER;
> +	en_ethertype = use_ntuple ?
> +		NTUPLE_FLTR_ALLOC_INPUT_EN_ETHERTYPE :
> +		EM_FLOW_ALLOC_INPUT_EN_ETHERTYPE;
>  
>  	while (item->type != RTE_FLOW_ITEM_TYPE_END) {
>  		if (item->last) {
> @@ -405,30 +409,52 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp,
>  			if (eth_mask->type) {
>  				filter->ethertype =
>  					rte_be_to_cpu_16(eth_spec->type);
> -				en |= use_ntuple ?
> -					NTUPLE_FLTR_ALLOC_INPUT_EN_ETHERTYPE :
> -					EM_FLOW_ALLOC_INPUT_EN_ETHERTYPE;
> +				en |= en_ethertype;
>  			}
>  
>  			break;
>  		case RTE_FLOW_ITEM_TYPE_VLAN:
>  			vlan_spec = item->spec;
>  			vlan_mask = item->mask;
> +			if (en & en_ethertype &&
> +			    filter->ethertype != RTE_BE16(0x8100)) {
> +				rte_flow_error_set(error, EINVAL,
> +						   RTE_FLOW_ERROR_TYPE_ITEM,
> +						   item,
> +						   "unsupported outer TPID");
> +				return -rte_errno;
> +			}
>  			if (vlan_mask->tci &&
> -			    vlan_mask->tci == RTE_BE16(0x0fff) &&
> -			    !vlan_mask->tpid) {
> +			    vlan_mask->tci == RTE_BE16(0x0fff)) {
>  				/* Only the VLAN ID can be matched. */
>  				filter->l2_ovlan =
>  					rte_be_to_cpu_16(vlan_spec->tci &
>  							 RTE_BE16(0x0fff));
>  				en |= EM_FLOW_ALLOC_INPUT_EN_OVLAN_VID;
> -			} else if (vlan_mask->tci || vlan_mask->tpid) {
> +			} else if (vlan_mask->tci) {
>  				rte_flow_error_set(error, EINVAL,
>  						   RTE_FLOW_ERROR_TYPE_ITEM,
>  						   item,
>  						   "VLAN mask is invalid");
>  				return -rte_errno;
>  			}
> +			if (vlan_mask->inner_type &&
> +			    vlan_mask->inner_type != RTE_BE16(0xffff)) {
> +				rte_flow_error_set(error, EINVAL,
> +						   RTE_FLOW_ERROR_TYPE_ITEM,
> +						   item,
> +						   "inner ethertype mask not"
> +						   " valid");
> +				return -rte_errno;
> +			}
> +			if (vlan_mask->inner_type) {
> +				filter->ethertype =
> +					rte_be_to_cpu_16(vlan_spec->inner_type);
> +				en |= en_ethertype;
> +			} else {
> +				filter->ethertype = RTE_BE16(0x0000);
> +				en &= ~en_ethertype;
> +			}
>  
>  			break;
>  		case RTE_FLOW_ITEM_TYPE_IPV4:
> diff --git a/drivers/net/enic/enic_flow.c b/drivers/net/enic/enic_flow.c
> index c5c98b870..a413740ab 100644
> --- a/drivers/net/enic/enic_flow.c
> +++ b/drivers/net/enic/enic_flow.c
> @@ -4,6 +4,7 @@
>  
>  #include <errno.h>
>  #include <stdint.h>
> +#include <rte_byteorder.h>
>  #include <rte_log.h>
>  #include <rte_ethdev_driver.h>
>  #include <rte_flow_driver.h>
> @@ -545,16 +546,22 @@ enic_copy_item_vlan_v2(const struct rte_flow_item *item,
>  	if (!spec)
>  		return 0;
>  
> -	/* Don't support filtering in tpid */
> -	if (mask) {
> -		if (mask->tpid != 0)
> -			return ENOTSUP;
> -	} else {
> +	if (!mask)
>  		mask = &rte_flow_item_vlan_mask;
> -		RTE_ASSERT(mask->tpid == 0);
> -	}
>  
>  	if (*inner_ofst == 0) {
> +		struct ether_hdr *eth_mask =
> +			(void *)gp->layer[FILTER_GENERIC_1_L2].mask;
> +		struct ether_hdr *eth_val =
> +			(void *)gp->layer[FILTER_GENERIC_1_L2].val;
> +
> +		/* Exactly one TPID value is allowed if specified */
> +		if ((eth_val->ether_type & eth_mask->ether_type) !=
> +		    (RTE_BE16(0x8100) & eth_mask->ether_type))
> +			return ENOTSUP;
> +		eth_mask->ether_type = mask->inner_type;
> +		eth_val->ether_type = spec->inner_type;
> +
>  		/* Outer header. Use the vlan mask/val fields */
>  		gp->mask_vlan = mask->tci;
>  		gp->val_vlan = spec->tci;
> diff --git a/drivers/net/i40e/i40e_flow.c b/drivers/net/i40e/i40e_flow.c
> index 1b336df74..c6dd889ad 100644
> --- a/drivers/net/i40e/i40e_flow.c
> +++ b/drivers/net/i40e/i40e_flow.c
> @@ -2490,24 +2490,36 @@ i40e_flow_parse_fdir_pattern(struct rte_eth_dev *dev,
>  						      "Invalid MAC_addr mask.");
>  					return -rte_errno;
>  				}
> +			}
> +			if (eth_spec && eth_mask && eth_mask->type) {
> +				enum rte_flow_item_type next = (item + 1)->type;
>  
> -				if ((eth_mask->type & UINT16_MAX) ==
> -				    UINT16_MAX) {
> -					input_set |= I40E_INSET_LAST_ETHER_TYPE;
> -					filter->input.flow.l2_flow.ether_type =
> -						eth_spec->type;
> +				if (eth_mask->type != RTE_BE16(0xffff)) {
> +					rte_flow_error_set(error, EINVAL,
> +						      RTE_FLOW_ERROR_TYPE_ITEM,
> +						      item,
> +						      "Invalid type mask.");
> +					return -rte_errno;
>  				}
>  
>  				ether_type = rte_be_to_cpu_16(eth_spec->type);
> -				if (ether_type == ETHER_TYPE_IPv4 ||
> -				    ether_type == ETHER_TYPE_IPv6 ||
> -				    ether_type == ETHER_TYPE_ARP ||
> -				    ether_type == outer_tpid) {
> +
> +				if ((next == RTE_FLOW_ITEM_TYPE_VLAN &&
> +				     ether_type != outer_tpid) ||
> +				    (next != RTE_FLOW_ITEM_TYPE_VLAN &&
> +				     (ether_type == ETHER_TYPE_IPv4 ||
> +				      ether_type == ETHER_TYPE_IPv6 ||
> +				      ether_type == ETHER_TYPE_ARP ||
> +				      ether_type == outer_tpid))) {
>  					rte_flow_error_set(error, EINVAL,
>  						     RTE_FLOW_ERROR_TYPE_ITEM,
>  						     item,
>  						     "Unsupported ether_type.");
>  					return -rte_errno;
> +				} else if (next != RTE_FLOW_ITEM_TYPE_VLAN) {
> +					input_set |= I40E_INSET_LAST_ETHER_TYPE;
> +					filter->input.flow.l2_flow.ether_type =
> +						eth_spec->type;
>  				}
>  			}
>  
> @@ -2518,6 +2530,14 @@ i40e_flow_parse_fdir_pattern(struct rte_eth_dev *dev,
>  		case RTE_FLOW_ITEM_TYPE_VLAN:
>  			vlan_spec = item->spec;
>  			vlan_mask = item->mask;
> +
> +			if (input_set & I40E_INSET_LAST_ETHER_TYPE) {
> +				rte_flow_error_set(error, EINVAL,
> +						   RTE_FLOW_ERROR_TYPE_ITEM,
> +						   item,
> +						   "Unsupported outer TPID.");
> +				return -rte_errno;
> +			}
>  			if (vlan_spec && vlan_mask) {
>  				if (vlan_mask->tci ==
>  				    rte_cpu_to_be_16(I40E_TCI_MASK)) {
> @@ -2526,6 +2546,33 @@ i40e_flow_parse_fdir_pattern(struct rte_eth_dev *dev,
>  						vlan_spec->tci;
>  				}
>  			}
> +			if (vlan_spec && vlan_mask && vlan_mask->inner_type) {
> +				if (vlan_mask->inner_type != RTE_BE16(0xffff)) {
> +					rte_flow_error_set(error, EINVAL,
> +						      RTE_FLOW_ERROR_TYPE_ITEM,
> +						      item,
> +						      "Invalid inner_type"
> +						      " mask.");
> +					return -rte_errno;
> +				}
> +
> +				ether_type =
> +					rte_be_to_cpu_16(vlan_spec->inner_type);
> +
> +				if (ether_type == ETHER_TYPE_IPv4 ||
> +				    ether_type == ETHER_TYPE_IPv6 ||
> +				    ether_type == ETHER_TYPE_ARP ||
> +				    ether_type == outer_tpid) {
> +					rte_flow_error_set(error, EINVAL,
> +						     RTE_FLOW_ERROR_TYPE_ITEM,
> +						     item,
> +						     "Unsupported inner_type.");
> +					return -rte_errno;
> +				}
> +				input_set |= I40E_INSET_LAST_ETHER_TYPE;
> +				filter->input.flow.l2_flow.ether_type =
> +					vlan_spec->inner_type;
> +			}
>  
>  			pctype = I40E_FILTER_PCTYPE_L2_PAYLOAD;
>  			layer_idx = I40E_FLXPLD_L2_IDX;
> @@ -3284,7 +3331,8 @@ i40e_flow_parse_vxlan_pattern(__rte_unused struct rte_eth_dev *dev,
>  		case RTE_FLOW_ITEM_TYPE_VLAN:
>  			vlan_spec = item->spec;
>  			vlan_mask = item->mask;
> -			if (!(vlan_spec && vlan_mask)) {
> +			if (!(vlan_spec && vlan_mask) ||
> +			    vlan_mask->inner_type) {
>  				rte_flow_error_set(error, EINVAL,
>  						   RTE_FLOW_ERROR_TYPE_ITEM,
>  						   item,
> @@ -3514,7 +3562,8 @@ i40e_flow_parse_nvgre_pattern(__rte_unused struct rte_eth_dev *dev,
>  		case RTE_FLOW_ITEM_TYPE_VLAN:
>  			vlan_spec = item->spec;
>  			vlan_mask = item->mask;
> -			if (!(vlan_spec && vlan_mask)) {
> +			if (!(vlan_spec && vlan_mask) ||
> +			    vlan_mask->inner_type) {
>  				rte_flow_error_set(error, EINVAL,
>  						   RTE_FLOW_ERROR_TYPE_ITEM,
>  						   item,
> @@ -4022,7 +4071,8 @@ i40e_flow_parse_qinq_pattern(__rte_unused struct rte_eth_dev *dev,
>  			vlan_spec = item->spec;
>  			vlan_mask = item->mask;
>  
> -			if (!(vlan_spec && vlan_mask)) {
> +			if (!(vlan_spec && vlan_mask) ||
> +			    vlan_mask->inner_type) {
>  				rte_flow_error_set(error, EINVAL,
>  					   RTE_FLOW_ERROR_TYPE_ITEM,
>  					   item,
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index bc1176819..f03413d32 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -17,6 +17,7 @@
>  #pragma GCC diagnostic error "-Wpedantic"
>  #endif
>  
> +#include <rte_byteorder.h>
>  #include <rte_common.h>
>  #include <rte_eth_ctrl.h>
>  #include <rte_ethdev_driver.h>
> @@ -306,6 +307,7 @@ static const struct mlx5_flow_items mlx5_flow_items[] = {
>  		.actions = valid_actions,
>  		.mask = &(const struct rte_flow_item_vlan){
>  			.tci = -1,
> +			.inner_type = -1,
>  		},
>  		.default_mask = &rte_flow_item_vlan_mask,
>  		.mask_sz = sizeof(struct rte_flow_item_vlan),
> @@ -1285,6 +1287,7 @@ mlx5_flow_create_vlan(const struct rte_flow_item *item,
>  	struct mlx5_flow_parse *parser = data->parser;
>  	struct ibv_flow_spec_eth *eth;
>  	const unsigned int eth_size = sizeof(struct ibv_flow_spec_eth);
> +	const char *msg = "VLAN cannot be empty";
>  
>  	if (spec) {
>  		unsigned int i;
> @@ -1306,12 +1309,21 @@ mlx5_flow_create_vlan(const struct rte_flow_item *item,
>  			 */
>  			if (!eth->mask.vlan_tag)
>  				goto error;
> +			/* Exactly one TPID value is allowed if specified. */
> +			if ((eth->val.ether_type & eth->mask.ether_type) !=
> +			    (RTE_BE16(0x8100) & eth->mask.ether_type)) {

You can use ETHER_TYPE_VLAN present in rte_ether.h instead of hard coded
values.

> +				msg = "unsupported outer TPID";
> +				goto error;
> +			}
> +			eth->val.ether_type = spec->inner_type;
> +			eth->mask.ether_type = mask->inner_type;
> +			eth->val.ether_type &= eth->mask.ether_type;
>  		}
>  		return 0;
>  	}
>  error:
>  	return rte_flow_error_set(data->error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM,
> -				  item, "VLAN cannot be empty");
> +				  item, msg);
>  }
>  
>  /**
> diff --git a/drivers/net/mvpp2/mrvl_flow.c b/drivers/net/mvpp2/mrvl_flow.c
> index 8fd4dbfb1..6604a411f 100644
> --- a/drivers/net/mvpp2/mrvl_flow.c
> +++ b/drivers/net/mvpp2/mrvl_flow.c
> @@ -1091,12 +1091,6 @@ mrvl_parse_vlan(const struct rte_flow_item *item,
>  	if (ret)
>  		return ret;
>  
> -	if (mask->tpid) {
> -		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM,
> -				   NULL, "Not supported by classifier\n");
> -		return -rte_errno;
> -	}
> -
>  	m = rte_be_to_cpu_16(mask->tci);
>  	if (m & MRVL_VLAN_ID_MASK) {
>  		RTE_LOG(WARNING, PMD, "vlan id mask is ignored\n");
> @@ -1112,6 +1106,27 @@ mrvl_parse_vlan(const struct rte_flow_item *item,
>  			goto out;
>  	}
>  
> +	if (flow->pattern & F_TYPE) {
> +		rte_flow_error_set(error, ENOTSUP,
> +				   RTE_FLOW_ERROR_TYPE_ITEM, item,
> +				   "outer TPID cannot be explicitly matched"
> +				   " when VLAN item is also specified\n");
> +		return -rte_errno;
> +	}
> +	if (mask->inner_type) {
> +		struct rte_flow_item_eth spec_eth = {
> +			.type = spec->inner_type,
> +		};
> +		struct rte_flow_item_eth mask_eth = {
> +			.type = mask->inner_type,
> +		};
> +
> +		RTE_LOG(WARNING, PMD, "inner eth type mask is ignored\n");
> +		ret = mrvl_parse_type(spec_eth, mask_eth, flow);
> +		if (ret)
> +			goto out;
> +	}
> +
>  	return 0;
>  out:
>  	rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
> diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
> index bf9609735..515a61325 100644
> --- a/drivers/net/sfc/sfc_flow.c
> +++ b/drivers/net/sfc/sfc_flow.c
> @@ -352,6 +352,7 @@ sfc_flow_parse_vlan(const struct rte_flow_item *item,
>  	const struct rte_flow_item_vlan *mask = NULL;
>  	const struct rte_flow_item_vlan supp_mask = {
>  		.tci = rte_cpu_to_be_16(ETH_VLAN_ID_MAX),
> +		.inner_type = RTE_BE16(0xffff),
>  	};
>  
>  	rc = sfc_flow_parse_init(item,
> @@ -394,6 +395,32 @@ sfc_flow_parse_vlan(const struct rte_flow_item *item,
>  		return -rte_errno;
>  	}
>  
> +	/*
> +	 * If an EtherType was already specified, make sure it is a valid
> +	 * TPID for the current VLAN layer before overwriting it with the
> +	 * specified inner type.
> +	 */
> +	if (efx_spec->efs_match_flags & EFX_FILTER_MATCH_ETHER_TYPE &&
> +	    efx_spec->efs_ether_type != RTE_BE16(0x8100) &&
> +	    efx_spec->efs_ether_type != RTE_BE16(0x88a8)) {
> +		rte_flow_error_set(error, EINVAL,
> +				   RTE_FLOW_ERROR_TYPE_ITEM, item,
> +				   "Unsupported outer TPID");
> +		return -rte_errno;
> +	}
> +	if (!mask->inner_type) {
> +		efx_spec->efs_match_flags &= ~EFX_FILTER_MATCH_ETHER_TYPE;
> +		efx_spec->efs_ether_type = RTE_BE16(0x0000);
> +	} else if (mask->inner_type == supp_mask.inner_type) {
> +		efx_spec->efs_match_flags |= EFX_FILTER_MATCH_ETHER_TYPE;
> +		efx_spec->efs_ether_type = rte_be_to_cpu_16(spec->inner_type);
> +	} else {
> +		rte_flow_error_set(error, EINVAL,
> +				   RTE_FLOW_ERROR_TYPE_ITEM, item,
> +				   "Bad mask for VLAN inner_type");
> +		return -rte_errno;
> +	}
> +
>  	return 0;
>  }
>  
> diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
> index e5eb50fc5..e53eff6ce 100644
> --- a/drivers/net/tap/tap_flow.c
> +++ b/drivers/net/tap/tap_flow.c
> @@ -270,13 +270,13 @@ static const struct tap_flow_items tap_flow_items[] = {
>  		.items = ITEMS(RTE_FLOW_ITEM_TYPE_IPV4,
>  			       RTE_FLOW_ITEM_TYPE_IPV6),
>  		.mask = &(const struct rte_flow_item_vlan){
> -			.tpid = -1,
>  			/* DEI matching is not supported */
>  #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
>  			.tci = 0xffef,
>  #else
>  			.tci = 0xefff,
>  #endif
> +			.inner_type = -1,
>  		},
>  		.mask_sz = sizeof(struct rte_flow_item_vlan),
>  		.default_mask = &rte_flow_item_vlan_mask,
> @@ -578,13 +578,21 @@ tap_flow_create_vlan(const struct rte_flow_item *item, void *data)
>  	/* use default mask if none provided */
>  	if (!mask)
>  		mask = tap_flow_items[RTE_FLOW_ITEM_TYPE_VLAN].default_mask;
> -	/* TC does not support tpid masking. Only accept if exact match. */
> -	if (mask->tpid && mask->tpid != 0xffff)
> +	/* check that previous eth type is compatible with VLAN */
> +	if (info->eth_type && info->eth_type != RTE_BE16(ETH_P_8021Q))
>  		return -1;
>  	/* Double-tagging not supported. */
> -	if (spec && mask->tpid && spec->tpid != htons(ETH_P_8021Q))
> +	if (info->vlan)
>  		return -1;
>  	info->vlan = 1;
> +	if (mask->inner_type) {
> +		/* TC does not support partial eth_type masking */
> +		if (mask->inner_type != RTE_BE16(0xffff))
> +			return -1;
> +		info->eth_type = spec->inner_type;
> +	} else {
> +		info->eth_type = 0;
> +	}
>  	if (!flow)
>  		return 0;
>  	msg = &flow->msg;
> diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
> index 1b222ba60..15e383f95 100644
> --- a/lib/librte_ether/rte_flow.h
> +++ b/lib/librte_ether/rte_flow.h
> @@ -454,11 +454,17 @@ static const struct rte_flow_item_raw rte_flow_item_raw_mask = {
>   * RTE_FLOW_ITEM_TYPE_ETH
>   *
>   * Matches an Ethernet header.
> + *
> + * The @p type field either stands for "EtherType" or "TPID" when followed
> + * by so-called layer 2.5 pattern items such as RTE_FLOW_ITEM_TYPE_VLAN. In
> + * the latter case, @p type refers to that of the outer header, with the
> + * inner EtherType/TPID provided by the subsequent pattern item. This is the
> + * same order as on the wire.
>   */
>  struct rte_flow_item_eth {
>  	struct ether_addr dst; /**< Destination MAC. */
>  	struct ether_addr src; /**< Source MAC. */
> -	rte_be16_t type; /**< EtherType. */
> +	rte_be16_t type; /**< EtherType or TPID. */
>  };
>  
>  /** Default mask for RTE_FLOW_ITEM_TYPE_ETH. */
> @@ -475,19 +481,20 @@ static const struct rte_flow_item_eth rte_flow_item_eth_mask = {
>   *
>   * Matches an 802.1Q/ad VLAN tag.
>   *
> - * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
> - * RTE_FLOW_ITEM_TYPE_VLAN.
> + * The corresponding standard outer EtherType (TPID) values are 0x8100 or
> + * 0x88a8 (in case of outer QinQ). It can be overridden by the preceding
> + * pattern item.
>   */
>  struct rte_flow_item_vlan {
> -	rte_be16_t tpid; /**< Tag protocol identifier. */
>  	rte_be16_t tci; /**< Tag control information. */
> +	rte_be16_t inner_type; /**< Inner EtherType or TPID. */
>  };
>  
>  /** Default mask for RTE_FLOW_ITEM_TYPE_VLAN. */
>  #ifndef __cplusplus
>  static const struct rte_flow_item_vlan rte_flow_item_vlan_mask = {
> -	.tpid = RTE_BE16(0x0000),
> -	.tci = RTE_BE16(0xffff),
> +	.tci = RTE_BE16(0x0fff),
> +	.inner_type = RTE_BE16(0x0000),
>  };
>  #endif
>  
> @@ -636,9 +643,11 @@ static const struct rte_flow_item_vxlan rte_flow_item_vxlan_mask = {
>   * RTE_FLOW_ITEM_TYPE_E_TAG.
>   *
>   * Matches a E-tag header.
> + *
> + * The corresponding standard outer EtherType (TPID) value is 0x893f.
> + * It can be overridden by the preceding pattern item.
>   */
>  struct rte_flow_item_e_tag {
> -	rte_be16_t tpid; /**< Tag protocol identifier (0x893F). */
>  	/**
>  	 * E-Tag control information (E-TCI).
>  	 * E-PCP (3b), E-DEI (1b), ingress E-CID base (12b).
> @@ -648,6 +657,7 @@ struct rte_flow_item_e_tag {
>  	rte_be16_t rsvd_grp_ecid_b;
>  	uint8_t in_ecid_e; /**< Ingress E-CID ext. */
>  	uint8_t ecid_e; /**< E-CID ext. */
> +	rte_be16_t inner_type; /**< Inner EtherType or TPID. */
>  };
>  
>  /** Default mask for RTE_FLOW_ITEM_TYPE_E_TAG. */
> -- 
> 2.11.0
  
Adrien Mazarguil April 5, 2018, 12:02 p.m. UTC | #2
On Thu, Apr 05, 2018 at 11:55:10AM +0200, Nélio Laranjeiro wrote:
> On Wed, Apr 04, 2018 at 05:56:49PM +0200, Adrien Mazarguil wrote:
> > TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
> > consistent with the normal stacking order of pattern items, which is
> > confusing to applications.
> > 
> > Problem is that when followed by one of these layers, the EtherType field
> > of the preceding layer keeps its "inner" definition, and the "outer" TPID
> > is provided by the subsequent layer, the reverse of how a packet looks like
> > on the wire:
> > 
> >  Wire:     [ ETH TPID = A | VLAN EtherType = B | B DATA ]
> >  rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]
> > 
> > Worse, when QinQ is involved, the stacking order of VLAN layers is
> > unspecified. It is unclear whether it should be reversed (innermost to
> > outermost) as well given TPID applies to the previous layer:
> > 
> >  Wire:       [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
> >  rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
> >  rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]
> > 
> > While specifying EtherType/TPID is hopefully rarely necessary, the stacking
> > order in case of QinQ and the lack of documentation remain an issue.
> > 
> > This patch replaces TPID in the VLAN pattern item with an inner
> > EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
> > clarifies documentation and updates all relevant code.
> > 
> > Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
> > items:
> > 
> > - bnxt: EtherType matching is supported, and vlan->inner_type overrides
> >   eth->type if the latter has standard TPID value 0x8100, otherwise an
> >   error is triggered.
> > 
> > - e1000: EtherType matching is only supported with the ETHERTYPE filter,
> >   which does not support VLAN matching, therefore no impact.
> > 
> > - enic: same as bnxt.
> > 
> > - i40e: same as bnxt with a configurable TPID value for the FDIR filter,
> >   with existing limitations on allowed EtherType values. The remaining
> >   filter types (VXLAN, NVGRE, QINQ) do not support EtherType matching.
> > 
> > - ixgbe: same as e1000.
> > 
> > - mlx4: EtherType/TPID matching is not supported, no impact.
> > 
> > - mlx5: same as bnxt.
> > 
> > - mrvl: EtherType matching is supported but eth->type cannot be specified
> >   when a VLAN item is present. However vlan->inner_type is used if
> >   specified.
> > 
> > - sfc: same as bnxt with QinQ TPID value 0x88a8 additionally supported.
> > 
> > - tap: same as bnxt.
> > 
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
<snip>
> > @@ -1306,12 +1309,21 @@ mlx5_flow_create_vlan(const struct rte_flow_item *item,
> >  			 */
> >  			if (!eth->mask.vlan_tag)
> >  				goto error;
> > +			/* Exactly one TPID value is allowed if specified. */
> > +			if ((eth->val.ether_type & eth->mask.ether_type) !=
> > +			    (RTE_BE16(0x8100) & eth->mask.ether_type)) {
> 
> You can use ETHER_TYPE_VLAN present in rte_ether.h instead of hard coded
> values.

Good catch, I forgot about those macros. I'll make the change in all
impacted code in the next revision.
  
Qi Zhang April 5, 2018, 2:20 p.m. UTC | #3
Hi Adrien:

> Hi PMD maintainers, while I'm pretty confident in these changes, I could not
> validate them with all devices.
> 
> It would be great if you could apply this patch, run testpmd, create VLAN flow
> rules with/without inner EtherType as described and send matching traffic
> while making sure nothing was broken in the process.
> 
> Thanks!
> ---
> diff --git a/drivers/net/i40e/i40e_flow.c b/drivers/net/i40e/i40e_flow.c index
> 1b336df74..c6dd889ad 100644
> --- a/drivers/net/i40e/i40e_flow.c
> +++ b/drivers/net/i40e/i40e_flow.c
> @@ -2490,24 +2490,36 @@ i40e_flow_parse_fdir_pattern(struct rte_eth_dev
> *dev,
>  						      "Invalid MAC_addr mask.");
>  					return -rte_errno;
>  				}
> +			}
> +			if (eth_spec && eth_mask && eth_mask->type) {
> +				enum rte_flow_item_type next = (item + 1)->type;
> 
> -				if ((eth_mask->type & UINT16_MAX) ==
> -				    UINT16_MAX) {
> -					input_set |= I40E_INSET_LAST_ETHER_TYPE;
> -					filter->input.flow.l2_flow.ether_type =
> -						eth_spec->type;
> +				if (eth_mask->type != RTE_BE16(0xffff)) {
> +					rte_flow_error_set(error, EINVAL,
> +						      RTE_FLOW_ERROR_TYPE_ITEM,
> +						      item,
> +						      "Invalid type mask.");
> +					return -rte_errno;
>  				}
> 
>  				ether_type = rte_be_to_cpu_16(eth_spec->type);
> -				if (ether_type == ETHER_TYPE_IPv4 ||
> -				    ether_type == ETHER_TYPE_IPv6 ||
> -				    ether_type == ETHER_TYPE_ARP ||
> -				    ether_type == outer_tpid) {
> +
> +				if ((next == RTE_FLOW_ITEM_TYPE_VLAN &&
> +				     ether_type != outer_tpid) ||
> +				    (next != RTE_FLOW_ITEM_TYPE_VLAN &&
> +				     (ether_type == ETHER_TYPE_IPv4 ||
> +				      ether_type == ETHER_TYPE_IPv6 ||
> +				      ether_type == ETHER_TYPE_ARP ||
> +				      ether_type == outer_tpid))) {
>  					rte_flow_error_set(error, EINVAL,
>  						     RTE_FLOW_ERROR_TYPE_ITEM,
>  						     item,
>  						     "Unsupported ether_type.");
>  					return -rte_errno;
> +				} else if (next != RTE_FLOW_ITEM_TYPE_VLAN) {
> +					input_set |= I40E_INSET_LAST_ETHER_TYPE;
> +					filter->input.flow.l2_flow.ether_type =
> +						eth_spec->type;
>  				}
>  			}
> 
> @@ -2518,6 +2530,14 @@ i40e_flow_parse_fdir_pattern(struct rte_eth_dev
> *dev,
>  		case RTE_FLOW_ITEM_TYPE_VLAN:
>  			vlan_spec = item->spec;
>  			vlan_mask = item->mask;
> +
> +			if (input_set & I40E_INSET_LAST_ETHER_TYPE) {
> +				rte_flow_error_set(error, EINVAL,
> +						   RTE_FLOW_ERROR_TYPE_ITEM,
> +						   item,
> +						   "Unsupported outer TPID.");
> +				return -rte_errno;
> +			}

Please correct me I'm wrong, now I40E_INSET_LAST_ETHER_TYPE will only be set when next != RTE_FLOW_ITEM_TYPE_VLAN
while, RTE_FLOW_ITEM_TYPE_VLAN will only be the next to RTE_FLOW_ITEM_TYPE_ETH for fdir, so above check seems unnecessary ?

Thanks
Qi

>  			if (vlan_spec && vlan_mask) {
>  				if (vlan_mask->tci ==
>  				    rte_cpu_to_be_16(I40E_TCI_MASK)) { @@ -2526,6
> +2546,33 @@ i40e_flow_parse_fdir_pattern(struct rte_eth_dev *dev,
>  						vlan_spec->tci;
>  				}
>  			}
> +			if (vlan_spec && vlan_mask && vlan_mask->inner_type) {
> +				if (vlan_mask->inner_type != RTE_BE16(0xffff)) {
> +					rte_flow_error_set(error, EINVAL,
> +						      RTE_FLOW_ERROR_TYPE_ITEM,
> +						      item,
> +						      "Invalid inner_type"
> +						      " mask.");
> +					return -rte_errno;
> +				}
> +
> +				ether_type =
> +					rte_be_to_cpu_16(vlan_spec->inner_type);
> +
> +				if (ether_type == ETHER_TYPE_IPv4 ||
> +				    ether_type == ETHER_TYPE_IPv6 ||
> +				    ether_type == ETHER_TYPE_ARP ||
> +				    ether_type == outer_tpid) {
> +					rte_flow_error_set(error, EINVAL,
> +						     RTE_FLOW_ERROR_TYPE_ITEM,
> +						     item,
> +						     "Unsupported inner_type.");
> +					return -rte_errno;
> +				}
> +				input_set |= I40E_INSET_LAST_ETHER_TYPE;
> +				filter->input.flow.l2_flow.ether_type =
> +					vlan_spec->inner_type;
> +			}
> 
>  			pctype = I40E_FILTER_PCTYPE_L2_PAYLOAD;
>  			layer_idx = I40E_FLXPLD_L2_IDX;
> @@ -3284,7 +3331,8 @@ i40e_flow_parse_vxlan_pattern(__rte_unused struct
> rte_eth_dev *dev,
>  		case RTE_FLOW_ITEM_TYPE_VLAN:
>  			vlan_spec = item->spec;
>  			vlan_mask = item->mask;
> -			if (!(vlan_spec && vlan_mask)) {
> +			if (!(vlan_spec && vlan_mask) ||
> +			    vlan_mask->inner_type) {
>  				rte_flow_error_set(error, EINVAL,
>  						   RTE_FLOW_ERROR_TYPE_ITEM,
>  						   item,
> @@ -3514,7 +3562,8 @@ i40e_flow_parse_nvgre_pattern(__rte_unused
> struct rte_eth_dev *dev,
>  		case RTE_FLOW_ITEM_TYPE_VLAN:
>  			vlan_spec = item->spec;
>  			vlan_mask = item->mask;
> -			if (!(vlan_spec && vlan_mask)) {
> +			if (!(vlan_spec && vlan_mask) ||
> +			    vlan_mask->inner_type) {
>  				rte_flow_error_set(error, EINVAL,
>  						   RTE_FLOW_ERROR_TYPE_ITEM,
>  						   item,
> @@ -4022,7 +4071,8 @@ i40e_flow_parse_qinq_pattern(__rte_unused struct
> rte_eth_dev *dev,
>  			vlan_spec = item->spec;
>  			vlan_mask = item->mask;
> 
> -			if (!(vlan_spec && vlan_mask)) {
> +			if (!(vlan_spec && vlan_mask) ||
> +			    vlan_mask->inner_type) {
>  				rte_flow_error_set(error, EINVAL,
>  					   RTE_FLOW_ERROR_TYPE_ITEM,
>  					   item,
  
Adrien Mazarguil April 5, 2018, 2:39 p.m. UTC | #4
On Thu, Apr 05, 2018 at 02:20:34PM +0000, Zhang, Qi Z wrote:
> Hi Adrien:
> 
> > Hi PMD maintainers, while I'm pretty confident in these changes, I could not
> > validate them with all devices.
> > 
> > It would be great if you could apply this patch, run testpmd, create VLAN flow
> > rules with/without inner EtherType as described and send matching traffic
> > while making sure nothing was broken in the process.
> > 
> > Thanks!
> > ---
> > diff --git a/drivers/net/i40e/i40e_flow.c b/drivers/net/i40e/i40e_flow.c index
> > 1b336df74..c6dd889ad 100644
> > --- a/drivers/net/i40e/i40e_flow.c
> > +++ b/drivers/net/i40e/i40e_flow.c
> > @@ -2490,24 +2490,36 @@ i40e_flow_parse_fdir_pattern(struct rte_eth_dev
> > *dev,
> >  						      "Invalid MAC_addr mask.");
> >  					return -rte_errno;
> >  				}
> > +			}
> > +			if (eth_spec && eth_mask && eth_mask->type) {
> > +				enum rte_flow_item_type next = (item + 1)->type;
> > 
> > -				if ((eth_mask->type & UINT16_MAX) ==
> > -				    UINT16_MAX) {
> > -					input_set |= I40E_INSET_LAST_ETHER_TYPE;
> > -					filter->input.flow.l2_flow.ether_type =
> > -						eth_spec->type;
> > +				if (eth_mask->type != RTE_BE16(0xffff)) {
> > +					rte_flow_error_set(error, EINVAL,
> > +						      RTE_FLOW_ERROR_TYPE_ITEM,
> > +						      item,
> > +						      "Invalid type mask.");
> > +					return -rte_errno;
> >  				}
> > 
> >  				ether_type = rte_be_to_cpu_16(eth_spec->type);
> > -				if (ether_type == ETHER_TYPE_IPv4 ||
> > -				    ether_type == ETHER_TYPE_IPv6 ||
> > -				    ether_type == ETHER_TYPE_ARP ||
> > -				    ether_type == outer_tpid) {
> > +
> > +				if ((next == RTE_FLOW_ITEM_TYPE_VLAN &&
> > +				     ether_type != outer_tpid) ||
> > +				    (next != RTE_FLOW_ITEM_TYPE_VLAN &&
> > +				     (ether_type == ETHER_TYPE_IPv4 ||
> > +				      ether_type == ETHER_TYPE_IPv6 ||
> > +				      ether_type == ETHER_TYPE_ARP ||
> > +				      ether_type == outer_tpid))) {
> >  					rte_flow_error_set(error, EINVAL,
> >  						     RTE_FLOW_ERROR_TYPE_ITEM,
> >  						     item,
> >  						     "Unsupported ether_type.");
> >  					return -rte_errno;
> > +				} else if (next != RTE_FLOW_ITEM_TYPE_VLAN) {
> > +					input_set |= I40E_INSET_LAST_ETHER_TYPE;
> > +					filter->input.flow.l2_flow.ether_type =
> > +						eth_spec->type;
> >  				}
> >  			}
> > 
> > @@ -2518,6 +2530,14 @@ i40e_flow_parse_fdir_pattern(struct rte_eth_dev
> > *dev,
> >  		case RTE_FLOW_ITEM_TYPE_VLAN:
> >  			vlan_spec = item->spec;
> >  			vlan_mask = item->mask;
> > +
> > +			if (input_set & I40E_INSET_LAST_ETHER_TYPE) {
> > +				rte_flow_error_set(error, EINVAL,
> > +						   RTE_FLOW_ERROR_TYPE_ITEM,
> > +						   item,
> > +						   "Unsupported outer TPID.");
> > +				return -rte_errno;
> > +			}
> 
> Please correct me I'm wrong, now I40E_INSET_LAST_ETHER_TYPE will only be set when next != RTE_FLOW_ITEM_TYPE_VLAN
> while, RTE_FLOW_ITEM_TYPE_VLAN will only be the next to RTE_FLOW_ITEM_TYPE_ETH for fdir, so above check seems unnecessary ?

You're right, it's unnecessary. I can remove this change or leave it as a
safety measure for future contributions, since when parsing a VLAN item
one is not necessarily aware of what happened in previous iterations.

How about an assertion check for debugging purposes instead?

 RTE_ASSERT(!(input_set & I40E_INSET_LAST_ETHER_TYPE));
  
Qi Zhang April 6, 2018, 4:59 a.m. UTC | #5
> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Thursday, April 5, 2018 10:39 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Yigit, Ferruh
> <ferruh.yigit@intel.com>; dev@dpdk.org; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>; Ajit Khaparde
> <ajit.khaparde@broadcom.com>; Somnath Kotur
> <somnath.kotur@broadcom.com>; John Daley <johndale@cisco.com>; Hyong
> Youb Kim <hyonkim@cisco.com>; Xing, Beilei <beilei.xing@intel.com>; Ananyev,
> Konstantin <konstantin.ananyev@intel.com>; Nelio Laranjeiro
> <nelio.laranjeiro@6wind.com>; Yongseok Koh <yskoh@mellanox.com>; Jacek
> Siuda <jck@semihalf.com>; Tomasz Duszynski <tdu@semihalf.com>; Dmitri
> Epshtein <dima@marvell.com>; Natalie Samsonov <nsamsono@marvell.com>;
> Jianbo Liu <jianbo.liu@arm.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; Pascal Mazon <pascal.mazon@6wind.com>
> Subject: Re: [PATCH v1 11/16] ethdev: refine TPID handling in flow API
> 
> On Thu, Apr 05, 2018 at 02:20:34PM +0000, Zhang, Qi Z wrote:
> > Hi Adrien:
> >
> > > Hi PMD maintainers, while I'm pretty confident in these changes, I
> > > could not validate them with all devices.
> > >
> > > It would be great if you could apply this patch, run testpmd, create
> > > VLAN flow rules with/without inner EtherType as described and send
> > > matching traffic while making sure nothing was broken in the process.
> > >
> > > Thanks!
> > > ---
> > > diff --git a/drivers/net/i40e/i40e_flow.c
> > > b/drivers/net/i40e/i40e_flow.c index 1b336df74..c6dd889ad 100644
> > > --- a/drivers/net/i40e/i40e_flow.c
> > > +++ b/drivers/net/i40e/i40e_flow.c
> > > @@ -2490,24 +2490,36 @@ i40e_flow_parse_fdir_pattern(struct
> > > rte_eth_dev *dev,
> > >  						      "Invalid MAC_addr mask.");
> > >  					return -rte_errno;
> > >  				}
> > > +			}
> > > +			if (eth_spec && eth_mask && eth_mask->type) {
> > > +				enum rte_flow_item_type next = (item + 1)->type;
> > >
> > > -				if ((eth_mask->type & UINT16_MAX) ==
> > > -				    UINT16_MAX) {
> > > -					input_set |= I40E_INSET_LAST_ETHER_TYPE;
> > > -					filter->input.flow.l2_flow.ether_type =
> > > -						eth_spec->type;
> > > +				if (eth_mask->type != RTE_BE16(0xffff)) {
> > > +					rte_flow_error_set(error, EINVAL,
> > > +						      RTE_FLOW_ERROR_TYPE_ITEM,
> > > +						      item,
> > > +						      "Invalid type mask.");
> > > +					return -rte_errno;
> > >  				}
> > >
> > >  				ether_type = rte_be_to_cpu_16(eth_spec->type);
> > > -				if (ether_type == ETHER_TYPE_IPv4 ||
> > > -				    ether_type == ETHER_TYPE_IPv6 ||
> > > -				    ether_type == ETHER_TYPE_ARP ||
> > > -				    ether_type == outer_tpid) {
> > > +
> > > +				if ((next == RTE_FLOW_ITEM_TYPE_VLAN &&
> > > +				     ether_type != outer_tpid) ||
> > > +				    (next != RTE_FLOW_ITEM_TYPE_VLAN &&
> > > +				     (ether_type == ETHER_TYPE_IPv4 ||
> > > +				      ether_type == ETHER_TYPE_IPv6 ||
> > > +				      ether_type == ETHER_TYPE_ARP ||
> > > +				      ether_type == outer_tpid))) {
> > >  					rte_flow_error_set(error, EINVAL,
> > >  						     RTE_FLOW_ERROR_TYPE_ITEM,
> > >  						     item,
> > >  						     "Unsupported ether_type.");
> > >  					return -rte_errno;
> > > +				} else if (next != RTE_FLOW_ITEM_TYPE_VLAN) {
> > > +					input_set |= I40E_INSET_LAST_ETHER_TYPE;
> > > +					filter->input.flow.l2_flow.ether_type =
> > > +						eth_spec->type;
> > >  				}
> > >  			}
> > >
> > > @@ -2518,6 +2530,14 @@ i40e_flow_parse_fdir_pattern(struct
> > > rte_eth_dev *dev,
> > >  		case RTE_FLOW_ITEM_TYPE_VLAN:
> > >  			vlan_spec = item->spec;
> > >  			vlan_mask = item->mask;
> > > +
> > > +			if (input_set & I40E_INSET_LAST_ETHER_TYPE) {
> > > +				rte_flow_error_set(error, EINVAL,
> > > +						   RTE_FLOW_ERROR_TYPE_ITEM,
> > > +						   item,
> > > +						   "Unsupported outer TPID.");
> > > +				return -rte_errno;
> > > +			}
> >
> > Please correct me I'm wrong, now I40E_INSET_LAST_ETHER_TYPE will only
> > be set when next != RTE_FLOW_ITEM_TYPE_VLAN while,
> RTE_FLOW_ITEM_TYPE_VLAN will only be the next to
> RTE_FLOW_ITEM_TYPE_ETH for fdir, so above check seems unnecessary ?
> 
> You're right, it's unnecessary. I can remove this change or leave it as a safety
> measure for future contributions, since when parsing a VLAN item one is not
> necessarily aware of what happened in previous iterations.
> 
> How about an assertion check for debugging purposes instead?
> 
>  RTE_ASSERT(!(input_set & I40E_INSET_LAST_ETHER_TYPE));

Agree to use assert.

> 
> --
> Adrien Mazarguil
> 6WIND
  

Patch

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 2fbd3d8ef..3a486032d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -99,11 +99,11 @@  enum index {
 	ITEM_ETH_SRC,
 	ITEM_ETH_TYPE,
 	ITEM_VLAN,
-	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
 	ITEM_VLAN_PCP,
 	ITEM_VLAN_DEI,
 	ITEM_VLAN_VID,
+	ITEM_VLAN_INNER_TYPE,
 	ITEM_IPV4,
 	ITEM_IPV4_TOS,
 	ITEM_IPV4_TTL,
@@ -505,11 +505,11 @@  static const enum index item_eth[] = {
 };
 
 static const enum index item_vlan[] = {
-	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
 	ITEM_VLAN_PCP,
 	ITEM_VLAN_DEI,
 	ITEM_VLAN_VID,
+	ITEM_VLAN_INNER_TYPE,
 	ITEM_NEXT,
 	ZERO,
 };
@@ -1142,12 +1142,6 @@  static const struct token token_list[] = {
 		.next = NEXT(item_vlan),
 		.call = parse_vc,
 	},
-	[ITEM_VLAN_TPID] = {
-		.name = "tpid",
-		.help = "tag protocol identifier",
-		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
-		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tpid)),
-	},
 	[ITEM_VLAN_TCI] = {
 		.name = "tci",
 		.help = "tag control information",
@@ -1175,6 +1169,13 @@  static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_vlan,
 						  tci, "\x0f\xff")),
 	},
+	[ITEM_VLAN_INNER_TYPE] = {
+		.name = "inner_type",
+		.help = "inner EtherType",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan,
+					     inner_type)),
+	},
 	[ITEM_IPV4] = {
 		.name = "ipv4",
 		.help = "match IPv4 header",
diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst
index 76eb0bde4..bcf3efe9e 100644
--- a/doc/guides/nics/tap.rst
+++ b/doc/guides/nics/tap.rst
@@ -97,7 +97,7 @@  The kernel support can be checked with this command::
 Supported items:
 
 - eth: src and dst (with variable masks), and eth_type (0xffff mask).
-- vlan: vid, pcp, tpid, but not eid. (requires kernel 4.9)
+- vlan: vid, pcp, but not eid. (requires kernel 4.9)
 - ipv4/6: src and dst (with variable masks), and ip_proto (0xffff mask).
 - udp/tcp: src and dst port (0xffff) mask.
 
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index c893d737a..f1b9d8d76 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -784,9 +784,15 @@  Item: ``ETH``
 
 Matches an Ethernet header.
 
+The ``type`` field either stands for "EtherType" or "TPID" when followed by
+so-called layer 2.5 pattern items such as ``RTE_FLOW_ITEM_TYPE_VLAN``. In
+the latter case, ``type`` refers to that of the outer header, with the inner
+EtherType/TPID provided by the subsequent pattern item. This is the same
+order as on the wire.
+
 - ``dst``: destination MAC.
 - ``src``: source MAC.
-- ``type``: EtherType.
+- ``type``: EtherType or TPID.
 - Default ``mask`` matches destination and source addresses only.
 
 Item: ``VLAN``
@@ -794,9 +800,13 @@  Item: ``VLAN``
 
 Matches an 802.1Q/ad VLAN tag.
 
-- ``tpid``: tag protocol identifier.
+The corresponding standard outer EtherType (TPID) values are ``0x8100`` or
+``0x88a8`` (in case of outer QinQ). It can be overridden by the preceding
+pattern item.
+
 - ``tci``: tag control information.
-- Default ``mask`` matches TCI only.
+- ``inner_type``: inner EtherType or TPID.
+- Default ``mask`` matches the VID part of TCI only (lower 12 bits).
 
 Item: ``IPV4``
 ^^^^^^^^^^^^^^
@@ -866,12 +876,15 @@  Item: ``E_TAG``
 
 Matches an IEEE 802.1BR E-Tag header.
 
-- ``tpid``: tag protocol identifier (0x893F)
+The corresponding standard outer EtherType (TPID) value is 0x893f.
+It can be overridden by the preceding pattern item.
+
 - ``epcp_edei_in_ecid_b``: E-Tag control information (E-TCI), E-PCP (3b),
   E-DEI (1b), ingress E-CID base (12b).
 - ``rsvd_grp_ecid_b``: reserved (2b), GRP (2b), E-CID base (12b).
 - ``in_ecid_e``: ingress E-CID ext.
 - ``ecid_e``: E-CID ext.
+- ``inner_type``: inner EtherType or TPID.
 - Default ``mask`` simultaneously matches GRP and E-CID base.
 
 Item: ``NVGRE``
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 738461f44..25fac8430 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3223,15 +3223,15 @@  This section lists supported pattern items and their attributes, if any.
 
   - ``dst {MAC-48}``: destination MAC.
   - ``src {MAC-48}``: source MAC.
-  - ``type {unsigned}``: EtherType.
+  - ``type {unsigned}``: EtherType or TPID.
 
 - ``vlan``: match 802.1Q/ad VLAN tag.
 
-  - ``tpid {unsigned}``: tag protocol identifier.
   - ``tci {unsigned}``: tag control information.
   - ``pcp {unsigned}``: priority code point.
   - ``dei {unsigned}``: drop eligible indicator.
   - ``vid {unsigned}``: VLAN identifier.
+  - ``inner_type {unsigned}``: inner EtherType or TPID.
 
 - ``ipv4``: match IPv4 header.
 
diff --git a/drivers/net/bnxt/bnxt_filter.c b/drivers/net/bnxt/bnxt_filter.c
index 0d735edf6..a97a8dd46 100644
--- a/drivers/net/bnxt/bnxt_filter.c
+++ b/drivers/net/bnxt/bnxt_filter.c
@@ -327,6 +327,7 @@  bnxt_validate_and_parse_flow_type(struct bnxt *bp,
 	uint32_t vf = 0;
 	int use_ntuple;
 	uint32_t en = 0;
+	uint32_t en_ethertype;
 	int dflt_vnic;
 
 	use_ntuple = bnxt_filter_type_check(pattern, error);
@@ -336,6 +337,9 @@  bnxt_validate_and_parse_flow_type(struct bnxt *bp,
 
 	filter->filter_type = use_ntuple ?
 		HWRM_CFA_NTUPLE_FILTER : HWRM_CFA_EM_FILTER;
+	en_ethertype = use_ntuple ?
+		NTUPLE_FLTR_ALLOC_INPUT_EN_ETHERTYPE :
+		EM_FLOW_ALLOC_INPUT_EN_ETHERTYPE;
 
 	while (item->type != RTE_FLOW_ITEM_TYPE_END) {
 		if (item->last) {
@@ -405,30 +409,52 @@  bnxt_validate_and_parse_flow_type(struct bnxt *bp,
 			if (eth_mask->type) {
 				filter->ethertype =
 					rte_be_to_cpu_16(eth_spec->type);
-				en |= use_ntuple ?
-					NTUPLE_FLTR_ALLOC_INPUT_EN_ETHERTYPE :
-					EM_FLOW_ALLOC_INPUT_EN_ETHERTYPE;
+				en |= en_ethertype;
 			}
 
 			break;
 		case RTE_FLOW_ITEM_TYPE_VLAN:
 			vlan_spec = item->spec;
 			vlan_mask = item->mask;
+			if (en & en_ethertype &&
+			    filter->ethertype != RTE_BE16(0x8100)) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "unsupported outer TPID");
+				return -rte_errno;
+			}
 			if (vlan_mask->tci &&
-			    vlan_mask->tci == RTE_BE16(0x0fff) &&
-			    !vlan_mask->tpid) {
+			    vlan_mask->tci == RTE_BE16(0x0fff)) {
 				/* Only the VLAN ID can be matched. */
 				filter->l2_ovlan =
 					rte_be_to_cpu_16(vlan_spec->tci &
 							 RTE_BE16(0x0fff));
 				en |= EM_FLOW_ALLOC_INPUT_EN_OVLAN_VID;
-			} else if (vlan_mask->tci || vlan_mask->tpid) {
+			} else if (vlan_mask->tci) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   item,
 						   "VLAN mask is invalid");
 				return -rte_errno;
 			}
+			if (vlan_mask->inner_type &&
+			    vlan_mask->inner_type != RTE_BE16(0xffff)) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "inner ethertype mask not"
+						   " valid");
+				return -rte_errno;
+			}
+			if (vlan_mask->inner_type) {
+				filter->ethertype =
+					rte_be_to_cpu_16(vlan_spec->inner_type);
+				en |= en_ethertype;
+			} else {
+				filter->ethertype = RTE_BE16(0x0000);
+				en &= ~en_ethertype;
+			}
 
 			break;
 		case RTE_FLOW_ITEM_TYPE_IPV4:
diff --git a/drivers/net/enic/enic_flow.c b/drivers/net/enic/enic_flow.c
index c5c98b870..a413740ab 100644
--- a/drivers/net/enic/enic_flow.c
+++ b/drivers/net/enic/enic_flow.c
@@ -4,6 +4,7 @@ 
 
 #include <errno.h>
 #include <stdint.h>
+#include <rte_byteorder.h>
 #include <rte_log.h>
 #include <rte_ethdev_driver.h>
 #include <rte_flow_driver.h>
@@ -545,16 +546,22 @@  enic_copy_item_vlan_v2(const struct rte_flow_item *item,
 	if (!spec)
 		return 0;
 
-	/* Don't support filtering in tpid */
-	if (mask) {
-		if (mask->tpid != 0)
-			return ENOTSUP;
-	} else {
+	if (!mask)
 		mask = &rte_flow_item_vlan_mask;
-		RTE_ASSERT(mask->tpid == 0);
-	}
 
 	if (*inner_ofst == 0) {
+		struct ether_hdr *eth_mask =
+			(void *)gp->layer[FILTER_GENERIC_1_L2].mask;
+		struct ether_hdr *eth_val =
+			(void *)gp->layer[FILTER_GENERIC_1_L2].val;
+
+		/* Exactly one TPID value is allowed if specified */
+		if ((eth_val->ether_type & eth_mask->ether_type) !=
+		    (RTE_BE16(0x8100) & eth_mask->ether_type))
+			return ENOTSUP;
+		eth_mask->ether_type = mask->inner_type;
+		eth_val->ether_type = spec->inner_type;
+
 		/* Outer header. Use the vlan mask/val fields */
 		gp->mask_vlan = mask->tci;
 		gp->val_vlan = spec->tci;
diff --git a/drivers/net/i40e/i40e_flow.c b/drivers/net/i40e/i40e_flow.c
index 1b336df74..c6dd889ad 100644
--- a/drivers/net/i40e/i40e_flow.c
+++ b/drivers/net/i40e/i40e_flow.c
@@ -2490,24 +2490,36 @@  i40e_flow_parse_fdir_pattern(struct rte_eth_dev *dev,
 						      "Invalid MAC_addr mask.");
 					return -rte_errno;
 				}
+			}
+			if (eth_spec && eth_mask && eth_mask->type) {
+				enum rte_flow_item_type next = (item + 1)->type;
 
-				if ((eth_mask->type & UINT16_MAX) ==
-				    UINT16_MAX) {
-					input_set |= I40E_INSET_LAST_ETHER_TYPE;
-					filter->input.flow.l2_flow.ether_type =
-						eth_spec->type;
+				if (eth_mask->type != RTE_BE16(0xffff)) {
+					rte_flow_error_set(error, EINVAL,
+						      RTE_FLOW_ERROR_TYPE_ITEM,
+						      item,
+						      "Invalid type mask.");
+					return -rte_errno;
 				}
 
 				ether_type = rte_be_to_cpu_16(eth_spec->type);
-				if (ether_type == ETHER_TYPE_IPv4 ||
-				    ether_type == ETHER_TYPE_IPv6 ||
-				    ether_type == ETHER_TYPE_ARP ||
-				    ether_type == outer_tpid) {
+
+				if ((next == RTE_FLOW_ITEM_TYPE_VLAN &&
+				     ether_type != outer_tpid) ||
+				    (next != RTE_FLOW_ITEM_TYPE_VLAN &&
+				     (ether_type == ETHER_TYPE_IPv4 ||
+				      ether_type == ETHER_TYPE_IPv6 ||
+				      ether_type == ETHER_TYPE_ARP ||
+				      ether_type == outer_tpid))) {
 					rte_flow_error_set(error, EINVAL,
 						     RTE_FLOW_ERROR_TYPE_ITEM,
 						     item,
 						     "Unsupported ether_type.");
 					return -rte_errno;
+				} else if (next != RTE_FLOW_ITEM_TYPE_VLAN) {
+					input_set |= I40E_INSET_LAST_ETHER_TYPE;
+					filter->input.flow.l2_flow.ether_type =
+						eth_spec->type;
 				}
 			}
 
@@ -2518,6 +2530,14 @@  i40e_flow_parse_fdir_pattern(struct rte_eth_dev *dev,
 		case RTE_FLOW_ITEM_TYPE_VLAN:
 			vlan_spec = item->spec;
 			vlan_mask = item->mask;
+
+			if (input_set & I40E_INSET_LAST_ETHER_TYPE) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Unsupported outer TPID.");
+				return -rte_errno;
+			}
 			if (vlan_spec && vlan_mask) {
 				if (vlan_mask->tci ==
 				    rte_cpu_to_be_16(I40E_TCI_MASK)) {
@@ -2526,6 +2546,33 @@  i40e_flow_parse_fdir_pattern(struct rte_eth_dev *dev,
 						vlan_spec->tci;
 				}
 			}
+			if (vlan_spec && vlan_mask && vlan_mask->inner_type) {
+				if (vlan_mask->inner_type != RTE_BE16(0xffff)) {
+					rte_flow_error_set(error, EINVAL,
+						      RTE_FLOW_ERROR_TYPE_ITEM,
+						      item,
+						      "Invalid inner_type"
+						      " mask.");
+					return -rte_errno;
+				}
+
+				ether_type =
+					rte_be_to_cpu_16(vlan_spec->inner_type);
+
+				if (ether_type == ETHER_TYPE_IPv4 ||
+				    ether_type == ETHER_TYPE_IPv6 ||
+				    ether_type == ETHER_TYPE_ARP ||
+				    ether_type == outer_tpid) {
+					rte_flow_error_set(error, EINVAL,
+						     RTE_FLOW_ERROR_TYPE_ITEM,
+						     item,
+						     "Unsupported inner_type.");
+					return -rte_errno;
+				}
+				input_set |= I40E_INSET_LAST_ETHER_TYPE;
+				filter->input.flow.l2_flow.ether_type =
+					vlan_spec->inner_type;
+			}
 
 			pctype = I40E_FILTER_PCTYPE_L2_PAYLOAD;
 			layer_idx = I40E_FLXPLD_L2_IDX;
@@ -3284,7 +3331,8 @@  i40e_flow_parse_vxlan_pattern(__rte_unused struct rte_eth_dev *dev,
 		case RTE_FLOW_ITEM_TYPE_VLAN:
 			vlan_spec = item->spec;
 			vlan_mask = item->mask;
-			if (!(vlan_spec && vlan_mask)) {
+			if (!(vlan_spec && vlan_mask) ||
+			    vlan_mask->inner_type) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   item,
@@ -3514,7 +3562,8 @@  i40e_flow_parse_nvgre_pattern(__rte_unused struct rte_eth_dev *dev,
 		case RTE_FLOW_ITEM_TYPE_VLAN:
 			vlan_spec = item->spec;
 			vlan_mask = item->mask;
-			if (!(vlan_spec && vlan_mask)) {
+			if (!(vlan_spec && vlan_mask) ||
+			    vlan_mask->inner_type) {
 				rte_flow_error_set(error, EINVAL,
 						   RTE_FLOW_ERROR_TYPE_ITEM,
 						   item,
@@ -4022,7 +4071,8 @@  i40e_flow_parse_qinq_pattern(__rte_unused struct rte_eth_dev *dev,
 			vlan_spec = item->spec;
 			vlan_mask = item->mask;
 
-			if (!(vlan_spec && vlan_mask)) {
+			if (!(vlan_spec && vlan_mask) ||
+			    vlan_mask->inner_type) {
 				rte_flow_error_set(error, EINVAL,
 					   RTE_FLOW_ERROR_TYPE_ITEM,
 					   item,
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index bc1176819..f03413d32 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -17,6 +17,7 @@ 
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#include <rte_byteorder.h>
 #include <rte_common.h>
 #include <rte_eth_ctrl.h>
 #include <rte_ethdev_driver.h>
@@ -306,6 +307,7 @@  static const struct mlx5_flow_items mlx5_flow_items[] = {
 		.actions = valid_actions,
 		.mask = &(const struct rte_flow_item_vlan){
 			.tci = -1,
+			.inner_type = -1,
 		},
 		.default_mask = &rte_flow_item_vlan_mask,
 		.mask_sz = sizeof(struct rte_flow_item_vlan),
@@ -1285,6 +1287,7 @@  mlx5_flow_create_vlan(const struct rte_flow_item *item,
 	struct mlx5_flow_parse *parser = data->parser;
 	struct ibv_flow_spec_eth *eth;
 	const unsigned int eth_size = sizeof(struct ibv_flow_spec_eth);
+	const char *msg = "VLAN cannot be empty";
 
 	if (spec) {
 		unsigned int i;
@@ -1306,12 +1309,21 @@  mlx5_flow_create_vlan(const struct rte_flow_item *item,
 			 */
 			if (!eth->mask.vlan_tag)
 				goto error;
+			/* Exactly one TPID value is allowed if specified. */
+			if ((eth->val.ether_type & eth->mask.ether_type) !=
+			    (RTE_BE16(0x8100) & eth->mask.ether_type)) {
+				msg = "unsupported outer TPID";
+				goto error;
+			}
+			eth->val.ether_type = spec->inner_type;
+			eth->mask.ether_type = mask->inner_type;
+			eth->val.ether_type &= eth->mask.ether_type;
 		}
 		return 0;
 	}
 error:
 	return rte_flow_error_set(data->error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM,
-				  item, "VLAN cannot be empty");
+				  item, msg);
 }
 
 /**
diff --git a/drivers/net/mvpp2/mrvl_flow.c b/drivers/net/mvpp2/mrvl_flow.c
index 8fd4dbfb1..6604a411f 100644
--- a/drivers/net/mvpp2/mrvl_flow.c
+++ b/drivers/net/mvpp2/mrvl_flow.c
@@ -1091,12 +1091,6 @@  mrvl_parse_vlan(const struct rte_flow_item *item,
 	if (ret)
 		return ret;
 
-	if (mask->tpid) {
-		rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM,
-				   NULL, "Not supported by classifier\n");
-		return -rte_errno;
-	}
-
 	m = rte_be_to_cpu_16(mask->tci);
 	if (m & MRVL_VLAN_ID_MASK) {
 		RTE_LOG(WARNING, PMD, "vlan id mask is ignored\n");
@@ -1112,6 +1106,27 @@  mrvl_parse_vlan(const struct rte_flow_item *item,
 			goto out;
 	}
 
+	if (flow->pattern & F_TYPE) {
+		rte_flow_error_set(error, ENOTSUP,
+				   RTE_FLOW_ERROR_TYPE_ITEM, item,
+				   "outer TPID cannot be explicitly matched"
+				   " when VLAN item is also specified\n");
+		return -rte_errno;
+	}
+	if (mask->inner_type) {
+		struct rte_flow_item_eth spec_eth = {
+			.type = spec->inner_type,
+		};
+		struct rte_flow_item_eth mask_eth = {
+			.type = mask->inner_type,
+		};
+
+		RTE_LOG(WARNING, PMD, "inner eth type mask is ignored\n");
+		ret = mrvl_parse_type(spec_eth, mask_eth, flow);
+		if (ret)
+			goto out;
+	}
+
 	return 0;
 out:
 	rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index bf9609735..515a61325 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -352,6 +352,7 @@  sfc_flow_parse_vlan(const struct rte_flow_item *item,
 	const struct rte_flow_item_vlan *mask = NULL;
 	const struct rte_flow_item_vlan supp_mask = {
 		.tci = rte_cpu_to_be_16(ETH_VLAN_ID_MAX),
+		.inner_type = RTE_BE16(0xffff),
 	};
 
 	rc = sfc_flow_parse_init(item,
@@ -394,6 +395,32 @@  sfc_flow_parse_vlan(const struct rte_flow_item *item,
 		return -rte_errno;
 	}
 
+	/*
+	 * If an EtherType was already specified, make sure it is a valid
+	 * TPID for the current VLAN layer before overwriting it with the
+	 * specified inner type.
+	 */
+	if (efx_spec->efs_match_flags & EFX_FILTER_MATCH_ETHER_TYPE &&
+	    efx_spec->efs_ether_type != RTE_BE16(0x8100) &&
+	    efx_spec->efs_ether_type != RTE_BE16(0x88a8)) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ITEM, item,
+				   "Unsupported outer TPID");
+		return -rte_errno;
+	}
+	if (!mask->inner_type) {
+		efx_spec->efs_match_flags &= ~EFX_FILTER_MATCH_ETHER_TYPE;
+		efx_spec->efs_ether_type = RTE_BE16(0x0000);
+	} else if (mask->inner_type == supp_mask.inner_type) {
+		efx_spec->efs_match_flags |= EFX_FILTER_MATCH_ETHER_TYPE;
+		efx_spec->efs_ether_type = rte_be_to_cpu_16(spec->inner_type);
+	} else {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ITEM, item,
+				   "Bad mask for VLAN inner_type");
+		return -rte_errno;
+	}
+
 	return 0;
 }
 
diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
index e5eb50fc5..e53eff6ce 100644
--- a/drivers/net/tap/tap_flow.c
+++ b/drivers/net/tap/tap_flow.c
@@ -270,13 +270,13 @@  static const struct tap_flow_items tap_flow_items[] = {
 		.items = ITEMS(RTE_FLOW_ITEM_TYPE_IPV4,
 			       RTE_FLOW_ITEM_TYPE_IPV6),
 		.mask = &(const struct rte_flow_item_vlan){
-			.tpid = -1,
 			/* DEI matching is not supported */
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 			.tci = 0xffef,
 #else
 			.tci = 0xefff,
 #endif
+			.inner_type = -1,
 		},
 		.mask_sz = sizeof(struct rte_flow_item_vlan),
 		.default_mask = &rte_flow_item_vlan_mask,
@@ -578,13 +578,21 @@  tap_flow_create_vlan(const struct rte_flow_item *item, void *data)
 	/* use default mask if none provided */
 	if (!mask)
 		mask = tap_flow_items[RTE_FLOW_ITEM_TYPE_VLAN].default_mask;
-	/* TC does not support tpid masking. Only accept if exact match. */
-	if (mask->tpid && mask->tpid != 0xffff)
+	/* check that previous eth type is compatible with VLAN */
+	if (info->eth_type && info->eth_type != RTE_BE16(ETH_P_8021Q))
 		return -1;
 	/* Double-tagging not supported. */
-	if (spec && mask->tpid && spec->tpid != htons(ETH_P_8021Q))
+	if (info->vlan)
 		return -1;
 	info->vlan = 1;
+	if (mask->inner_type) {
+		/* TC does not support partial eth_type masking */
+		if (mask->inner_type != RTE_BE16(0xffff))
+			return -1;
+		info->eth_type = spec->inner_type;
+	} else {
+		info->eth_type = 0;
+	}
 	if (!flow)
 		return 0;
 	msg = &flow->msg;
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index 1b222ba60..15e383f95 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -454,11 +454,17 @@  static const struct rte_flow_item_raw rte_flow_item_raw_mask = {
  * RTE_FLOW_ITEM_TYPE_ETH
  *
  * Matches an Ethernet header.
+ *
+ * The @p type field either stands for "EtherType" or "TPID" when followed
+ * by so-called layer 2.5 pattern items such as RTE_FLOW_ITEM_TYPE_VLAN. In
+ * the latter case, @p type refers to that of the outer header, with the
+ * inner EtherType/TPID provided by the subsequent pattern item. This is the
+ * same order as on the wire.
  */
 struct rte_flow_item_eth {
 	struct ether_addr dst; /**< Destination MAC. */
 	struct ether_addr src; /**< Source MAC. */
-	rte_be16_t type; /**< EtherType. */
+	rte_be16_t type; /**< EtherType or TPID. */
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_ETH. */
@@ -475,19 +481,20 @@  static const struct rte_flow_item_eth rte_flow_item_eth_mask = {
  *
  * Matches an 802.1Q/ad VLAN tag.
  *
- * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
- * RTE_FLOW_ITEM_TYPE_VLAN.
+ * The corresponding standard outer EtherType (TPID) values are 0x8100 or
+ * 0x88a8 (in case of outer QinQ). It can be overridden by the preceding
+ * pattern item.
  */
 struct rte_flow_item_vlan {
-	rte_be16_t tpid; /**< Tag protocol identifier. */
 	rte_be16_t tci; /**< Tag control information. */
+	rte_be16_t inner_type; /**< Inner EtherType or TPID. */
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_VLAN. */
 #ifndef __cplusplus
 static const struct rte_flow_item_vlan rte_flow_item_vlan_mask = {
-	.tpid = RTE_BE16(0x0000),
-	.tci = RTE_BE16(0xffff),
+	.tci = RTE_BE16(0x0fff),
+	.inner_type = RTE_BE16(0x0000),
 };
 #endif
 
@@ -636,9 +643,11 @@  static const struct rte_flow_item_vxlan rte_flow_item_vxlan_mask = {
  * RTE_FLOW_ITEM_TYPE_E_TAG.
  *
  * Matches a E-tag header.
+ *
+ * The corresponding standard outer EtherType (TPID) value is 0x893f.
+ * It can be overridden by the preceding pattern item.
  */
 struct rte_flow_item_e_tag {
-	rte_be16_t tpid; /**< Tag protocol identifier (0x893F). */
 	/**
 	 * E-Tag control information (E-TCI).
 	 * E-PCP (3b), E-DEI (1b), ingress E-CID base (12b).
@@ -648,6 +657,7 @@  struct rte_flow_item_e_tag {
 	rte_be16_t rsvd_grp_ecid_b;
 	uint8_t in_ecid_e; /**< Ingress E-CID ext. */
 	uint8_t ecid_e; /**< E-CID ext. */
+	rte_be16_t inner_type; /**< Inner EtherType or TPID. */
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_E_TAG. */