[dpdk-users] Issues with ixgbe and rte_flow

Le Scouarnec Nicolas Nicolas.LeScouarnec at technicolor.com
Wed Mar 8 17:17:33 CET 2017


Removing dev to avoid flooding too much people. 

Talking about API/ABI breakage, PMD maintainers / users might be interested by this comment that got stripped from the initial message: I also have a side comment which might be more related to the general rte_flow API than to the specific implementation in ixgbe. I don't know if it is specific to ixgbe's implementation but, as a user, the rte_flow_error returned was not very useful for it does return the error of the last tried filter-type (L2 tunnel in ixgbe), and not the error of the filter-type that my setup should use (flow director). I had to change the order in which filters are tried in ixgbe code to get a more useful error message. As NICs can have several filter-types, It would be be useful to the user if rte_flow_validate/create could return the errors for all filter types tried although that would require to change the rte_flow API and returning an array of rte_flow_error and not a single struct.


Best regards,
Nicolas Le Scouarnec


From: Adrien Mazarguil <adrien.mazarguil at 6wind.com>
Sent: Wednesday, March 8, 2017 4:41 PM
To: Le Scouarnec Nicolas
Cc: Lu, Wenzhuo; dev; users; Jan Medala; Evgeny Schemeilin; Stephen Hurd; Jerin Jacob; Rahul Lakkireddy; John Daley; Matej Vido; Helin Zhang; Konstantin Ananyev; Jingjing Wu; Jing Chen; Alejandro Lucero; Harish Patil; Rasesh Mody; Andrew Rybchenko; Nelio  Laranjeiro; Vasily Philipov; Pascal Mazon; Thomas Monjalon
Subject: Re: Issues with ixgbe and rte_flow
    
** WARNING: This mail is from an external source **


CC'ing users at dpdk.org since this issue primarily affects rte_flow users, and
several PMD maintainers to get their opinion on the matter, see below.

On Wed, Mar 08, 2017 at 09:24:26AM +0000, Le Scouarnec Nicolas wrote:
> My response is inline bellow, and further comment on the code excerpt also
>
>
> From: Lu, Wenzhuo <wenzhuo.lu at intel.com>
> Sent: Wednesday, March 8, 2017 4:16 AM
> To: Le Scouarnec Nicolas; dev at dpdk.org; Adrien Mazarguil (adrien.mazarguil at 6wind.com)
> Cc: Yigit, Ferruh
> Subject: RE: Issues with ixgbe and rte_flow
>
> >> I have been using the new API rte_flow to program filtering on an X540 (ixgbe)
> >> NIC. My goal is to send packets from different VLANs to different queues
> >> (filtering which should be supported by flow director as far as I understand). I
> >> enclosed the setup code at the bottom of this email.
> >> For reference, here is the setup code I use
> >>
> >>       vlan_spec.tci = vlan_be;
> >>       vlan_spec.tpid = 0;
> >>
> >>       vlan_mask.tci = rte_cpu_to_be_16(0x0fff);
> >>       vlan_mask.tpid =  0;
>
> >To my opinion, this setting is not right. As we know, vlan tag is inserted between MAC source address and Ether type.
> >So if we have a MAC+VLAN+IPv4 packet, the vlan_spec.tpid should be 0x8100, the eth_spec.type should be 0x0800.
> >+ Adrien, the author. He can correct me if I'm wrong.

That's right, however the confusion is understandable, perhaps the
documentation should be clearer. It currently states what follows without
describing the reason:

 /**
  * RTE_FLOW_ITEM_TYPE_VLAN
  *
  * Matches an 802.1Q/ad VLAN tag.
  *
  * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
  * RTE_FLOW_ITEM_TYPE_VLAN.
  */

> Ok, I apologize, you're right. Being more used to the software-side than to the hardware-side, I misunderstood struct rte_flow_item_vlan and though it was the "equivalent" of struct vlan_hdr, in which case the vlan_hdr contains the type of the encapsulated  frame.
>
> (  /**
>  * Ethernet VLAN Header.
>  * Contains the 16-bit VLAN Tag Control Identifier and the Ethernet type
>  * of the encapsulated frame.
>  */
> struct vlan_hdr {
>       uint16_t vlan_tci; /**< Priority (3) + CFI (1) + Identifier Code (12) */
>       uint16_t eth_proto;/**< Ethernet type of encapsulated frame. */
> } __attribute__((__packed__));        )

Indeed, struct vlan_hdr and struct rte_flow_item_vlan are not mapped at the
same offset; the former includes EtherType of the inner packet (eth_proto),
while the latter describes the inserted VLAN header itself starting with
TPID.

This approach was chosen for rte_flow for consistency with the fact each
pattern item describes exactly one protocol header, even though in the case
of VLAN and other layer 2.5 protocols, some happen to be embedded.
IPv4/IPv6 options will be provided as separate items in a similar fashion.

It also allows adding/removing VLAN tags to an existing rule without
modifying the EtherType of the inner frame.

Now assuming you're not the only one facing that issue, if the current
definition does not make sense, perhaps we can update the API before it's
too late. I'll attempt to summarize it with an example below.

In any case, matching nonspecific VLAN-tagged and QinQ UDPv4 packets in
testpmd is written as:

 flow create 0 pattern eth / vlan / ipv4 / udp / end actions queue 1 / end
 flow create 0 pattern eth / vlan / vlan / ipv4 / udp / end actions queue 1 / end

However, with the current API described above, specifying inner/outer
EtherTypes for the above packets yields (as a reminder, 0x8100 stands for
VLAN, 0x8000 for IPv4 and 0x88A8 for QinQ):

#1

 flow create 0 pattern eth type is 0x8000 / vlan tpid is 0x8100 / ipv4 / udp / actions queue 1 / end
 flow create 0 pattern eth type is 0x8000 / vlan tpid is 0x88A8 / vlan tpid is 0x8100 / ipv4 / udp / actions queue 1 / end

Instead of the arguably more accurate (renaming "tpid" to "inner_type" for
clarity):

#2

 flow create 0 pattern eth type is 0x8100 / vlan type is 0x8000 / ipv4 / udp / actions queue 1 / end
 flow create 0 pattern eth type is 0x88A8 / vlan inner_type is 0x8100 / vlan inner_type is 0x8000 / ipv4 / udp / actions queue 1 / end

So, should the VLAN item be updated to behave as described in #2?

Note: doing so will cause a serious API/ABI breakage, I know it was not
supposed to happen according to the rte_flow sales pitch, but hey.

--
Adrien Mazarguil
6WIND
    


More information about the users mailing list