[dpdk-users] Issues with ixgbe and rte_flow

Lu, Wenzhuo wenzhuo.lu at intel.com
Thu Mar 9 02:59:48 CET 2017


Hi Le Scouarnec,

> -----Original Message-----
> From: Le Scouarnec Nicolas [mailto:Nicolas.LeScouarnec at technicolor.com]
> Sent: Thursday, March 9, 2017 12:18 AM
> To: Adrien Mazarguil
> Cc: Lu, Wenzhuo; users; Jan Medala; Evgeny Schemeilin; Stephen Hurd; Jerin
> Jacob; Rahul Lakkireddy; John Daley; Matej Vido; Zhang, Helin; Ananyev,
> Konstantin; Wu, Jingjing; Chen, Jing D; Alejandro Lucero; Harish Patil; Rasesh
> Mody; Andrew Rybchenko; Nelio Laranjeiro; Vasily Philipov; Pascal Mazon;
> Thomas Monjalon
> Subject: Re: Issues with ixgbe and rte_flow
> 
> Removing dev to avoid flooding too much people.
> 
> Talking about API/ABI breakage, PMD maintainers / users might be interested by
> this comment that got stripped from the initial message: I also have a side
> comment which might be more related to the general rte_flow API than to the
> specific implementation in ixgbe. I don't know if it is specific to ixgbe's
> implementation but, as a user, the rte_flow_error returned was not very useful
> for it does return the error of the last tried filter-type (L2 tunnel in ixgbe), and
> not the error of the filter-type that my setup should use (flow director). I had to
> change the order in which filters are tried in ixgbe code to get a more useful
> error message. As NICs can have several filter-types, It would be be useful to the
> user if rte_flow_validate/create could return the errors for all filter types tried
> although that would require to change the rte_flow API and returning an array
> of rte_flow_error and not a single struct.
It's a good suggestion.  I remember we have some discussion about how to feedback the error to the APP. I think the reason why we don't make it too complex because it's the first step of generic API.  Now we see some feedback from the users, we can keep optimizing it :)

And about the tpid, ethertype. I have a thought that why we need it as it's duplicate with the item type. I think the initial design is just following the IEEE spec to define the structures so we will not miss anything. But why not do some optimization. For VLAN the tpid must be 0x8100, for IPv4, the ethertype must be 0x0800. So why bothering let APP provide them and driver check them? Seems we can just remove these fields from the structures, it can make things simpler.

Adrien, as you're the maintainer of rte_flow, any thought about these ideas? Thanks.

> 
> 
> Best regards,
> Nicolas Le Scouarnec
> 
> 
> From: Adrien Mazarguil <adrien.mazarguil at 6wind.com>
> Sent: Wednesday, March 8, 2017 4:41 PM
> To: Le Scouarnec Nicolas
> Cc: Lu, Wenzhuo; dev; users; Jan Medala; Evgeny Schemeilin; Stephen Hurd;
> Jerin Jacob; Rahul Lakkireddy; John Daley; Matej Vido; Helin Zhang; Konstantin
> Ananyev; Jingjing Wu; Jing Chen; Alejandro Lucero; Harish Patil; Rasesh Mody;
> Andrew Rybchenko; Nelio  Laranjeiro; Vasily Philipov; Pascal Mazon; Thomas
> Monjalon
> Subject: Re: Issues with ixgbe and rte_flow
> 
> ** WARNING: This mail is from an external source **
> 
> 
> CC'ing users at dpdk.org since this issue primarily affects rte_flow users, and
> several PMD maintainers to get their opinion on the matter, see below.
> 
> On Wed, Mar 08, 2017 at 09:24:26AM +0000, Le Scouarnec Nicolas wrote:
> > My response is inline bellow, and further comment on the code excerpt
> > also
> >
> >
> > From: Lu, Wenzhuo <wenzhuo.lu at intel.com>
> > Sent: Wednesday, March 8, 2017 4:16 AM
> > To: Le Scouarnec Nicolas; dev at dpdk.org; Adrien Mazarguil
> > (adrien.mazarguil at 6wind.com)
> > Cc: Yigit, Ferruh
> > Subject: RE: Issues with ixgbe and rte_flow
> >
> > >> I have been using the new API rte_flow to program filtering on an
> > >> X540 (ixgbe) NIC. My goal is to send packets from different VLANs
> > >> to different queues (filtering which should be supported by flow
> > >> director as far as I understand). I enclosed the setup code at the bottom of
> this email.
> > >> For reference, here is the setup code I use
> > >>
> > >>       vlan_spec.tci = vlan_be;
> > >>       vlan_spec.tpid = 0;
> > >>
> > >>       vlan_mask.tci = rte_cpu_to_be_16(0x0fff);
> > >>       vlan_mask.tpid =  0;
> >
> > >To my opinion, this setting is not right. As we know, vlan tag is inserted
> between MAC source address and Ether type.
> > >So if we have a MAC+VLAN+IPv4 packet, the vlan_spec.tpid should be 0x8100,
> the eth_spec.type should be 0x0800.
> > >+ Adrien, the author. He can correct me if I'm wrong.
> 
> That's right, however the confusion is understandable, perhaps the
> documentation should be clearer. It currently states what follows without
> describing the reason:
> 
>  /**
>   * RTE_FLOW_ITEM_TYPE_VLAN
>   *
>   * Matches an 802.1Q/ad VLAN tag.
>   *
>   * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
>   * RTE_FLOW_ITEM_TYPE_VLAN.
>   */
> 
> > Ok, I apologize, you're right. Being more used to the software-side than to the
> hardware-side, I misunderstood struct rte_flow_item_vlan and though it was the
> "equivalent" of struct vlan_hdr, in which case the vlan_hdr contains the type of
> the encapsulated  frame.
> >
> > (  /**
> >  * Ethernet VLAN Header.
> >  * Contains the 16-bit VLAN Tag Control Identifier and the Ethernet
> >type
> >  * of the encapsulated frame.
> >  */
> > struct vlan_hdr {
> >       uint16_t vlan_tci; /**< Priority (3) + CFI (1) + Identifier Code
> >(12) */
> >       uint16_t eth_proto;/**< Ethernet type of encapsulated frame. */
> >} __attribute__((__packed__));        )
> 
> Indeed, struct vlan_hdr and struct rte_flow_item_vlan are not mapped at the
> same offset; the former includes EtherType of the inner packet (eth_proto),
> while the latter describes the inserted VLAN header itself starting with TPID.
> 
> This approach was chosen for rte_flow for consistency with the fact each
> pattern item describes exactly one protocol header, even though in the case of
> VLAN and other layer 2.5 protocols, some happen to be embedded.
> IPv4/IPv6 options will be provided as separate items in a similar fashion.
> 
> It also allows adding/removing VLAN tags to an existing rule without modifying
> the EtherType of the inner frame.
> 
> Now assuming you're not the only one facing that issue, if the current definition
> does not make sense, perhaps we can update the API before it's too late. I'll
> attempt to summarize it with an example below.
> 
> In any case, matching nonspecific VLAN-tagged and QinQ UDPv4 packets in
> testpmd is written as:
> 
>  flow create 0 pattern eth / vlan / ipv4 / udp / end actions queue 1 / end
>  flow create 0 pattern eth / vlan / vlan / ipv4 / udp / end actions queue 1 / end
> 
> However, with the current API described above, specifying inner/outer
> EtherTypes for the above packets yields (as a reminder, 0x8100 stands for VLAN,
> 0x8000 for IPv4 and 0x88A8 for QinQ):
> 
> #1
> 
>  flow create 0 pattern eth type is 0x8000 / vlan tpid is 0x8100 / ipv4 / udp /
> actions queue 1 / end
>  flow create 0 pattern eth type is 0x8000 / vlan tpid is 0x88A8 / vlan tpid is
> 0x8100 / ipv4 / udp / actions queue 1 / end
> 
> Instead of the arguably more accurate (renaming "tpid" to "inner_type" for
> clarity):
> 
> #2
> 
>  flow create 0 pattern eth type is 0x8100 / vlan type is 0x8000 / ipv4 / udp /
> actions queue 1 / end
>  flow create 0 pattern eth type is 0x88A8 / vlan inner_type is 0x8100 / vlan
> inner_type is 0x8000 / ipv4 / udp / actions queue 1 / end
> 
> So, should the VLAN item be updated to behave as described in #2?
> 
> Note: doing so will cause a serious API/ABI breakage, I know it was not
> supposed to happen according to the rte_flow sales pitch, but hey.
> 
> --
> Adrien Mazarguil
> 6WIND
> 


More information about the users mailing list