[RFC] ethdev: add generic L2/L3 tunnel encapsulation actions

Message ID 1532967565-13962-1-git-send-email-orika@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

Ori Kam July 30, 2018, 4:19 p.m. UTC
Currently the encap/decap actions only support encapsulation
of VXLAN and NVGRE L2 packets.
There is a need to add more L2 tunnels and also L3 tunnels.

One issue with the current approach is the duplication of code.
For example, the code for handling NVGRE and VXLAN is exactly the same,
and each new tunnel will have the same exact structure.

Another issue with the current approach is the use of rte_flow_item.
The most significant problem is that the PMD needs to convert
the items, and this hurts the insertion rate. A further problem is that
each rte_flow_item carries three pointers (spec, last and mask) while we
only need the spec (last and mask are useless). I know that the extra
members have only a small memory impact, but considering that we can have
millions of rules, this becomes a more important consideration, and it is
bad practice to add a variable that is never used.
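
To put a rough number on the memory point, here is a minimal sketch.
Both structs are local stand-ins written for this illustration (item_like
copies the layout of struct rte_flow_item, raw_encap_like copies the
buffer-plus-length layout proposed below); the helper names are
hypothetical and nothing here is DPDK API. Actual sizes depend on the ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_flow_item: a type tag plus three pointers. */
struct item_like {
	int type;
	const void *spec;
	const void *last; /* unused when describing an encap header */
	const void *mask; /* unused when describing an encap header */
};

/* Stand-in for the proposed raw-buffer action configuration. */
struct raw_encap_like {
	uint8_t *buf;
	uint16_t size;
};

/*
 * Descriptor bytes needed to describe a tunnel of n_headers headers
 * with items: one item per header plus the END terminator.
 */
static size_t item_desc_bytes(size_t n_headers)
{
	return (n_headers + 1) * sizeof(struct item_like);
}

/* Bytes of the item array spent on the never-read last/mask pointers. */
static size_t unused_ptr_bytes(size_t n_headers)
{
	return (n_headers + 1) * 2 * sizeof(const void *);
}

/* Descriptor bytes for the raw-buffer form, independent of header count. */
static size_t raw_desc_bytes(void)
{
	return sizeof(struct raw_encap_like);
}
```

For ETH / IPV4 / UDP / VXLAN / END, item_desc_bytes(4) covers five items,
while raw_desc_bytes() stays constant no matter how many headers are
stacked. The header bytes themselves are needed in both forms, so only the
descriptor overhead is compared.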

My suggestion is to create 2 actions, one for encapsulation of L2
packets and one for encapsulation of L3 tunnels.
The parameters for those actions will be a uint8_t buffer with
a length parameter.
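
As an illustration of how an application might fill such a buffer, here is
a minimal sketch that writes ETH / IPV4 / UDP / VXLAN headers in wire
order. The struct tunnel_encap_conf mirrors the buf/size layout proposed
in this RFC, and build_vxlan_encap() and put_be16()/put_be32() are
hypothetical helpers for this example only. The IPv4 and UDP length and
checksum fields are left as zero, on the assumption that they are filled
in per packet elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the buf/size layout proposed in this RFC (illustration only). */
struct tunnel_encap_conf {
	uint8_t *buf;  /* raw outer headers, in wire order */
	uint16_t size; /* number of valid bytes in buf */
};

enum { ETH_LEN = 14, IPV4_LEN = 20, UDP_LEN = 8, VXLAN_LEN = 8 };

/* Store a 16-bit value in network byte order. */
static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)v;
}

/* Store a 32-bit value in network byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
	put_be16(p, (uint16_t)(v >> 16));
	put_be16(p + 2, (uint16_t)v);
}

/* Hypothetical helper: returns the number of header bytes written. */
static uint16_t build_vxlan_encap(uint8_t *buf, size_t cap,
				  const uint8_t dst_mac[6],
				  const uint8_t src_mac[6],
				  uint32_t src_ip, uint32_t dst_ip,
				  uint32_t vni)
{
	const size_t total = ETH_LEN + IPV4_LEN + UDP_LEN + VXLAN_LEN;
	uint8_t *p = buf;

	assert(cap >= total);
	memset(buf, 0, total);

	/* Ethernet: destination, source, EtherType IPv4. */
	memcpy(p, dst_mac, 6);
	memcpy(p + 6, src_mac, 6);
	put_be16(p + 12, 0x0800);
	p += ETH_LEN;

	/* IPv4: version 4, IHL 5, TTL 64, protocol UDP. */
	p[0] = 0x45;
	p[8] = 64;
	p[9] = 17;
	put_be32(p + 12, src_ip);
	put_be32(p + 16, dst_ip);
	p += IPV4_LEN;

	/* UDP: destination port 4789, the IANA-assigned VXLAN port. */
	put_be16(p + 2, 4789);
	p += UDP_LEN;

	/* VXLAN (RFC 7348): I flag set, 24-bit VNI in bytes 4..6. */
	p[0] = 0x08;
	p[4] = (uint8_t)(vni >> 16);
	p[5] = (uint8_t)(vni >> 8);
	p[6] = (uint8_t)vni;
	p += VXLAN_LEN;

	return (uint16_t)(p - buf);
}
```

The caller would then set conf.buf to the buffer and conf.size to the
returned length. A different tunnel type only changes how the bytes are
produced, not the shape of the action configuration.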

The current approach is not yet implemented in any driver and
is marked as experimental, so it should be removed.

Any comments will be hugely appreciated.

Signed-off-by: Ori Kam <orika@mellanox.com>
---
 lib/librte_ethdev/rte_flow.h |  111 ++++++++++++++++++------------------------
 1 files changed, 47 insertions(+), 64 deletions(-)
  

Comments

Stephen Hemminger July 30, 2018, 5:28 p.m. UTC | #1

What about binary and source compatibilities with older release?
  
Ori Kam July 30, 2018, 6:02 p.m. UTC | #2
> 
> What about binary and source compatibilities with older release?

I'm not sure what you mean. Currently this feature is not implemented
in any PMD (as far as I can see), so no one uses it, and it is marked as
experimental. In any case, if this is an issue, we can keep the old actions
and just add the new ones.

Best,
Ori
  
Ori Kam Aug. 22, 2018, 5:57 a.m. UTC | #3
Hi all,

Just looking for more comments if any 😊

Best,

Ori

  
Ferruh Yigit Aug. 23, 2018, 12:12 p.m. UTC | #4

Hi Adrien, Awal, Declan,

Any comment on the RFC?

Thanks,
ferruh
  

Patch

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f8ba71c..3549d7d 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -1473,38 +1473,37 @@  enum rte_flow_action_type {
 	RTE_FLOW_ACTION_TYPE_OF_PUSH_MPLS,
 
 	/**
-	 * Encapsulate flow in VXLAN tunnel as defined in
-	 * rte_flow_action_vxlan_encap action structure.
+	 * Encapsulate flow with header tunnel as defined in
+	 * rte_flow_action_tunnel_encap action structure.
 	 *
-	 * See struct rte_flow_action_vxlan_encap.
+	 * See struct rte_flow_action_tunnel_encap.
 	 */
-	RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP,
+	RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP,
 
 	/**
-	 * Decapsulate outer most VXLAN tunnel from matched flow.
+	 * Decapsulate outer most tunnel from matched flow.
 	 *
-	 * If flow pattern does not define a valid VXLAN tunnel (as specified by
-	 * RFC7348) then the PMD should return a RTE_FLOW_ERROR_TYPE_ACTION
-	 * error.
+	 * The flow pattern must have a valid tunnel header.
 	 */
-	RTE_FLOW_ACTION_TYPE_VXLAN_DECAP,
+	RTE_FLOW_ACTION_TYPE_TUNNEL_DECAP,
 
 	/**
-	 * Encapsulate flow in NVGRE tunnel defined in the
-	 * rte_flow_action_nvgre_encap action structure.
+	 * Remove L2 header and encapsulate with header tunnel as defined in
+	 * rte_flow_action_tunnel_encap_l3 action structure.
 	 *
-	 * See struct rte_flow_action_nvgre_encap.
+	 * See struct rte_flow_action_tunnel_encap_l3.
 	 */
-	RTE_FLOW_ACTION_TYPE_NVGRE_ENCAP,
+	RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP_L3,
 
 	/**
-	 * Decapsulate outer most NVGRE tunnel from matched flow.
+	 * Decapsulate outer most tunnel from matched flow,
+	 * and encap the remaining header with the given one.
 	 *
-	 * If flow pattern does not define a valid NVGRE tunnel (as specified by
-	 * RFC7637) then the PMD should return a RTE_FLOW_ERROR_TYPE_ACTION
-	 * error.
+	 * The flow pattern must have a valid tunnel header.
+	 *
+	 * See struct rte_flow_action_tunnel_decap_l3.
 	 */
-	RTE_FLOW_ACTION_TYPE_NVGRE_DECAP,
+	RTE_FLOW_ACTION_TYPE_TUNNEL_DECAP_L3,
 };
 
 /**
@@ -1803,69 +1802,53 @@  struct rte_flow_action_of_push_mpls {
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP
- *
- * VXLAN tunnel end-point encapsulation data definition
- *
- * The tunnel definition is provided through the flow item pattern, the
- * provided pattern must conform to RFC7348 for the tunnel specified. The flow
- * definition must be provided in order from the RTE_FLOW_ITEM_TYPE_ETH
- * definition up the end item which is specified by RTE_FLOW_ITEM_TYPE_END.
- *
- * The mask field allows user to specify which fields in the flow item
- * definitions can be ignored and which have valid data and can be used
- * verbatim.
+ * RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP
  *
- * Note: the last field is not used in the definition of a tunnel and can be
- * ignored.
- *
- * Valid flow definition for RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP include:
- *
- * - ETH / IPV4 / UDP / VXLAN / END
- * - ETH / IPV6 / UDP / VXLAN / END
- * - ETH / VLAN / IPV4 / UDP / VXLAN / END
+ * Tunnel end-point encapsulation data definition
  *
+ * The tunnel definition is provided through raw buffer that holds
+ * the headers that should encapsulate the packet.
+ * The given encapsulation should be a valid packet header.
  */
-struct rte_flow_action_vxlan_encap {
-	/**
-	 * Encapsulating vxlan tunnel definition
-	 * (terminated by the END pattern item).
-	 */
-	struct rte_flow_item *definition;
+struct rte_flow_action_tunnel_encap {
+	uint8_t *buf;
+	uint16_t size;
 };
 
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * RTE_FLOW_ACTION_TYPE_NVGRE_ENCAP
+ * RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP_L3
  *
- * NVGRE tunnel end-point encapsulation data definition
+ * Tunnel end-point encapsulation after removing the L2 header,
+ * including the vlan header if any.
  *
- * The tunnel definition is provided through the flow item pattern  the
- * provided pattern must conform with RFC7637. The flow definition must be
- * provided in order from the RTE_FLOW_ITEM_TYPE_ETH definition up the end item
- * which is specified by RTE_FLOW_ITEM_TYPE_END.
+ * The tunnel definition is provided through raw buffer that holds
+ * the headers that should encapsulate the packet.
  *
- * The mask field allows user to specify which fields in the flow item
- * definitions can be ignored and which have valid data and can be used
- * verbatim.
+ * The given encapsulation should be a valid packet header.
+ */
+struct rte_flow_action_tunnel_encap_l3 {
+	uint8_t *buf;
+	uint16_t size;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
  *
- * Note: the last field is not used in the definition of a tunnel and can be
- * ignored.
+ * RTE_FLOW_ACTION_TYPE_TUNNEL_DECAP_L3
  *
- * Valid flow definition for RTE_FLOW_ACTION_TYPE_NVGRE_ENCAP include:
+ * Decap the outer header and add the L2 header based on the given buffer.
  *
- * - ETH / IPV4 / NVGRE / END
- * - ETH / VLAN / IPV6 / NVGRE / END
+ * The flow pattern must have a valid tunnel packet.
  *
+ * The given encapsulation should be a valid L2 packet header.
  */
-struct rte_flow_action_nvgre_encap {
-	/**
-	 * Encapsulating vxlan tunnel definition
-	 * (terminated by the END pattern item).
-	 */
-	struct rte_flow_item *definition;
+struct rte_flow_action_tunnel_decap_l3 {
+	uint8_t *buf;
+	uint16_t size;
 };
 
 /*