[dpdk-dev] [RFC] tunnel endpoint hw acceleration enablement

Doherty, Declan declan.doherty at intel.com
Mon Feb 26 18:44:01 CET 2018


On 13/02/2018 5:05 PM, Adrien Mazarguil wrote:
> Hi,
> 
> Apologies for being late to this thread, I've read the ensuing discussion
> (hope I didn't miss any) and also think rte_flow could be improved in
> several ways to enable TEP support, in particular regarding the ordering of
> actions.
> 
> On the other hand I'm not sure a dedicated API for TEP is needed at all. I'm
> not convinced rte_security chose the right path and would like to avoid
> repeating the same mistakes if possible, more below.
> 
> On Thu, Dec 21, 2017 at 10:21:13PM +0000, Doherty, Declan wrote:
>> This RFC contains a proposal to add a new tunnel endpoint API to DPDK that when used
>> in conjunction with rte_flow enables the configuration of inline data path encapsulation
>> and decapsulation of tunnel endpoint network overlays on accelerated IO devices.
>>
>> The proposed new API would provide for the creation, destruction, and
>> monitoring of a tunnel endpoint in supporting hw, as well as capabilities APIs to allow the
>> acceleration features to be discovered by applications.
>>
>> /** Tunnel Endpoint context, opaque structure */
>> struct rte_tep;
>>
>> enum rte_tep_type {
>>                 RTE_TEP_TYPE_VXLAN = 1, /**< VXLAN Protocol */
>>                 RTE_TEP_TYPE_NVGRE,     /**< NVGRE Protocol */
>>                 ...
>> };
>>
>> /** Tunnel Endpoint Attributes */
>> struct rte_tep_attr {
>>                 enum rte_tep_type type;
>>
>>                 /* other endpoint attributes here */
>> };
>>
>> /**
>> * Create a tunnel end-point context as specified by the TEP attributes and flow item pattern
>> *
>> * @param   port_id     Port identifier of Ethernet device.
>> * @param   attr        Tunnel endpoint attributes.
>> * @param   pattern     Pattern specification by list of rte_flow_items.
>> * @return
>> *  - On success returns pointer to TEP context
>> *  - On failure returns NULL
>> */
>> struct rte_tep *rte_tep_create(uint16_t port_id,
>>                                struct rte_tep_attr *attr, struct rte_flow_item pattern[])
>>
>> /**
>> * Destroy an existing tunnel end-point context. All of the end-point's context
>> * will be destroyed, so all active flows using the tep should be freed before
>> * destroying the context.
>> * @param   port_id    Port identifier of Ethernet device.
>> * @param   tep        Tunnel endpoint context
>> * @return
>> *  - On success returns 0
>> *  - On failure returns 1
>> */
>> int rte_tep_destroy(uint16_t port_id, struct rte_tep *tep)
>>
>> /**
>> * Get tunnel endpoint statistics
>> *
>> * @param   port_id    Port identifier of Ethernet device.
>> * @param   tep        Tunnel endpoint context
>> * @param   stats      Tunnel endpoint statistics
>> *
>> * @return
>> *  - On success returns 0
>> *  - On failure returns 1
>> */
>> int
>> rte_tep_stats_get(uint16_t port_id, struct rte_tep *tep,
>>                                struct rte_tep_stats *stats)
>>
>> /**
>> * Get ports tunnel endpoint capabilities
>> *
>> * @param   port_id    Port identifier of Ethernet device.
>> * @param   capabilities        Tunnel endpoint capabilities
>> *
>> * @return
>> *  - On success returns 0
>> *  - On failure returns 1
>> */
>> int
>> rte_tep_capabilities_get(uint16_t port_id,
>>                                struct rte_tep_capabilities *capabilities)
>>
>>
>> To direct traffic flows to a hw terminated tunnel endpoint the rte_flow API is
>> enhanced to add a new flow item type. This contains a pointer to the
>> TEP context as well as the overlay flow id to which the traffic flow is
>> associated.
>>
>> struct rte_flow_item_tep {
>>                 struct rte_tep *tep;
>>                 uint32_t flow_id;
>> };
> 
> What I dislike is rte_flow item/actions relying on externally-generated
> opaque objects when these can be avoided, as it means yet another API
> applications have to deal with and PMDs need to implement; this adds a layer
> of inefficiency in my opinion.
> 
> I believe TEP can be fully implemented through a combination of new rte_flow
> pattern items/actions without involving external API calls. More on that
> later.
> 
>> Also 2 new generic action types are added: encapsulation and decapsulation.
>>
>> RTE_FLOW_ACTION_TYPE_ENCAP
>> RTE_FLOW_ACTION_TYPE_DECAP
>>
>> struct rte_flow_action_encap {
>>                 struct rte_flow_item *item;
>> };
>>
>> struct rte_flow_action_decap {
>>                 struct rte_flow_item *item;
>> };
> 
> Encap/decap actions are definitely needed and useful, no question about
> that. I'm unsure about doing so through a generic action with the described
> structures instead of dedicated ones though.
> 
> These can't work with anything other than rte_flow_item_tep; a special
> pattern item using some kind of opaque object is needed (e.g. using
> rte_flow_item_tcp makes no sense with them).
> 
> Also struct rte_flow_item is tailored for flow rule patterns, using it with
> actions is not only confusing, it makes its "mask" and "last" members
> useless and inconsistent with their documentation.
> 
> Although I'm not convinced an opaque object is the right approach, if we
> choose this route I suggest the much simpler:
> 
>   struct rte_flow_action_tep_(encap|decap) {
>       struct rte_tep *tep;
>       uint32_t flow_id;
>   };
> 

That's a fair point. The only other item that we currently had the 
encap/decap actions supporting was the Ethernet item, and going back to 
a comment from Boris, having the Ethernet header separate from the 
tunnel is probably not ideal anyway. One of our reasons for using an 
opaque tep item was to allow modification of the TEP independently of 
all the flows being carried on it. So for instance if the src or dst 
MAC needs to be modified, or the output port needs to be changed, the 
TEP itself could be modified.

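For illustration, a rough sketch of the kind of control flow I have in 
mind is below; the rte_tep_update() call is purely hypothetical and not 
part of the RFC as posted, it is only there to show the intent of 
keeping the TEP as a single modifiable object:

  /* TEP created once with the outer headers defined in 'pattern' */
  struct rte_tep *tep = rte_tep_create(port_id, &attrs, pattern);

  /* ... many rte_flow rules created which reference 'tep' ... */

  /* Hypothetical update: the next-hop MAC of the underlay changes,
   * only the TEP object is touched, none of the flows carried on it. */
  struct rte_flow_item_eth new_outer_eth = {
          .dst = new_nexthop_mac,
  };

  struct rte_flow_item new_outer[] = {
          { .type = RTE_FLOW_ITEM_TYPE_ETH, .spec = &new_outer_eth },
          { .type = RTE_FLOW_ITEM_TYPE_END }
  };

  /* hypothetical API, not in the RFC */
  int ret = rte_tep_update(port_id, tep, new_outer);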

>> The following section outlines the intended usage of the new APIs and then how
>> they are combined with the existing rte_flow APIs.
>>
>> Tunnel endpoints are created on logical ports which support the capability
>> using rte_tep_create() with a combination of TEP attributes and
>> rte_flow_items. In the example below a new IPv4 VxLAN endpoint is being defined.
>> The attrs parameter sets the TEP type, and could be used for other possible
>> attributes.
>>
>> struct rte_tep_attr attrs = { .type = RTE_TEP_TYPE_VXLAN };
>>
>> The values for the headers which make up the tunnel endpoint are then
>> defined using spec parameter in the rte flow items (IPv4, UDP and
>> VxLAN in this case)
>>
>> struct rte_flow_item_ipv4 ipv4_item = {
>>                 .hdr = { .src_addr = saddr, .dst_addr = daddr }
>> };
>>
>> struct rte_flow_item_udp udp_item = {
>>                 .hdr = { .src_port = sport, .dst_port = dport }
>> };
>>
>> struct rte_flow_item_vxlan vxlan_item = { .flags = vxlan_flags };
>>
>> struct rte_flow_item pattern[] = {
>>                 { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ipv4_item },
>>                 { .type = RTE_FLOW_ITEM_TYPE_UDP, .spec = &udp_item },
>>                 { .type = RTE_FLOW_ITEM_TYPE_VXLAN, .spec = &vxlan_item },
>>                 { .type = RTE_FLOW_ITEM_TYPE_END }
>> };
>>
>> The tunnel endpoint can then be created on the port. Whether or not any hw
>> configuration is required at this point would be hw dependent, but if not
>> the context for the TEP is available for use in programming flows, so the
>> application is not forced to redefine the TEP parameters on each flow
>> addition.
>>
>> struct rte_tep *tep = rte_tep_create(port_id, &attrs, pattern);
>>
>> Once the tep context is created flows can then be directed to that endpoint for
>> processing. The following sections will outline how the author envisages flow
>> programming will work and also how TEP acceleration can be combined with other
>> accelerations.
> 
> In order to allow a single TEP context object to be shared by multiple flow
> rules, a whole new API must be implemented and applications still have to
> additionally create one rte_flow rule per TEP flow_id to manage. While this
> probably results in shorter flow rule patterns and action lists, is it
> really worth it?
> 
> While I understand the reasons for this approach, I'd like to push for a
> rte_flow-only API as much as possible, I'll provide suggestions below.
> 

Not only are the rules shorter to implement, it could also greatly 
reduce the number of cycles required to add flows, both in terms of the 
application marshalling the data into rte_flow patterns and the PMD 
parsing those patterns every time a flow is added. In the case where 
10k's of flows are getting added per second this could add a 
significant overhead on the system.

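To make the per-flow cost concrete, a sketch of what an individual flow 
add could look like once the TEP exists, reusing the structures from 
the RFC (the inner_* items are assumed to be defined by the 
application):

  /* Each additional flow only marshals the inner match plus a
   * two-field TEP item, instead of repeating the full outer
   * ETH/IPv4/UDP/VXLAN definition on every rte_flow_create(). */
  struct rte_flow_item_tep tep_item = { .tep = tep, .flow_id = vni };

  struct rte_flow_item per_flow_pattern[] = {
          { .type = RTE_FLOW_ITEM_TYPE_TEP, .spec = &tep_item },
          { .type = RTE_FLOW_ITEM_TYPE_ETH, .spec = &inner_eth_item },
          { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &inner_ipv4_item },
          { .type = RTE_FLOW_ITEM_TYPE_END }
  };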

>> Ingress TEP decapsulation, mark and forward to queue:
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> The flow definition for TEP decapsulation actions should specify the full
>> outer packet to be matched at a minimum. The outer packet definition should
>> match the tunnel definition in the tep context and the tep flow id. This
>> example describes matching on the outer, marking the packet with the
>> VXLAN VNI and directing to a specified queue of the port.
>>
>> Source Packet
>>
>>         Decapsulate Outer Hdr
>>       /                       \                                    decap outer crc
>>      /                         \                                    /          \
>>      +-----+------+-----+-------+-----+------+-----+---------+-----+-----------+
>>      | ETH | IPv4 | UDP | VxLAN | ETH | IPv4 | TCP | PAYLOAD | CRC | OUTER CRC |
>>      +-----+------+-----+-------+-----+------+-----+---------+-----+-----------+
>>
>> /* Flow Attributes/Items Definitions */
>>
>> struct rte_flow_attr attr = { .ingress = 1 };
>>
>> struct rte_flow_item_eth eth_item = { .src = s_addr, .dst = d_addr, .type = ether_type };
>> struct rte_flow_item_tep tep_item = { .tep = tep, .flow_id = vni };
>>
>> struct rte_flow_item pattern[] = {
>>                 { .type = RTE_FLOW_ITEM_TYPE_ETH, .spec = &eth_item },
>>                 { .type = RTE_FLOW_ITEM_TYPE_TEP, .spec = &tep_item  },
>>                 { .type = RTE_FLOW_ITEM_TYPE_END }
>> };
>>
>> /* Flow Actions Definitions */
>>
>> struct rte_flow_action_decap decap_eth = {
>>                 .type = RTE_FLOW_ITEM_TYPE_ETH,
>>                 .item = { .src = s_addr, .dst = d_addr, .type = ether_type }
>> };
>>
>> struct rte_flow_action_decap decap_tep = {
>>                 .type = RTE_FLOW_ITEM_TYPE_TEP,
>>                 .item = &tep_item
>> };
>>
>> struct rte_flow_action_queue queue_action = { .index = qid };
>>
>> struct rte_flow_action_mark mark_action = { .id = vni };
>>
>> struct rte_flow_action actions[] = {
>>                 { .type = RTE_FLOW_ACTION_TYPE_DECAP, .conf = &decap_eth },
>>                 { .type = RTE_FLOW_ACTION_TYPE_DECAP, .conf = &decap_tep },
>>                 { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark_action },
>>                 { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue_action },
>>                 { .type = RTE_FLOW_ACTION_TYPE_END }
>> };
> 
> Assuming there is no dedicated TEP API, how about something like the
> following pseudo-code for a VXLAN-based TEP instead:
> 
>   attr = ingress;
>   pattern = eth / ipv6 / udp / vxlan vni is 42 / end;
>   actions = vxlan_decap / mark id 92 / queue index 8 / end;
>   
>   flow = rte_flow_create(port_id, &attr, pattern, actions, &err);
>   ...
> 
> The VXLAN_DECAP action and its parameters (if any) remain to be defined,
> however VXLAN implies all layers up to and including the first VXLAN header
> encountered. Also, if supported/accepted by a PMD:

I think the idea of parsing up to the VxLAN header makes sense; it 
would also make sense if we go with the opaque TEP object as well.

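With the opaque object that could collapse to a single decap action per 
rule, along the lines of the simpler structure you suggested above 
(sketch only, reusing your rte_flow_action_tep_decap naming and the 
mark/queue actions from my earlier example):

  /* One DECAP action referencing the TEP strips everything up to and
   * including the VXLAN header, so the separate outer Ethernet decap
   * action from my original examples is no longer needed. */
  struct rte_flow_action_tep_decap decap_tep = { .tep = tep, .flow_id = vni };

  struct rte_flow_action actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_DECAP, .conf = &decap_tep },
          { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark_action },
          { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue_action },
          { .type = RTE_FLOW_ACTION_TYPE_END }
  };
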
> 
>   attr = ingress;
>   pattern = eth / any / udp / vxlan vni is 42 / end;
>   actions = vxlan_decap / mark id 92 / queue index 8 / end;
> 
> => Both outer IPv4 and IPv6 traffic taken into account at once.
> 
>   attr = ingress;
>   pattern = end;
>   actions = vxlan_decap / mark id 92 / queue index 8 / end;
> 
> => All recognized VXLAN traffic regardless of VNI is acted upon. The rest
>     simply passes through.
> 
>> /** VERY IMPORTANT NOTE **/
>> One of the core concepts of this proposal is that actions which modify the
>> packet are defined in the order which they are to be processed. So first decap
>> outer ethernet header, then the outer TEP headers.
>> I think this is not only logical from a usability point of view, it should also
>> simplify the logic required in PMDs to parse the desired actions.
> 
> This. I've been thinking about it for a very long time but never got around
> to submitting a patch. Handling rte_flow actions in order, allowing repeated
> identical actions and therefore getting rid of DUP.
> The current approach was a bad design decision on my part, I'm convinced
> it must be redefined before combinations become commonplace (right now no
> PMD implements any action whose order matters as far as I know).
> 

I don't think it was an issue with the original implementation, as I 
don't think it really becomes an issue until we start working with 
packet modifications. To that note, I think that we only need to limit 
action ordering to actions which modify the packet itself. Actions like 
counting, marking, and selecting output, be it port/pf/vf/queue/rss, 
are all independent of the actions which modify the packet.

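In other words, something like the following would remain well defined 
no matter where the non-modifying actions are placed (a sketch reusing 
the decap structures from the example above):

  /* Only the two DECAP actions are order sensitive (outer Ethernet
   * first, then the TEP headers); COUNT, MARK and QUEUE do not modify
   * the packet, so their position in the list carries no meaning. */
  struct rte_flow_action actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_COUNT },
          { .type = RTE_FLOW_ACTION_TYPE_DECAP, .conf = &decap_eth },
          { .type = RTE_FLOW_ACTION_TYPE_DECAP, .conf = &decap_tep },
          { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark_action },
          { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue_action },
          { .type = RTE_FLOW_ACTION_TYPE_END }
  };
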
>> struct rte_flow *flow =
>>                                rte_flow_create(port_id, &attr, pattern, actions, &err);
>>
>> The processed packets are delivered to the specified queue with mbuf metadata
>> denoting the marked flow id and with the mbuf ol_flags PKT_RX_TEP_OFFLOAD set.
>>
>>      +-----+------+-----+---------+-----+
>>      | ETH | IPv4 | TCP | PAYLOAD | CRC |
>>      +-----+------+-----+---------+-----+
> 
> Yes, except for the CRC part which would be optional depending on PMD/HW
> capabilities. Not a big deal.

Sure.

>> Ingress TEP decapsulation switch to port:
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This is intended to represent how a TEP decapsulation could be configured
>> in a switching offload case; it makes an assumption that there is a logical
>> port representation for all ports on the hw switch in the DPDK application,
>> but similar functionality could be achieved by specifying something like a
>> VF ID of the device.
>>
>> Like the previous scenario, the flow definition for TEP decapsulation actions
>> should specify the full outer packet to be matched at a minimum, but also
>> define the elements of the inner match to match against, including masks if
>> required.
>>
>> struct rte_flow_attr attr = { .ingress = 1 };
>>
>> struct rte_flow_item pattern[] = {
>>                 { .type = RTE_FLOW_ITEM_TYPE_ETH, .spec = &outer_eth_item },
>>                 { .type = RTE_FLOW_ITEM_TYPE_TEP, .spec = &outer_tep_item, .mask = &tep_mask },
>>                 { .type = RTE_FLOW_ITEM_TYPE_ETH, .spec = &inner_eth_item, .mask = &eth_mask },
>>                 { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &inner_ipv4_item, .mask = &ipv4_mask },
>>                 { .type = RTE_FLOW_ITEM_TYPE_TCP, .spec = &inner_tcp_item, .mask = &tcp_mask },
>>                 { .type = RTE_FLOW_ITEM_TYPE_END }
>> };
>>
>> /* Flow Actions Definitions */
>>
>> struct rte_flow_action_decap decap_eth = {
>>                 .type = RTE_FLOW_ITEM_TYPE_ETH,
>>                 .item = { .src = s_addr, .dst = d_addr, .type = ether_type }
>> };
>>
>> struct rte_flow_action_decap decap_tep = {
>>                 .type = RTE_FLOW_ITEM_TYPE_TEP,
>>                 .item = &outer_tep_item
>> };
>>
>> struct rte_flow_action_port port_action = { .index = port_id };
>>
>> struct rte_flow_action actions[] = {
>>                 { .type = RTE_FLOW_ACTION_TYPE_DECAP, .conf = &decap_eth },
>>                 { .type = RTE_FLOW_ACTION_TYPE_DECAP, .conf = &decap_tep },
>>                 { .type = RTE_FLOW_ACTION_TYPE_PORT, .conf = &port_action },
>>                 { .type = RTE_FLOW_ACTION_TYPE_END }
>> };
>>
>> struct rte_flow *flow = rte_flow_create(port_id, &attr, pattern, actions, &err);
>>
>> This action will forward the decapsulated packets to another port of the switch
>> fabric, but no information on the tunnel or the fact that the packet was
>> decapsulated will be passed with it, thereby enabling segregation of the
>> infrastructure and
> 
> Again a suggestion without a dedicated TEP API, matching outer and some
> inner as well:
> 
>   attr = ingress;
>   pattern = eth / ipv6 / udp / vxlan vni is 42 / eth / ipv4 / tcp / end;
>   actions = vxlan_decap / port index 3 / end;
>   /* or */
>   actions = vxlan_decap / vf id 5 / end;
> 
> The PORT action should be defined as well as the converse of the existing
> PORT pattern item (matching an arbitrary physical port). Specifying a PORT
> action would steer traffic to a nondefault physical port.
> 
> The VF action is already correctly defined.
> 
>> Egress TEP encapsulation:
>> ~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> Encapsulation TEP actions require the flow definitions for the source packet
>> and then the actions to do on that; this example shows an ipv4/tcp packet
>> action.
>>
>> Source Packet
>>
>>      +-----+------+-----+---------+-----+
>>      | ETH | IPv4 | TCP | PAYLOAD | CRC |
>>      +-----+------+-----+---------+-----+
>>
>> struct rte_flow_attr attr = { .egress = 1 };
>>
>> struct rte_flow_item_eth eth_item = { .src = s_addr, .dst = d_addr, .type = ether_type };
>> struct rte_flow_item_ipv4 ipv4_item = { .hdr = { .src_addr = src_addr, .dst_addr = dst_addr } };
>> struct rte_flow_item_tcp tcp_item = { .hdr = { .src_port = src_port, .dst_port = dst_port } };
>>
>> struct rte_flow_item pattern[] = {
>>                 { .type = RTE_FLOW_ITEM_TYPE_ETH, .spec = &eth_item },
>>                 { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ipv4_item },
>>                 { .type = RTE_FLOW_ITEM_TYPE_TCP, .spec = &tcp_item },
>>                 { .type = RTE_FLOW_ITEM_TYPE_END }
>> };
>>
>> /* Flow Actions Definitions */
>>
>> struct rte_flow_action_encap encap_eth = {
>>                 .type = RTE_FLOW_ITEM_TYPE_ETH,
>>                 .item = { .src = s_addr, .dst = d_addr, .type = ether_type }
>> };
>>
>> struct rte_flow_action_encap encap_tep = {
>>                 .type = RTE_FLOW_ITEM_TYPE_TEP,
>>                 .item = { .tep = tep, .flow_id = vni }
>> };
>> struct rte_flow_action_port port_action = { .index = port_id };
>>
>> struct rte_flow_action actions[] = {
>>                 { .type = RTE_FLOW_ACTION_TYPE_ENCAP, .conf = &encap_tep },
>>                 { .type = RTE_FLOW_ACTION_TYPE_ENCAP, .conf = &encap_eth },
>>                 { .type = RTE_FLOW_ACTION_TYPE_PORT, .conf = &port_action },
>>                 { .type = RTE_FLOW_ACTION_TYPE_END }
>> };
>> struct rte_flow *flow = rte_flow_create(port_id, &attr, pattern, actions, &err);
>>
>>
>>        encapsulating Outer Hdr
>>       /                       \                                      outer crc
>>      /                         \                                   /          \
>>      +-----+------+-----+-------+-----+------+-----+---------+-----+-----------+
>>      | ETH | IPv4 | UDP | VxLAN | ETH | IPv4 | TCP | PAYLOAD | CRC | OUTER CRC |
>>      +-----+------+-----+-------+-----+------+-----+---------+-----+-----------+
> 
> I see three main use cases for egress since we do not want a PMD to parse
> traffic in software to determine if it's a candidate for TEP encapsulation:
> 
> 1. Traffic generated/forwarded by an application.
> 2. Same as 1. assuming an application is aware hardware can match egress
>     traffic in addition to encapsulating it.
> 3. Traffic fully processed internally in hardware.
> 
> To handle 1., in my opinion the most common use case, PMDs should rely on an
> application-provided mark pattern item (the converse of the MARK action):
> 
>   attr = egress;
>   pattern = mark is 42 / end;
>   actions = vxlan_encap {many parameters} / end;
> 
> To handle 2, hardware with the ability to recognize and encapsulate outgoing
> traffic is required (applications can rely on rte_flow_validate()):
> 
>   attr = egress;
>   pattern = eth / ipv4 / tcp / end;
>   actions = vxlan_encap {many parameters} / end;
> 
> For 3, a combination of ingress and egress can be used as needed on a given
> rule. For clarity, one should assert where traffic comes from and where it's
> supposed to go:
> 
>   attr = ingress egress;
>   pattern = eth / ipv4 / tcp / port id 0 / end;
>   actions = vxlan_encap {many parameters} / vf id 5 / end;
> 
> The {many parameters} for VXLAN_ENCAP obviously remain to be defined,
> they have to either include everything needed to construct L2, L3, L4 and
> VXLAN headers, or separate actions for each layer specified in
> innermost-to-outermost order.
> 
> No need for dedicated mbuf TEP flags.

These all make sense to me, if we really want to avoid the TEP API. 
Just a point on 3: if using port representors, then the ingress port 
can be implied by the port on which the tunnel rule is created.

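For your case 3, a rough sketch of what that might look like is below, 
reusing the match items and encap actions from my egress example above 
(the dedicated vxlan_encap action you suggest would slot in the same 
way); the wire_port_representor_id name is hypothetical and stands for 
the DPDK port id of the representor of the physical port:

  /* No explicit port match item is needed in the pattern, the wire
   * port is implied by the port the rule is created on. */
  struct rte_flow_attr attr = { .ingress = 1, .egress = 1 };

  struct rte_flow_item pattern[] = {
          { .type = RTE_FLOW_ITEM_TYPE_ETH, .spec = &eth_item },
          { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ipv4_item },
          { .type = RTE_FLOW_ITEM_TYPE_TCP, .spec = &tcp_item },
          { .type = RTE_FLOW_ITEM_TYPE_END }
  };

  /* forward the encapsulated traffic to VF 5, as in your example */
  struct rte_flow_action_vf vf_action = { .id = 5 };

  struct rte_flow_action actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_ENCAP, .conf = &encap_tep },
          { .type = RTE_FLOW_ACTION_TYPE_ENCAP, .conf = &encap_eth },
          { .type = RTE_FLOW_ACTION_TYPE_VF, .conf = &vf_action },
          { .type = RTE_FLOW_ACTION_TYPE_END }
  };

  struct rte_flow *flow = rte_flow_create(wire_port_representor_id,
                                          &attr, pattern, actions, &err);
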
> 
>> Chaining multiple modification actions eg IPsec and TEP
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> For example the definition for full hw acceleration for an IPsec ESP/Transport
>> SA encapsulated in a vxlan tunnel would look something like:
>>
>> struct rte_flow_action actions[] = {
>>                 { .type = RTE_FLOW_ACTION_TYPE_ENCAP, .conf = &encap_tep },
>>                 { .type = RTE_FLOW_ACTION_TYPE_SECURITY, .conf = &sec_session },
>>                 { .type = RTE_FLOW_ACTION_TYPE_ENCAP, .conf = &encap_eth },
>>                 { .type = RTE_FLOW_ACTION_TYPE_END }
>> };
>>
>> 1. Source Packet
>>                             +-----+------+-----+---------+-----+
>>                             | ETH | IPv4 | TCP | PAYLOAD | CRC |
>>                             +-----+------+-----+---------+-----+
>>
>> 2. First Action - Tunnel Endpoint Encapsulation
>>
>>        +------+-----+-------+-----+------+-----+---------+-----+
>>        | IPv4 | UDP | VxLAN | ETH | IPv4 | TCP | PAYLOAD | CRC |
>>        +------+-----+-------+-----+------+-----+---------+-----+
>>
>> 3. Second Action - IPsec ESP/Transport Security Processing
>>
>>        +------+-----+-----+-------+-----+------+-----+---------+-----+-------------+
>>        | IPv4 | ESP |              ENCRYPTED PAYLOAD                 | ESP TRAILER |
>>        +------+-----+-----+-------+-----+------+-----+---------+-----+-------------+
>>
>> 4. Third Action - Outer Ethernet Encapsulation
>>
>> +-----+------+-----+-----+-------+-----+------+-----+---------+-----+-------------+-----------+
>> | ETH | IPv4 | ESP |              ENCRYPTED PAYLOAD                 | ESP TRAILER | OUTER CRC |
>> +-----+------+-----+-----+-------+-----+------+-----+---------+-----+-------------+-----------+
>>
>> This example demonstrates the importance of having the interoperation of
>> actions be ordered, as in the above example a security
>> action can be defined on both the inner and outer packet by simply placing
>> another security action at the beginning of the action list.
>>
>> It also demonstrates the rationale for not collapsing the Ethernet into
>> the TEP definition as when you have multiple encapsulating actions, all
>> could potentially be the place where the Ethernet header needs to be
>> defined.
> 
> For completeness, here's a suggested alternative with neither dedicated TEP
> nor security APIs:
> 
>   attr = egress;
>   pattern = mark is 42 / end;
>   actions = vxlan_encap {many parameters} / esp_encap {many parameters} / eth_encap {many parameters} / end;
> 
> Note ESP_ENCAP is not so easy given some data must be provided by the
> application with each transmitted packet. The current security API does not
> provide means to perform ESP encapsulation, it instead focuses on encryption
> and relies on the application to prepare headers and allocate room for the
> trailer. It's an unrealistic use case at the moment but shows the potential
> of such an API.
> 
Full IPsec offload is currently being enabled, and the security API was 
always developed with the intention of allowing full encap/decap 
offload.

> - First question is what's your opinion regarding focusing on rte_flow
>    instead of a TEP API? (Note for counters: one could add COUNT actions as
>    well, what's currently missing is a way to share counters among several
>    flow rules, which is planned as well)
>
Technically I see no issue with both approaches being workable, but I 
think the flow based approach has issues in terms of usability and 
performance. In my mind, thinking of a TEP as a logical object which 
flows get mapped into maps very closely to how they are used 
functionally in network deployments, and is the way I've seen them 
supported in every TOR switch API/CLI I've ever used. I also think it 
should enable a more performant control path when you don't need to 
specify all the TEP parameters for every flow; this is not an 
inconsiderable overhead. In saying all that, I do see the value in the 
cleanness at an API level of using purely rte_flow, although I do 
wonder whether that will just end up moving the complexity into the 
application domain.

> - Regarding dedicated encap/decap actions instead of generic ones, given all
>    protocols have different requirements (e.g. ESP encap is on a whole
>    different level of complexity and likely needs callbacks)?
> 
Agreed on the need for dedicated encap/decap TEP actions.

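For the VXLAN case, and purely for the sake of discussion, something 
along the lines of the sketch below; the struct and field names are not 
a proposal, and IPv6/option handling is ignored for brevity:

  /* One possible shape for a dedicated VXLAN encap action conf,
   * carrying the outer headers as explicit fields rather than as a
   * generic rte_flow_item list. */
  struct rte_flow_action_vxlan_encap_sketch {
          struct ether_addr outer_dst_mac;
          struct ether_addr outer_src_mac;
          uint32_t outer_dst_ip;       /* IPv4 only for brevity */
          uint32_t outer_src_ip;
          uint16_t outer_udp_src_port;
          uint16_t outer_udp_dst_port; /* normally 4789 */
          uint32_t vni;                /* 24-bit VNI */
  };
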
> - Regarding the reliance on a MARK meta pattern item as a standard means for
>    applications to tag egress traffic so a PMD knows what to do?

I do like that as an approach, but how would it work for combined 
actions, e.g. TEP + IPsec SA?

> 
> - I'd like to send a deprecation notice for rte_flow regarding handling of
>    actions (documentation and change in some PMDs to reject currently valid
>    but seldom used flow rules accordingly) instead of a new flow
>    attribute. Would you ack such a change for 18.05?
> 

Apologies, I completely missed the ack for 18.05 part of the question 
when I first read this mail; the answer would have been yes. I was out 
of office due to illness for part of that week, which was part of the 
reason for the delay in responding to this mail. But if we only 
restrict the action ordering requirement to chained modification 
actions, do we still need the deprecation notice? It won't break any 
existing implementations since, as you note, there isn't anyone 
supporting that yet.


