[dpdk-dev] [RFC] Generic flow director/filtering/classification API

Chandran, Sugesh sugesh.chandran at intel.com
Fri Jul 15 11:23:26 CEST 2016


Thank you, Adrien.
Please find some more comments/inputs below.

Let me know your thoughts on this.


Regards
_Sugesh


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> Sent: Wednesday, July 13, 2016 9:03 PM
> To: Chandran, Sugesh <sugesh.chandran at intel.com>
> Cc: dev at dpdk.org; Thomas Monjalon <thomas.monjalon at 6wind.com>;
> Zhang, Helin <helin.zhang at intel.com>; Wu, Jingjing
> <jingjing.wu at intel.com>; Rasesh Mody <rasesh.mody at qlogic.com>; Ajit
> Khaparde <ajit.khaparde at broadcom.com>; Rahul Lakkireddy
> <rahul.lakkireddy at chelsio.com>; Lu, Wenzhuo <wenzhuo.lu at intel.com>;
> Jan Medala <jan at semihalf.com>; John Daley <johndale at cisco.com>; Chen,
> Jing D <jing.d.chen at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; Matej Vido <matejvido at gmail.com>;
> Alejandro Lucero <alejandro.lucero at netronome.com>; Sony Chacko
> <sony.chacko at qlogic.com>; Jerin Jacob
> <jerin.jacob at caviumnetworks.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch at intel.com>; Olga Shern <olgas at mellanox.com>
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> On Mon, Jul 11, 2016 at 10:42:36AM +0000, Chandran, Sugesh wrote:
> > Hi Adrien,
> >
> > Thank you for your response,
> > Please see my comments inline.
> 
> Hi Sugesh,
> 
> Sorry for the delay, please see my answers inline as well.
> 
> [...]
> > > > > Flow director
> > > > > -------------
> > > > >
> > > > > Flow director (FDIR) is the name of the most capable filter
> > > > > type, which covers most features offered by others. As such, it
> > > > > is the most
> > > widespread
> > > > > in PMDs that support filtering (i.e. all of them besides **e1000**).
> > > > >
> > > > > It is also the only type that allows an arbitrary 32-bit value
> > > > > provided by applications to be attached to a filter and returned
> > > > > with matching packets instead of relying on the destination queue to
> recognize flows.
> > > > >
> > > > > Unfortunately, even FDIR requires applications to be aware of
> > > > > low-level capabilities and limitations (most of which come
> > > > > directly from **ixgbe**
> > > and
> > > > > **i40e**):
> > > > >
> > > > > - Bitmasks are set globally per device (port?), not per filter.
> > > > [Sugesh] This means the application cannot define filters that match
> > > > on arbitrary, different offsets? If that's the case, I assume the
> > > > application has to program the bitmask in advance. Otherwise, how does
> > > > the API framework deduce this bitmask information from the rules? It
> > > > is not very clear to me how the application passes down the bitmask
> > > > information for multiple filters on the same port.
> > >
> > > This is my understanding of how flow director currently works,
> > > perhaps someone more familiar with it can answer this question better
> than I could.
> > >
> > > Let me take an example: if a particular device can only handle a
> > > single IPv4 mask common to all flow rules (say only to match
> > > destination addresses), updating that mask to also match the source
> > > address affects all defined and future flow rules simultaneously.
> > >
> > > That is how FDIR currently works and I think it is wrong, as it
> > > penalizes devices that do support individual bit-masks per rule, and
> > > is a little awkward from an application point of view.
> > >
> > > What I suggest for the new API instead is the ability to specify one
> > > bit-mask per rule, and let the PMD deal with HW limitations by
> > > automatically configuring global bitmasks from the first added rule,
> > > then refusing to add subsequent rules if they specify a conflicting
> > > bit-mask. Existing rules remain unaffected that way, and
> > > applications do not have to be extra cautious.
> > >
> > [Sugesh] The issue with that approach is that the hardware simply discards
> > a rule when it is a superset of the first one, even though the hardware
> > is capable of handling it. How is it guaranteed that the first rule will
> > set the bitmask for all the subsequent rules?
> 
> Just to clarify, the API only says that new rules cannot affect existing ones
> (which I think makes sense from a user's perspective), so as long as the PMD
> does whatever is needed to make all rules work together, there should not
> be any problem with this approach.
> 
> Even if the PMD has to temporarily remove an existing rule and reconfigure
> global masks in order to add subsequent rules, it is fine as long as packets
> aren't misdirected in the meantime (they may be dropped if there is no
> other choice).
[Sugesh] I feel this is fine. Thank you for confirming.
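To make sure we are reading this the same way, here is a rough sketch of the
PMD-side behaviour described above (every name below is made up for
illustration and is not part of the RFC):

#include <errno.h>
#include <stdint.h>

/* Illustrative per-device context tracking the global IPv4 mask imposed by
 * FDIR-style hardware (ixgbe/i40e). */
struct fdir_dev_ctx {
	uint32_t ipv4_src_mask;
	uint32_t ipv4_dst_mask;
	unsigned int ipv4_rule_cnt; /* rules relying on the masks above */
};

/* Called for every new rule carrying an IPv4 item. Existing rules are never
 * touched; a conflicting request is simply refused. A matching decrement on
 * rule destruction would make the global mask mutable again. */
static int
fdir_take_ipv4_mask(struct fdir_dev_ctx *ctx,
		    uint32_t src_mask, uint32_t dst_mask)
{
	if (ctx->ipv4_rule_cnt == 0) {
		/* First IPv4 rule: adopt its mask as the global one. */
		ctx->ipv4_src_mask = src_mask;
		ctx->ipv4_dst_mask = dst_mask;
	} else if (ctx->ipv4_src_mask != src_mask ||
		   ctx->ipv4_dst_mask != dst_mask) {
		/* Conflicts with already installed rules: refuse. */
		return -EEXIST;
	}
	ctx->ipv4_rule_cnt++;
	return 0;
}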
> 
> > How about having a CLASSIFIER_TYPE for the classifier? Every port can
> > have a set of supported flow types (e.g. L3_TYPE, L4_TYPE,
> > L4_TYPE_8BYTE_FLEX, L4_TYPE_16BYTE_FLEX) based on the underlying FDIR
> > support. The application can query this and set the type accordingly
> > while initializing the port. This way the first rule need not set all
> > the bits that may be needed in future rules.
> 
> Again from a user's POV, I think doing so would add unwanted HW-specific
> complexity.
> 
> However this concern can be handled through a different approach. Let's say
> the user creates a pattern that only specifies an IP header with a given bit-mask.
> 
> In FDIR language this translates to:
> 
> - Set global mask for IPv4 accordingly, remaining global masks all zeroed
>   (assumed default value).
> 
> - Create an IPv4 flow.
> 
> From now on, all rules specifying an IPv4 header must have this exact bitmask
> (implicitly or explicitly), otherwise they cannot be created, i.e. the global
> bitmask for IPv4 becomes immutable.
> 
> Now the user creates a TCPv4 rule (as long as it uses the same IPv4 mask);
> to handle this, FDIR would:
> 
> - Keep global immutable mask for IPv4 unchanged, set global TCP mask
>   according to the flow rule.
> 
> - Create a TCPv4 flow.
> 
> From this point on, like IPv4, subsequent TCP rules must have this exact
> bitmask and so on as the global bitmask becomes immutable.
> 
> Basically, only protocol bit-masks affected by existing flow rules are
> immutable, others can be changed later. Global flow masks for protocols
> become mutable again when no existing flow rule uses them.
> 
> Does it look fine for you?
[Sugesh] This looks fine to me.
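Restating the example above in code form so there is no ambiguity (the
structures here are simplified placeholders for whatever pattern items the
RFC ends up defining):

#include <stdint.h>

/* Simplified stand-ins for the RFC's pattern items. */
struct ipv4_fields { uint32_t src_addr; uint32_t dst_addr; };
struct tcp_fields { uint16_t src_port; uint16_t dst_port; };

/* Rule 1: IPv4, match the destination address only. On FDIR-style hardware
 * this also fixes the global IPv4 mask, which stays immutable while any
 * rule depends on it. */
static const struct ipv4_fields ip_mask = {
	.src_addr = 0, .dst_addr = UINT32_MAX
};
static const struct ipv4_fields ip_spec1 = { .dst_addr = 0x0a000001 }; /* 10.0.0.1 */

/* Rule 2: TCPv4, reusing exactly the same IPv4 mask (mandatory) plus a TCP
 * mask that was still mutable because no earlier rule used it. */
static const struct ipv4_fields ip_spec2 = { .dst_addr = 0x0a000002 }; /* 10.0.0.2 */
static const struct tcp_fields tcp_mask = { .dst_port = UINT16_MAX };
static const struct tcp_fields tcp_spec = { .dst_port = 80 };

/* A later rule that also wanted to match the IPv4 source address would be
 * refused (e.g. -EEXIST) until rules 1 and 2 are destroyed, because it
 * conflicts with the now-immutable global IPv4 mask. */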
> 
> [...]
> > > > > +--------------------------+
> > > > > | Copy to queue 8          |
> > > > > +==========+===============+
> > > > > | PASSTHRU |               |
> > > > > +----------+-----------+---+
> > > > > | QUEUE    | ``queue`` | 8 |
> > > > > +----------+-----------+---+
> > > > >
> > > > > ``ID``
> > > > > ^^^^^^
> > > > >
> > > > > Attaches a 32 bit value to packets.
> > > > >
> > > > > +----------------------------------------------+
> > > > > | ID                                           |
> > > > > +========+=====================================+
> > > > > | ``id`` | 32 bit value to return with packets |
> > > > > +--------+-------------------------------------+
> > > > >
> > > > [Sugesh] I assume the application has to program the flow with a
> > > > unique ID and matching packets are stamped with this ID when
> > > > reporting to the software. The uniqueness of ID is NOT guaranteed
> > > > by the API framework. Correct me if I am wrong here.
> > >
> > > You are right, if the way I wrote it is not clear enough, I'm open
> > > to suggestions to improve it.
> > [Sugesh] I guess it's fine and I would like to confirm the same. Perhaps
> > it would be nice to mention that the IDs are application-defined.
> 
> OK, I will make it clearer.
> 
> > > > [Sugesh] Is it a limitation to use only a 32-bit ID? Is it possible
> > > > to have a 64-bit ID, so that the application can use the control plane
> > > > flow pointer itself as an ID? Does it make sense?
> > >
> > > I've specified a 32 bit ID for now because this is what FDIR
> > > supports and also what existing devices can report today AFAIK (i40e and
> mlx5).
> > >
> > > We could use 64 bit for future-proofness in a separate action like "ID64"
> > > when at least one device supports it.
> > >
> > > To PMD maintainers: please comment if you know devices that support
> > > tagging matching packets with more than 32 bits of user-provided
> > > data!
> > [Sugesh] I guess the flow director ID is 64 bits; the XL710 datasheet says so.
> > And in the 'rte_mbuf' structure the 64-bit FDIR ID is shared with the RSS
> > hash. This could be a software driver limitation that exposes only 32
> > bits, possibly because of cache alignment issues? Since the hardware
> > can support 64 bits, I feel it makes sense to support 64 bits as well.
> 
> I agree we need 64 bit support, but then we also need a solution for devices
> that support only 32 bit. Possible methods I can think of:
> 
> - A separate "ID64" action (or a "ID32" one, perhaps with a better name).
> 
> - A single ID action with an unlimited number of bytes to return with
>   packets (would actually be a string). PMDs can then refuse to create flow
>   rules requesting an unsupported number of bytes. Devices supporting
> fewer
>   than 32 bits are also included this way without the need for yet another
>   action.
> 
> Thoughts?
[Sugesh] I feel the single ID approach is much better, but I would say a
fixed-size ID is easier to handle at the upper layers. Say the PMD returns a
64-bit ID in which the MSBs are masked out, based on how many bits the
hardware can support. The PMD can refuse an unsupported number of bytes when
requested. So the size of the ID would become a parameter when programming
the flow.
What do you think?
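To illustrate what I mean, something along these lines (a purely hypothetical
structure, not taken from the RFC):

#include <stdint.h>

/* Hypothetical fixed-width ID action: the application always supplies a
 * 64-bit value together with the number of significant bits it needs. A PMD
 * whose hardware reports fewer bits refuses the rule (or masks out the
 * unsupported MSBs), as discussed above. */
struct flow_action_id {
	uint64_t id;   /* value to return with matching packets */
	uint8_t bits;  /* significant bits requested, e.g. 32 on i40e/mlx5 */
};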
> 
> [...]
> > > > [Sugesh] Another concern is the cost and time of installing these
> > > > rules in the hardware. Can we make these APIs time-bound (or at least
> > > > provide an option to set a time limit for executing these APIs), so
> > > > that the application doesn't have to wait so long when installing and
> > > > deleting flows with slow hardware/NICs? What do you think? Most of the
> > > > datapath flow installations are dynamic and triggered only when there
> > > > is ingress traffic. Delays in flow insertion/deletion have
> > > > unpredictable consequences.
> > >
> > > This API is (currently) aimed at the control path only, and must
> > > indeed be assumed to be slow. Creating millions of rules may take
> > > quite a while as it may involve syscalls and other time-consuming
> > > synchronization things on the PMD side.
> > >
> > > So currently there is no plan to have rules added from the data path
> > > with time constraints. I think it would be implemented through a
> > > different set of functions anyway.
> > >
> > > I do not think adding time limits is practical, even specifying in
> > > the API that creating a single flow rule must take less than a
> > > maximum number of seconds in order to be effective is too much of a
> > > constraint (applications that create all flows during init may not care after
> all).
> > >
> > > You should consider in any case that modifying flow rules will
> > > always be slower than receiving packets, there is no way around
> > > that. Applications have to live with it and provide a software
> > > fallback for incoming packets while managing flow rules.
> > >
> > > Moreover, think about what happens when you hit the maximum
> number
> > > of flow rules and cannot create any more. Applications need to
> > > implement some kind of fallback in their data path.
> > >
> > > Offloading flows in HW is also only useful if they live much longer
> > > than the time taken to create and delete them. Perhaps applications
> > > may choose to do so after detecting long lived flows such as TCP
> > > sessions.
> > >
> > > You may have one separate control thread dedicated to manage flows
> > > and keep your normal control thread unaffected by delays. Several
> > > threads can even be dedicated, one per device.
> > [Sugesh] I agree that flow insertion cannot be as fast as the packet
> > receiving rate. From an application point of view, the problem arises
> > when hardware flow insertion takes longer than software flow insertion.
> > At the very least, the application has to know the cost of
> > inserting/deleting a rule in hardware beforehand; otherwise, how can the
> > application choose the right flow candidates for hardware? My point here
> > is that the application expects deterministic behavior from a classifier
> > while inserting and deleting rules.
> 
> Understood, however it will be difficult to estimate, particularly if a PMD
> must rearrange flow rules to make room for a new one due to priority levels
> collision or some other HW-related reason. I mean, spent time cannot be
> assumed to be constant, even PMDs cannot know in advance because it also
> depends on the performance of the host CPU.
> 
> Such applications may find it easier to measure elapsed time for the rules
> they create, make statistics and extrapolate from this information for future
> rules. I do not think the PMD can help much here.
[Sugesh] From an application point of view this can be an issue.
There is even a security concern when we program a short-lived flow. Let's consider this case:

1) The control plane programs the hardware with a queue-termination flow.
2) The software dataplane is programmed to treat packets from that specific queue accordingly.
3) The flow is removed from the hardware (let's consider this a long-running operation),
or the hardware takes more time to report the status than to physically remove the rule.
Now the packets in the queue can no longer be considered matched/flow hits,
because the corresponding software dataplane update has not happened yet.
We need a way to sync between the software datapath and the classifier APIs even though
they are programmed from different control threads (see the sketch below).

Are we saying these APIs are only meant for user-defined static flows?
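To make the ordering concern concrete, here is one possible sequence, written
against the rte_flow_destroy() signature quoted further down in this thread.
The flag-based hand-off is only an example of what an application might do,
not a proposal for the API itself:

#include <stdint.h>
#include <rte_atomic.h>

struct rte_flow;                                     /* opaque RFC handle */
int rte_flow_destroy(uint8_t port_id, struct rte_flow *flow); /* RFC draft */

static volatile int hw_flow_active = 1;  /* shared with data-path threads */

/* Control thread: tell the software datapath to stop relying on the hardware
 * match *before* starting the (possibly slow) hardware removal, so packets
 * received during the removal window are still handled correctly. */
static void
remove_queue_termination_flow(uint8_t port_id, struct rte_flow *flow)
{
	hw_flow_active = 0;
	rte_smp_wmb();          /* make the data path see the update first */
	rte_flow_destroy(port_id, flow);
}

This only covers the removal direction; insertion would need a similar
hand-off, and it still does not give the deterministic completion time asked
about above.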


> 
> > > > [Sugesh] Another query is on the synchronization part. What if the
> > > > same rules are handled from different threads? Is the application
> > > > responsible for handling the concurrent hardware programming?
> > >
> > > Like most (if not all) DPDK APIs, applications are responsible for
> > > managing locking issues as described in 4.3 (Behavior). Since this is
> > > a control path API and applications usually have a single control
> > > thread, locking should not be necessary in most cases.
> > >
> > > Regarding my above comment about using several control threads to
> > > manage different devices, section 4.3 says:
> > >
> > >  "There is no provision for reentrancy/multi-thread safety, although
> > > nothing  should prevent different devices from being configured at
> > > the same  time. PMDs may protect their control path functions
> accordingly."
> > >
> > > I'd like to emphasize it is not "per port" but "per device", since
> > > in a few cases a configurable resource is shared by several ports.
> > > It may be difficult for applications to determine which ports are
> > > shared by a given device but this falls outside the scope of this API.
> > >
> > > Do you think adding the guarantee that it is always safe to
> > > configure two different ports simultaneously without locking from
> > > the application side is necessary? In which case the PMD would be
> > > responsible for locking shared resources.
> > [Sugesh] This would be a little bit complicated when some of the ports
> > are not under DPDK itself (what if one port is managed by the kernel?),
> > or the ports are tied to different applications. Locking in the PMD helps
> > when the ports are accessed by multiple DPDK applications. However, what
> > if the port itself is not under DPDK?
> 
> Well, either we do not care about what happens outside of the DPDK
> context, or PMDs must find a way to satisfy everyone. I'm not a fan of locking
> either but it would be nice if flow rules configuration could be attempted on
> different ports simultaneously without the risk of wrecking anything, so that
> applications do not need to care.
> 
> Possible cases for a dual port device with global flow rule settings affecting
> both ports:
> 
> 1) ports 1 & 2 are managed by DPDK: this is the easy case, a rule that needs
>    to alter a global setting necessary for an existing rule on any port is
>    not allowed (EEXIST). PMD must maintain a device context common to both
>    ports in order for this to work. This context is either under lock, or
>    the first port on which a flow rule is created owns all future flow
>    rules.
> 
> 2) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
>    it and knows that port 2 may modify the global context: no flow rules can
>    be created from the DPDK application due to safety issues (EBUSY?).
> 
> 3) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
>    it and knows that port 2 will not modify flow rules: PMD should not care,
>    no lock necessary.
> 
> 4) port 1 is managed by DPDK, port 2 by something else and the PMD is not
>    aware of it: either flow rules cannot be created ever at all, or we say
>    it is the user's responsibility to make sure this does not happen.
> 
> Considering that most control operations performed by DPDK affect the
> device regardless of other applications, I think 1) is the only case that should
> be defined, otherwise 4), defined as user's responsibility.
> 
> > > > > Destruction
> > > > > ~~~~~~~~~~~
> > > > >
> > > > > Flow rules destruction is not automatic, and a queue should not
> > > > > be
> > > released
> > > > > if any are still attached to it. Applications must take care of
> > > > > performing this step before releasing resources.
> > > > >
> > > > > ::
> > > > >
> > > > >  int
> > > > >  rte_flow_destroy(uint8_t port_id,
> > > > >                   struct rte_flow *flow);
> > > > >
> > > > >
> > > > [Sugesh] I would suggest that having a clean-up API is really useful,
> > > > as releasing a queue (is it applicable to releasing a port too?) does
> > > > not guarantee automatic flow destruction.
> > >
> > > Would something like rte_flow_flush(port_id) do the trick? I wanted
> > > to emphasize in this first draft that applications should really
> > > keep the flow pointers around in order to manage/destroy them. It is
> > > their responsibility, not PMD's.
> > [Sugesh] Thanks, I think the flush call will do.
> 
> Noted, will add it.
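[Sugesh] For completeness, the kind of teardown sequence I had in mind with
such a flush call (rte_flow_flush() is only the name suggested above, its
final signature may differ):

#include <stdint.h>
#include <rte_ethdev.h>

int rte_flow_flush(uint8_t port_id); /* assumed to mirror the other RFC calls */

static void
port_teardown(uint8_t port_id)
{
	/* Destroy every remaining flow rule first, since releasing a queue
	 * (or the port) does not do it automatically. */
	rte_flow_flush(port_id);
	rte_eth_dev_stop(port_id);
	rte_eth_dev_close(port_id);
}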
> 
> > > > This way the application can initialize the port, clean up all the
> > > > existing rules, and create new rules on a clean slate.
> > >
> > > No resource can be released as long as a flow rule is using it (bad
> > > things may happen otherwise); all flow rules must be destroyed
> > > first, thus none can possibly remain after initializing a port. It
> > > is assumed that PMDs do automatic clean up during init if necessary to
> ensure this.
> > [Sugesh] That will do.
> 
> I will make it more explicit as well.
> 
> [...]
> 
> --
> Adrien Mazarguil
> 6WIND

