[dpdk-dev] [RFC] Generic flow director/filtering/classification API

Adrien Mazarguil adrien.mazarguil at 6wind.com
Thu Jul 21 14:47:34 CEST 2016


Hi Wenzhuo,

It seems we now agree on just about everything; just a few more comments
below after snipping the now irrelevant parts.

On Thu, Jul 21, 2016 at 03:18:11AM +0000, Lu, Wenzhuo wrote:
[...]
> > > > > Does it mean the PMD should store and maintain all the rules? Why not
> > > > > let rte do that? I think if PMDs maintain all the rules, every kind of
> > > > > NIC will need its own copy of code for the rules. But if rte does that,
> > > > > only one copy of the code needs to be maintained, right?
> > > >
> > > > I've considered having rules stored in a common format understood at
> > > > the RTE level and not specific to each PMD and decided that the
> > > > opaque rte_flow pointer was a better choice for the following reasons:
> > > >
> > > > - Even though flow rules management is done in the control path, processing
> > > >   must be as fast as possible. Letting PMDs store flow rules using their own
> > > >   internal representation gives them the chance to achieve better
> > > >   performance.
> > > I don't quite understand. I think we're talking about maintaining the rules in
> > > SW. I don't think anything needs to be optimized for specific NICs there. If we
> > > need to optimize the code, I think we need to consider the CPU, OS ... and
> > > other common means. Am I wrong?
> > 
> > Perhaps we were talking about different things; here I was only explaining why
> > rte_flow (the result of creating a flow rule) should be opaque and fully managed
> > by the PMD. More on the SW side of things below.
> > 
> > > > - An opaque context managed by PMDs would probably have to be stored
> > > >   somewhere as well anyway.
> > > >
> > > > - PMDs may not need to allocate/store anything at all if they exclusively
> > > >   rely on HW state for everything. In my opinion, the generic API has enough
> > > >   constraints for this to work and maintain consistency between flow
> > > >   rules. Note this is currently how most PMDs implement FDIR and other
> > > >   filter types.
> > > Yes, the rules are stored by HW. But considering stopping/starting the device,
> > > the rules in HW will be lost. We have to store the rules in SW and re-program
> > > them when restarting the device.
> > 
> > Assuming HW capable of keeping flow rules programmed even across a
> > stop/start cycle (e.g. mlx4/mlx5 may be able to do it from a DPDK point of
> > view), don't you think it is more efficient to standardize on this behavior
> > and let PMDs restore flow rules for HW that does not support it, rather than
> > having RTE or the application (SW) do it?
> I didn't know that. As some NICs already have the ability to keep the rules during a stop/start cycle, maybe it could become a trend :)

Well yeah, if you are wondering about that, these PMDs do not have the same
definition for port stop/start as lower level PMDs like ixgbe and i40e. In
the mlx4/mlx5 cases, most control path operations (queue creation,
destruction and general management) end up performed by kernel
drivers. Stopping a port does not really shut it down as the kernel still
manages its own netdevice independently.

[...]
> > > > - The flow rules format described in this specification (pattern / actions)
> > > >   will be used by applications directly, and they will be free to arrange
> > > >   rules in lists, trees or any other way if they need to keep flow
> > > >   specifications around for further processing.
> > > Who will create the lists, trees or something else? According to the previous
> > > discussion, I think the APP will program the rules one by one. So if the APP
> > > organizes the rules into lists, trees..., the PMD doesn't know about that.
> > > And you said "Given that the opaque rte_flow pointer associated with a flow
> > > rule is to be stored by the application". I'm lost here.
> > 
> > I guess that's because we're discussing two different things, flow rule
> > specifications and flow rule objects. Let me sum it up:
> > 
> > - Flow rule specifications are the patterns/actions combinations provided by
> >   applications to rte_flow_create(). Applications can store those as needed
> >   and organize them as they wish (hash, tree, list). Neither PMDs nor RTE
> >   will do it for them.
> > 
> > - Flow rule objects (struct rte_flow *) are generated when a flow rule is
> >   created. Applications must keep these around if they want to manipulate
> >   them later (i.e. destroy or query existing rules).
> Thanks for this clarification. So the specifications can be different from the objects, right? The specifications are what the APP wants, the objects are what the APP really gets, as rte_flow_create() can fail. Right?

Yes, precisely. Apps are also free to keep specifications around even in the
event of a flow creation failure. I think a generic software fallback will
be provided at some point.
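
To make the distinction more concrete, here is a rough, untested sketch of
the kind of bookkeeping an application could do on its side. Structure and
function names follow this RFC (rte_flow_pattern, rte_flow_actions,
rte_flow_create()), however the exact prototypes are still subject to
change, and app_rule / app_add_rule are made up for the example:

 /* Assumes the rte_flow.h header proposed by this RFC. */
 struct app_rule {
         /* Specification: what the application asked for. The application
          * owns this copy and may arrange such entries in a list, tree or
          * hash table as it sees fit. */
         struct rte_flow_pattern pattern;
         struct rte_flow_actions actions;
         /* Object: what the application actually got. Opaque, fully
          * managed by the PMD, only needed later to destroy or query the
          * rule. */
         struct rte_flow *flow;
 };

 static int
 app_add_rule(uint8_t port_id, struct app_rule *rule)
 {
         rule->flow = rte_flow_create(port_id, &rule->pattern,
                                      &rule->actions);
         if (rule->flow == NULL)
                 return -1; /* the spec may still be kept around if useful */
         return 0;
 }

Only the last field is something the PMD knows anything about.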

[...]
> > What we seem to not agree about is that you think RTE should be responsible
> > for restoring flow rules of devices that lose them when stopped. I think doing so
> > is unfair to devices for which it is not the case and not really nice to applications,
> > so my opinion is that the PMD is responsible for restoring flow rules however it
> > wants. It is free to use RTE helpers to keep track of them, as long as it's all
> > managed internally.
> What I think is that RTE can store the flow rules and recreate them after restarting, in a function like rte_dev_start, so the APP knows nothing about it. But according to the discussion above, I think the design doesn't support it, right?

Yes. Right now the design explicitly states that PMDs are on their own
regarding this (4.3 Behavior). While it could be modified, I really think it
would be less efficient for the reasons stated above.
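
For reference, what I have in mind for PMDs that do lose HW state when
stopped is nothing more complicated than the sketch below. It is purely
illustrative, none of these names belong to an existing driver:

 #include <sys/queue.h>

 /* Driver-private representation of one rule; in such a PMD this is what
  * the opaque struct rte_flow * actually points to. */
 struct pmd_flow {
         LIST_ENTRY(pmd_flow) next;
         /* ... HW-specific data needed to re-program the rule ... */
 };

 struct pmd_private {
         LIST_HEAD(, pmd_flow) flows;
 };

 /* HW-specific programming, not shown. */
 static int pmd_flow_program_hw(struct pmd_flow *flow);

 /* Called from the driver's dev_start callback when HW state was lost on
  * stop: silently re-program every rule the PMD still knows about. Opaque
  * rte_flow handles held by the application remain valid throughout. */
 static int
 pmd_flow_restore_all(struct pmd_private *priv)
 {
         struct pmd_flow *flow;

         LIST_FOREACH(flow, &priv->flows, next)
                 if (pmd_flow_program_hw(flow) < 0)
                         return -1;
         return 0;
 }

Devices that keep their rules across stop/start simply never need anything
like this.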

> RTE doesn't store the flow rule objects, and even if it stored them, there's no mechanism designed to re-program the objects. Also, some HW doesn't need to be re-programmed. I think it's OK to let the PMD maintain the rules, as re-programming is a NIC-specific requirement.

Great to finally agree on this point.

> > > > Thus from an application point of view, whatever happens when
> > > > stopping and restarting a port should not matter. If a flow rule was
> > > > present before, it must still be present afterwards. If the PMD had
> > > > to destroy flow rules and re-create them, it does not actually matter if they
> > > > differ slightly at the HW level, as long as:
> > > >
> > > > - Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
> > > >   and refer to the same rules.
> > > >
> > > > - The overall behavior of all rules is the same.
> > > >
> > > > The list of rules you think of (patterns / actions) is maintained by
> > > > applications (not RTE), and only if they need them. RTE would needlessly
> > > > duplicate this.
> > > As said before, I need more details to understand this. Maybe an example
> > > would be better :)
> > 
> > The generic format both RTE and applications might understand is the one
> > described in this API (struct rte_flow_pattern and struct rte_flow_actions).
> > 
> > If we wanted RTE to maintain some sort of per-port state for flow rule
> > specifications, it would have to be a copy of these structures arranged somehow
> > (list or something else).
> > 
> > If we consider that PMDs need to keep a context object associated to a flow
> > rule (the opaque struct rte_flow *), then RTE would most likely have to store it
> > along with the flow specification.
> > 
> > Such a list may not be useful to applications (list lookups take time), so they
> > would implement their own redundant method. They might also require extra
> > room to attach some application context to flow rules. A generic list cannot plan
> > for it.
> > 
> > Applications know what they want to do with flow rules and are responsible for
> > managing them efficiently with RTE out of the way.
> > 
> > I'm not sure if this answered your question, if not, please describe a scenario
> > where a RTE-managed list of flow rules would be mandatory.
> Got your point and agree :)

Thanks.
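
By the way, regarding the point about applications possibly attaching their
own context to flow rules: this is roughly what I expect them to define on
their side (again, all names below are made up for illustration):

 #include <sys/queue.h>

 /* Application-defined wrapper; only the application knows what context it
  * wants to carry with a rule and which container suits its lookups. */
 struct app_flow_entry {
         TAILQ_ENTRY(app_flow_entry) next; /* or a hash table, tree, ... */
         struct rte_flow *flow;   /* opaque handle from rte_flow_create() */
         uint32_t vm_id;          /* examples of application context */
         uint64_t hits;
 };

No generic RTE-managed list could anticipate fields like these.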

-- 
Adrien Mazarguil
6WIND

