[dpdk-dev] [RFC] Generic flow director/filtering/classification API

John Fastabend john.fastabend at gmail.com
Wed Aug 3 21:11:56 CEST 2016


[...]

>>>>>> The proposal looks very good.  It satisfies most of the features
>>>>>> supported by Chelsio NICs.  We are looking for suggestions on exposing
>>>>>> additional features supported by Chelsio NICs via this API.
>>>>>>
>>>>>> Chelsio NICs have two regions in which filters can be placed -
>>>>>> maskfull and maskless regions.  As their names imply, the maskfull
>>>>>> region can accept masks to match a range of values, whereas the
>>>>>> maskless region doesn't accept any masks and hence performs strict
>>>>>> exact matches.  Filters without masks can also be placed in the
>>>>>> maskfull region.  By default, the maskless region has higher priority
>>>>>> than the maskfull region.  However, the priority between the two
>>>>>> regions is configurable.
>>>>>
>>>>> I understand this configuration affects the entire device. Just to be clear,
>>>>> assuming some filters are already configured, are they affected by a change
>>>>> of region priority later?
>>>>>
>>>>
>>>> Both the regions exist at the same time in the device.  Each filter can
>>>> either belong to maskfull or the maskless region.
>>>>
>>>> The priority is configured at the time of filter creation for every
>>>> individual filter and cannot be changed while the filter is still
>>>> active. If the priority needs to be changed for a particular filter,
>>>> it needs to be deleted first and re-created.
>>>
>>> Could you model this as two tables and add a table_id to the API? This
>>> way user space could populate the table it chooses. We would have to add
>>> some capability attributes to "learn" whether tables support masks,
>>> though.
>>>
>>
>> This approach sounds interesting.
> 
> Now I understand the idea behind these tables, however from an application
> point of view I still think it's better if the PMD could take care of flow
> rules optimizations automatically. Think about it, PMDs have exactly a
> single kind of device they know perfectly well to manage, while applications
> want the best possible performance out of any device in the most generic
> fashion.

The problem is that keeping priorities in order and/or possibly breaking
rules apart (e.g. you have an L2 table and an L3 table) becomes very
complex to manage at the driver level. I think it's easier for the
application, which has some context, to do this. An application that
"knows" it is a router, for example, will likely be able to pack rules
better than a PMD will.

> 
>>> I don't see how the PMD can sort this out in any meaningful way, and it
>>> has to be exposed to the application that has the intelligence to 'know'
>>> priorities between mask and non-mask filters. I'm sure you could come
>>> up with something, but it would be less than ideal in many cases I would
>>> guess, and we can't have the driver getting priorities wrong or we may
>>> not get the correct behavior.
> 
> It may be solved by having the PMD maintain SW state to quickly know which
> rules are currently created and what state the device is in, so the
> application basically doesn't have to perform this work.
> 
> This API allows applications to express basic needs such as "redirect
> packets matching this pattern to that queue". It must not deal with HW
> details and limitations in my opinion. If a request cannot be satisfied,
> then the rule cannot be created. No help from the application must be
> expected by PMDs, otherwise it opens the door to the same issues as the
> legacy filtering APIs.

This depends on the application and what/how it wants to manage the
device. If the application manages a pipeline with some set of tables,
then mapping this down to a single table, which the PMD then has to
unwind back into a multi-table topology, seems like a waste to me.

> 
> [...]
>>>> Unfortunately, our maskfull region is extremely small compared to the
>>>> maskless region.
>>>>
>>>
>>> To me this means a userspace application would want to pack it
>>> carefully to get the full benefit.  So you need some mechanism to specify
>>> the "region", hence the above table proposal.
>>>
>>
>> Right. Makes sense.
> 
> I do not agree, applications should not be aware of it. Note this case can
> be handled differently, so that rules do not have to be moved back and forth
> between both tables. If the first created rule requires a maskfull entry,
> then all subsequent rules will be entered into that table. Otherwise no
> maskfull entry can be created as long as there is one maskless entry. When
> either table is full, no more rules may be added. Would that work for you?
> 

It's not about mask vs. no mask. The devices with multiple tables that I
have don't have this mask limitation. It's about how to optimally pack
the rules and who implements that logic. I think it's best done in the
application, where I have the context.

Is there a way to omit the table field if the PMD is expected to do
a best effort, and add the table field if the user wants explicit
control over table management? This would support both models. I at least
would like to have explicit control over rule population in my pipeline
for use cases where I'm building a pipeline on top of the hardware.
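The two models could coexist via a sentinel value: a reserved table ID
meaning "let the PMD decide". A minimal sketch in C, assuming hypothetical
names (`FLOW_TABLE_ANY`, `struct flow_attr`) that are not part of the RFC:

```c
#include <stdint.h>

/* Hypothetical convention supporting both models: FLOW_TABLE_ANY asks
 * the PMD for best-effort placement, any other value requests an
 * explicit table.  Names are illustrative, not from the RFC. */
#define FLOW_TABLE_ANY 0u

struct flow_attr {
	uint32_t priority; /* priority within the selected table */
	uint32_t table;    /* FLOW_TABLE_ANY or an explicit table ID */
};

/* A PMD could branch on this to decide who owns rule placement. */
static int is_explicit(const struct flow_attr *attr)
{
	return attr->table != FLOW_TABLE_ANY;
}
```

Existing applications would simply leave the field zeroed and get today's
behavior, while pipeline-oriented applications could set it explicitly.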

>> [...]
>>>>> Now about this "promisc" match criteria, it can be added as a new meta
>>>>> pattern item (4.1.3 Meta item types). Do you want it to be defined from the
>>>>> start or add it later with the related code in your PMD?
>>>>>
>>>>
>>>> It could be added as a meta item.  If there are other interested
>>>> parties, it can be added now.  Otherwise, we'll add it with our filtering
>>>> related code.
>>>>
>>>
>>> Hmm, I guess by "promisc" here you mean matching packets received from
>>> the wire before they have been switched by the silicon?
>>>
>>
>> Match packets received from the wire before they have been switched by
>> the silicon, which also includes packets not destined for the DUT that
>> were still received due to the interface being in promisc mode.
> 
> I think it's fine, but we'll have to precisely define what happens when a
> packet matched by such a pattern is part of a terminating rule. For
> instance, if it is duplicated by HW, then the rule cannot be terminating.
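That constraint could be captured as a validation check: when the HW
duplicates promisc-matched packets, a terminating rule would also consume
the switched copy, so the PMD would have to refuse it. A hedged sketch
(the helper and its flags are hypothetical, not part of the proposed API):

```c
#include <stdbool.h>

/* Reject a promisc-matching rule that is terminating when the HW
 * duplicates such packets; the switched duplicate would otherwise be
 * lost.  Hypothetical validation helper for illustration only. */
static int validate_promisc_rule(bool hw_duplicates, bool terminating)
{
	if (hw_duplicates && terminating)
		return -1; /* rule cannot be terminating in this case */
	return 0;
}
```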
> 
> [...]
>>>> This raises another interesting question.  What should the PMD do
>>>> if it supports only a subset of the fields in a particular item?
>>>>
>>>> For example, if a rule has been sent to match IP fragmentation along
>>>> with several other IPv4 fields, and if the underlying hardware doesn't
>>>> support matching based on IP fragmentation, does the PMD reject the
>>>> complete rule although it could have done the matching for rest of the
>>>> IPv4 fields?
>>>
>>> I think it has to fail the command; otherwise user space will not have
>>> any way to understand that the full match criteria cannot be met, and
>>> we will get different behavior for the same applications on different
>>> NICs depending on the hardware feature set. This will most likely break
>>> applications, so we need the error IMO.
>>>
>>
>> Ok. Makes sense.
> 
> Yes, I fully agree with this.
> 
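The agreed fail-fast behavior might look like this in a PMD's validation
path; the match-field bitmask names below are made up for illustration and
would in practice be derived from the rule's pattern items:

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative match-field bits; a real PMD would derive these from
 * the pattern items of the rule being validated. */
#define MATCH_IPV4_ADDRS (1u << 0)
#define MATCH_IPV4_FRAG  (1u << 1)

/* Fail the whole rule if any requested field is unsupported, instead
 * of silently matching on a subset of the criteria. */
static int validate_rule(uint32_t requested, uint32_t hw_supported)
{
	if (requested & ~hw_supported)
		return -ENOTSUP;
	return 0;
}
```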
>>>>>> - Match range of physical ports on the NIC in a single rule via masks.
>>>>>>   For ex: match all UDP packets coming on ports 3 and 4 out of 4
>>>>>>   ports available on the NIC.
>>>>>
>>>>> Applications create flow rules per port, I'd thus suggest that the PMD
>>>>> should detect identical rules created on different ports and aggregate them
>>>>> as a single HW rule automatically.
>>>>>
>>>>> If you think this approach is not right, the alternative is a meta pattern
>>>>> item that provides a list of ports. I'm not sure this is the right approach
>>>>> considering it would most likely not be supported by most NICs. Applications
>>>>> may not request it explicitly.
>>>>>
>>>>
>>>> Aggregating via the PMD would be an expensive operation since it would involve:
>>>> - Search of existing filters.
>>>> - Deleting those filters.
>>>> - Creating a single combined filter.
>>>>
>>>> And all of the above 3 operations would need to be atomic so as not to
>>>> affect existing traffic hitting the above filters.
> 
> Atomicity may not be a problem if the PMD makes sure the new combined rule
> is inserted before the others, so they do not need to be removed either.
> 
>>>> Adding a
>>>> meta item would be a simpler solution here.
> 
> Yes, clearly.
> 
>>> For this, adding a meta-data item seems simplest to me. And if you want
>>> to make the default to be only a single port, that would maybe make it
>>> easier for existing apps to port from flow director. Then if an
>>> application cares, it can create a list of ports if needed.
>>>
>>
>> Agreed.
> 
> However, although I'm not opposed to adding dedicated meta items, remember
> that applications will not automatically benefit from the increased
> performance if only a single PMD implements this feature; their maintainers
> will probably not bother with it.
> 

Unless, as we noted in the other thread, the application is closely bound
to its hardware for capability reasons. In that case it would make sense
to implement.
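A port-list meta item along those lines might be sketched as follows; the
structure and field names are hypothetical, with an empty list defaulting
to the rule's own port to ease porting from flow director as suggested
above:

```c
#include <stdint.h>

/* Hypothetical PORT meta item matching on a list of physical ports.
 * nb_ports == 0 keeps the default of the rule's own port, so existing
 * applications need no change.  Array size is arbitrary here. */
struct meta_item_port_list {
	uint32_t nb_ports;
	uint16_t ports[8]; /* physical port IDs to match */
};

/* Number of ports a rule effectively matches. */
static uint32_t effective_nb_ports(const struct meta_item_port_list *p)
{
	return p->nb_ports ? p->nb_ports : 1; /* default: current port */
}
```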

>>>>>> - Match range of Physical Functions (PFs) on the NIC in a single rule
>>>>>>   via masks. For ex: match all traffic coming on several PFs.
>>>>>
>>>>> The PF and VF pattern items assume there is a single PF associated with a
>>>>> DPDK port. VFs are identified with an ID. I basically took the same
>>>>> definitions as the existing filter types, perhaps this is not enough for
>>>>> Chelsio adapters.
>>>>>
>>>>> Do you expose more than one PF for a DPDK port?
>>>>>
>>>>> Anyway, I'd suggest the same approach as above, automatic aggregation of
>>>>> rules for performance reasons, otherwise new or updated PF/VF pattern items,
>>>>> in which case it would be great if you could provide ideal structure
>>>>> definitions for this use case.
>>>>>
>>>>
>>>> In Chelsio hardware, all the ports of a device are exposed via a single
>>>> PF (PF4). There can be many VFs attached to a PF.  Physical NIC
>>>> functions are operational on PF4, while VFs can be attached to PFs 0-3.
>>>> So Chelsio hardware is not tied to a one-to-one PF-to-port mapping
>>>> assumption.
>>>>
>>>> There already seems to be a PF meta-item, but it doesn't seem to accept
>>>> any "spec" and "mask" fields.  Similarly, the VF meta-item doesn't
>>>> seem to accept a "mask" field.  We could probably enable these fields
>>>> in the PF and VF meta-items to allow such configuration.
>>>
>>> Maybe a range field would help here as well? So you could specify a VF
>>> range. It might be one of the things to consider adding later though if
>>> there is no clear use for it now.
>>>
>>
>> A VF value and VF mask would help to achieve the desired filter.
>> A VF mask would also enable specifying a range of VF values.
> 
> Like John, I think a range or even a list instead of a mask would be
> better; the PMD can easily create a mask from that if necessary. The
> reason is that we've always had bad experiences with bit-fields: they're
> always too short at some point, and we would like to avoid having to break
> the ABI to update existing pattern items later.

Agreed, avoiding bit-fields is a good idea.
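A list-based VF item could look like the sketch below (names are
hypothetical). A PMD targeting value/mask hardware can still derive a mask
internally, e.g. by keeping only the bits that are identical across all
listed IDs:

```c
#include <stdint.h>

/* Hypothetical VF item carrying an explicit ID list instead of a
 * spec/mask pair; sidesteps the bit-field width problems mentioned
 * above.  Array size is arbitrary for illustration. */
struct item_vf_list {
	uint32_t nb_vfs;
	uint16_t vf_ids[16];
};

/* Derive a covering mask for value/mask hardware: keep the bits that
 * are identical across all listed VF IDs.  Note the resulting mask may
 * cover more IDs than listed, so a PMD must verify it is exact (or
 * fall back to one rule per ID) before using it. */
static uint16_t vf_mask_from_list(const struct item_vf_list *l)
{
	uint16_t diff = 0;

	for (uint32_t i = 1; i < l->nb_vfs; i++)
		diff |= (uint16_t)(l->vf_ids[0] ^ l->vf_ids[i]);
	return (uint16_t)~diff;
}
```

For example, the list {4, 5, 6, 7} differs only in the low two bits, so
the derived mask clears exactly those bits and matches the four IDs
precisely.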

> 
> Also, while I don't think this is the case yet, perhaps it will be a good
> idea for PFs/VFs to have globally unique IDs, just like DPDK ports.
> 
> Thanks.
> 


