[dpdk-dev] [RFC 17.08] flow_classify: add librte_flow_classify library

Ferruh Yigit ferruh.yigit at intel.com
Wed May 17 17:37:10 CEST 2017


On 5/17/2017 3:54 PM, Ananyev, Konstantin wrote:
> Hi Ferruh,
> Please see my comments/questions below.
> Thanks
> Konstantin
> 
>> +
>> +/**
>> + * @file
>> + *
>> + * RTE Flow Classify Library
>> + *
>> + * This library provides flow record information with some measured properties.
>> + *
>> + * The application can select a variety of flow types based on various flow keys.
>> + *
>> + * The library only maintains flow records between rte_flow_classify_stats_get()
>> + * calls, up to a maximum limit.
>> + *
>> + * The provided flow records form a linked list of rte_flow_classify_stat_xxx
>> + * structures.
>> + *
>> + * The library is responsible for allocating and freeing memory for the flow
>> + * record table. The previous table is freed by the next
>> + * rte_flow_classify_stats_get() call, and all tables are freed by
>> + * rte_flow_classify_type_reset() or rte_flow_classify_type_set(x, 0). Memory
>> + * for the table is allocated on the fly while creating records.
>> + *
>> + * An rte_flow_classify_type_set() call with a valid type will register Rx/Tx
>> + * callbacks and start filling the flow record table.
>> + * With rte_flow_classify_stats_get(), a pointer is handed to the caller while
>> + * the library continues collecting records.
>> + *
>> + *  Usage:
>> + *  - application calls rte_flow_classify_type_set() for a device
>> + *  - library creates Rx/Tx callbacks for packets and starts filling the flow table
> 
> Is it necessary to use an Rx callback here?
> Can the library provide an API like collect(port_id, input_mbuf[], pkt_num) instead?
> Then the user would have a choice either to set up a callback or to call collect() directly.

This was also a comment from Morten; I will update the RFC to use a
direct API call.
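
Roughly along the lines of the sketch below (the function name and exact
signature are placeholders following your collect() suggestion, nothing
final):

#include <rte_mbuf.h>
#include <rte_ethdev.h>

/* hypothetical prototype, name/signature not final */
int rte_flow_classify_collect(uint8_t port_id, struct rte_mbuf **pkts,
			      uint16_t nb_pkts);

static void
rx_and_classify(uint8_t port_id, uint16_t queue_id)
{
	struct rte_mbuf *pkts[32];
	uint16_t nb_rx;

	/* the application pulls the burst itself ... */
	nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, 32);

	/* ... and feeds it to the classifier; no Rx callback involved */
	if (nb_rx > 0)
		rte_flow_classify_collect(port_id, pkts, nb_rx);
}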

> 
>> + *    for that type of flow (currently only one flow type is supported)
>> + *  - application calls rte_flow_classify_stats_get() to get a pointer to the
>> + *    linked-list flow table. The library assigns this pointer to another
>> + *    variable and keeps collecting flow data. On the next
>> + *    rte_flow_classify_stats_get(), the library first frees the previous
>> + *    table, passes the current table to the application, and keeps
>> + *    collecting data.
> 
> Ok, but that means that you can't use stats_get() for the same type
> from 2 different threads without explicit synchronization?

Correct.
And multiple threads shouldn't be calling this API. The library doesn't
keep earlier flow data around, so several threads calling this would each
end up with only a piece of the information. Do you see any use case
where multiple threads would call this API?
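
To illustrate, a rough sketch of the swap stats_get() would do, based on
the config struct above (not the actual implementation): two
unsynchronized callers could free the same table or each walk away with
only part of the records.

int
rte_flow_classify_stats_get(uint8_t port_id, void *stats)
{
	struct rte_flow_classify_config *cfg =
			&rte_flow_classify_config[port_id];

	/* free() stands in for whatever allocator the library ends up using */
	free(cfg->flow_table_prev);             /* table handed out last time */
	cfg->flow_table_prev = cfg->flow_table_current;
	cfg->flow_table_prev_item_count = cfg->flow_table_current_item_count;
	cfg->flow_table_current = NULL;         /* collection restarts anew */
	cfg->flow_table_current_item_count = 0;

	*(void **)stats = cfg->flow_table_prev; /* caller gets the snapshot */
	return 0;
}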

> 
>> + *  - application calls rte_flow_classify_type_reset(), library unregisters the
>> + *    callbacks and frees all flow table data.
>> + *
>> + */
>> +
>> +enum rte_flow_classify_type {
>> +	RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0),
>> +	RTE_FLOW_CLASSIFY_TYPE_MAX,
>> +};
>> +
>> +#define RTE_FLOW_CLASSIFY_TYPE_MASK (((RTE_FLOW_CLASSIFY_TYPE_MAX - 1) << 1) - 1)
>> +
>> +/**
>> + * Global configuration struct
>> + */
>> +struct rte_flow_classify_config {
>> +	uint32_t type; /* bitwise enum rte_flow_classify_type values */
>> +	void *flow_table_prev;
>> +	uint32_t flow_table_prev_item_count;
>> +	void *flow_table_current;
>> +	uint32_t flow_table_current_item_count;
>> +} rte_flow_classify_config[RTE_MAX_ETHPORTS];
>> +
>> +#define RTE_FLOW_CLASSIFY_STAT_MAX UINT16_MAX
>> +
>> +/**
>> + * Classification stats data struct
>> + */
>> +struct rte_flow_classify_stat_generic {
>> +	struct rte_flow_classify_stat_generic *next;
>> +	uint32_t id;
>> +	uint64_t timestamp;
>> +
>> +	struct ether_addr src_mac;
>> +	struct ether_addr dst_mac;
>> +	uint32_t src_ipv4;
>> +	uint32_t dst_ipv4;
>> +	uint8_t l3_protocol_id;
>> +	uint16_t src_port;
>> +	uint16_t dst_port;
>> +
>> +	uint64_t packet_count;
>> +	uint64_t packet_size; /* bytes */
>> +};
> 
> Ok, so if I understood things right, for the generic type it will always classify all incoming packets by:
> <src_mac, dst_mac, src_ipv4, dst_ipv4, l3_protocol_id, src_port, dst_port> 
> all by absolute values, and represent the results as a linked list.
> Is that correct, or did I misunderstand your intentions here?

Correct.

> If so, then I see several disadvantages here:
> 1) It is really hard to predict what kind of stats is required for particular use cases.
> Let's say some people would like to collect stats by <dst_mac, vlan>,
> another by <dst_ipv4, subnet_mask>, a third by <l4 dst_port> and so on.
> Having just one hardcoded filter doesn't seem very flexible/usable.
> I think you need to find a way to allow users to define what type of filter they want to apply.

The flow type should be provided by applications according to their
needs, and needs to be implemented in this library. The generic one will
be the only one implemented in the first version:
enum rte_flow_classify_type {
	RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0),
	RTE_FLOW_CLASSIFY_TYPE_MAX,
};


The app should set the type first via the API:
rte_flow_classify_type_set(uint8_t port_id, uint32_t type);


And the stats for this type will be returned; because the returned data
can be a different struct per flow type, it is returned through a void
pointer:
rte_flow_classify_stats_get(uint8_t port_id, void *stats);
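
So from the application side, usage would be roughly as below
(hypothetical sketch based on the prototypes above; includes and error
checks omitted):

struct rte_flow_classify_stat_generic *stats = NULL, *s;

rte_flow_classify_type_set(port_id, RTE_FLOW_CLASSIFY_TYPE_GENERIC);

/* ... traffic runs, the library keeps collecting records ... */

rte_flow_classify_stats_get(port_id, &stats);
for (s = stats; s != NULL; s = s->next)
	printf("flow %u: %" PRIu64 " pkts, %" PRIu64 " bytes\n",
	       s->id, s->packet_count, s->packet_size);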

> I think it was discussed already, but I still wonder why rte_flow_item can't be used for that approach?


> 2) Even one 10G port can produce ~14M rte_flow_classify_stat_generic entries in one second
> (if all packets have different ipv4 addresses/ports or so).
> Accessing/retrieving items over a linked list with 14M entries doesn't sound like a good idea.
> I'd say we need some better way to retrieve/present the collected data.

This is to keep flows, so I expect the numbers will be smaller than the
packet counts.
It is possible to use fixed-size arrays for this, but I think it is easy
to make that switch later; I would like to see the performance effect
before doing it. Do you think it is OK to start like this and make that
decision during implementation?
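
For reference, the fixed-size alternative could be something as simple as
the below (illustration only, not part of the RFC), sized by the existing
RTE_FLOW_CLASSIFY_STAT_MAX limit:

struct rte_flow_classify_stat_generic_table {
	uint32_t nb_used; /* valid entries, <= RTE_FLOW_CLASSIFY_STAT_MAX */
	/* the 'next' field of each entry would simply go unused here */
	struct rte_flow_classify_stat_generic entry[RTE_FLOW_CLASSIFY_STAT_MAX];
};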

