[dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API

Chandran, Sugesh sugesh.chandran at intel.com
Tue Dec 6 19:11:38 CET 2016


Hi Adrien,
Thanks for sending out the patches,

Please find few comments below,


Regards
_Sugesh


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Kevin Traynor
> Sent: Friday, December 2, 2016 9:07 PM
> To: Adrien Mazarguil <adrien.mazarguil at 6wind.com>
> Cc: dev at dpdk.org; Thomas Monjalon <thomas.monjalon at 6wind.com>; De
> Lara Guarch, Pablo <pablo.de.lara.guarch at intel.com>; Olivier Matz
> <olivier.matz at 6wind.com>; sugesh.chandran at intel.comn
> Subject: Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
> 
>>>>>>Snipp
> >>> + *
> >>> + * Attaches a 32 bit value to packets.
> >>> + *
> >>> + * This value is arbitrary and application-defined. For
> >>> +compatibility with
> >>> + * FDIR it is returned in the hash.fdir.hi mbuf field.
> >>> +PKT_RX_FDIR_ID is
> >>> + * also set in ol_flags.
> >>> + */
> >>> +struct rte_flow_action_mark {
> >>> +	uint32_t id; /**< 32 bit value to return with packets. */ };
> >>
> >> One use case I thought we would be able to do for OVS is
> >> classification in hardware and the unique flow id is sent with the packet to
> software.
> >> But in OVS the ufid is 128 bits, so it means we can't and there is
> >> still the miniflow extract overhead. I'm not sure if there is a
> >> practical way around this.
> >>
> >> Sugesh (cc'd) has looked at this before and may be able to comment or
> >> correct me.
> >
> > Yes, we settled on 32 bit because currently no known hardware
> > implementation supports more than this. If that changes, another
> > action with a larger type shall be provided (no ABI breakage).
> >
> > Also since even 64 bit would not be enough for the use case you
> > mention, there is no choice but use this as an indirect value (such as
> > an array or hash table index/value).
> 
> ok, cool. I think Sugesh has other ideas anyway!
[Sugesh] It should be fine with 32 bit . we can manage it in OVS accordingly.
> 
> >
> > [...]
> >>> +/**
> >>> + * RTE_FLOW_ACTION_TYPE_RSS
> >>> + *
> >>> +
> >>> + *
> >>> + * Terminating by default.
> >>> + */
> >>> +struct rte_flow_action_vf {
> >>> +	uint32_t original:1; /**< Use original VF ID if possible. */
> >>> +	uint32_t reserved:31; /**< Reserved, must be zero. */
> >>> +	uint32_t id; /**< VF ID to redirect packets to. */ };
> > [...]
> >>> +/**
> >>> + * Check whether a flow rule can be created on a given port.
> >>> + *
> >>> + * While this function has no effect on the target device, the flow
> >>> +rule is
> >>> + * validated against its current configuration state and the
> >>> +returned value
> >>> + * should be considered valid by the caller for that state only.
> >>> + *
> >>> + * The returned value is guaranteed to remain valid only as long as
> >>> +no
> >>> + * successful calls to rte_flow_create() or rte_flow_destroy() are
> >>> +made in
> >>> + * the meantime and no device parameter affecting flow rules in any
> >>> +way are
> >>> + * modified, due to possible collisions or resource limitations
> >>> +(although in
> >>> + * such cases EINVAL should not be returned).
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item).
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 if flow rule is valid and can be created. A negative errno value
> >>> + *   otherwise (rte_errno is also set), the following errors are defined:
> >>> + *
> >>> + *   -ENOSYS: underlying device does not support this functionality.
> >>> + *
> >>> + *   -EINVAL: unknown or invalid rule specification.
> >>> + *
> >>> + *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
> >>> + *   bit-masks are unsupported).
> >>> + *
> >>> + *   -EEXIST: collision with an existing rule.
> >>> + *
> >>> + *   -ENOMEM: not enough resources.
> >>> + *
> >>> + *   -EBUSY: action cannot be performed due to busy device resources,
> may
> >>> + *   succeed if the affected queues or even the entire port are in a
> stopped
> >>> + *   state (see rte_eth_dev_rx_queue_stop() and
> rte_eth_dev_stop()).
> >>> + */
> >>> +int
> >>> +rte_flow_validate(uint8_t port_id,
> >>> +		  const struct rte_flow_attr *attr,
> >>> +		  const struct rte_flow_item pattern[],
> >>> +		  const struct rte_flow_action actions[],
> >>> +		  struct rte_flow_error *error);
> >>
> >> Why not just use rte_flow_create() and get an error? Is it less
> >> disruptive to do a validate and find the rule cannot be created, than
> >> using a create directly?
> >
> > The rationale can be found in the original RFC, which I'll convert to
> > actual documentation in v2. In short:
> >
> > - Calling rte_flow_validate() before rte_flow_create() is useless since
> >   rte_flow_create() also performs validation.
> >
> > - We cannot possibly express a full static set of allowed flow rules, even
> >   if we could, it usually depends on the current hardware configuration
> >   therefore would not be static.
> >
> > - rte_flow_validate() is thus provided as a replacement for capability
> >   flags. It can be used to determine during initialization if the underlying
> >   device can support the typical flow rules an application might want to
> >   provide later and do something useful with that information (e.g. always
> >   use software fallback due to HW limitations).
> >
> > - rte_flow_validate() being a subset of rte_flow_create(), it is essentially
> >   free to expose.
> 
> make sense now, thanks.
[Sugesh] : We had this discussion earlier at the design stage about the time taken for programming the hardware,
and how to make it deterministic. How about having a timeout parameter as well for the rte_flow_*
If the hardware flow insert is timed out, error out than waiting indefinitely, so that application have some control over
The time to program the flow. It can be another set of APIs something like, rte_flow_create_timeout()

Are you going to provide any control over the initialization of NIC  to define the capability matrices
For eg; To operate in a L3 router mode,  software wanted to initialize the NIC port only to consider the L2 and L3 fields.
I assume the initialization is done based on the first rules that are programmed into the NIC.?
> 
> >
> >>> +
> >>> +/**
> >>> + * Create a flow rule on a given port.
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item).
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   A valid handle in case of success, NULL otherwise and rte_errno is
> set
> >>> + *   to the positive version of one of the error codes defined for
> >>> + *   rte_flow_validate().
> >>> + */
> >>> +struct rte_flow *
> >>> +rte_flow_create(uint8_t port_id,
> >>> +		const struct rte_flow_attr *attr,
> >>> +		const struct rte_flow_item pattern[],
> >>> +		const struct rte_flow_action actions[],
> >>> +		struct rte_flow_error *error);
> >>
> >> General question - are these functions threadsafe? In the OVS example
> >> you could have several threads wanting to create flow rules at the
> >> same time for same or different ports.
> >
> > No they aren't, applications have to perform their own locking. The
> > RFC (to be converted to actual documentation in v2) says that:
> >
> > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> >   returned).
> >
> > - There is no provision for reentrancy/multi-thread safety, although
> nothing
> >   should prevent different devices from being configured at the same
> >   time. PMDs may protect their control path functions accordingly.
> 
> other comment above wrt locking.
> 
> >


More information about the dev mailing list