[RFC v3 2/2] ethdev: add API to set process to active or standby

Rongwei Liu rongweil at nvidia.com
Wed Dec 21 13:05:01 CET 2022


Hi Jerin:

BR
Rongwei

> -----Original Message-----
> From: Jerin Jacob <jerinjacobk at gmail.com>
> Sent: Wednesday, December 21, 2022 19:00
> To: Rongwei Liu <rongweil at nvidia.com>
> Cc: Matan Azrad <matan at nvidia.com>; Slava Ovsiienko
> <viacheslavo at nvidia.com>; Ori Kam <orika at nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas at monjalon.net>; Ferruh Yigit
> <ferruh.yigit at amd.com>; Andrew Rybchenko
> <andrew.rybchenko at oktetlabs.ru>; dev at dpdk.org; Raslan Darawsheh
> <rasland at nvidia.com>
> Subject: Re: [RFC v3 2/2] ethdev: add API to set process to active or standby
> 
> External email: Use caution opening links or attachments
> 
> 
> On Wed, Dec 21, 2022 at 3:02 PM Rongwei Liu <rongweil at nvidia.com> wrote:
> >
> > HI Jerin:
> >
> 
> Hi Rongwei
> 
> > BR
> > Rongwei
> >
> > > -----Original Message-----
> > > From: Jerin Jacob <jerinjacobk at gmail.com>
> > > Sent: Wednesday, December 21, 2022 17:13
> > > To: Rongwei Liu <rongweil at nvidia.com>
> > > Cc: Matan Azrad <matan at nvidia.com>; Slava Ovsiienko
> > > <viacheslavo at nvidia.com>; Ori Kam <orika at nvidia.com>; NBU-Contact-
> > > Thomas Monjalon (EXTERNAL) <thomas at monjalon.net>; Ferruh Yigit
> > > <ferruh.yigit at amd.com>; Andrew Rybchenko
> > > <andrew.rybchenko at oktetlabs.ru>; dev at dpdk.org; Raslan Darawsheh
> > > <rasland at nvidia.com>
> > > Subject: Re: [RFC v3 2/2] ethdev: add API to set process to active
> > > or standby
> > >
> > > External email: Use caution opening links or attachments
> > >
> > >
> > > On Wed, Dec 21, 2022 at 2:31 PM Rongwei Liu <rongweil at nvidia.com>
> wrote:
> > > >
> > > > Users may want to change the DPDK process to different versions
> > >
> > > Different version of DPDK? If there is any ABI change how to support this?
> > >
> > There is a new member which was introduced into rte_eth_dev_info but it
> shouldn’t be ABI breaking since using reserved fields.
> 
> That is just for rte_eth_dev_info. What about the ABI change in different
> ethdev structure and rte_flow structures across different DPDK ABI versions.
> 
Besides this, there is no other ABI changes dependency.

Assume there is a DPDK process A running with version v21.11 and plan to upgrade to
version v22.11. Let' call v22.11 as process B.

Now, process A has been running for long time and has lot of rules configured. It' "active" role per this API definition.
Process B starts and it should call this API and set itself to "standby" role and user can program the flow rules as they want
and different NIC vendors may have different recommendations. Nvidia suggests only program process B with group 0' rules now.

The user should sync all desired configurations from process A to process B, and process A starts to yield traffic like "delete all group 0
rules for Nvidia' NICs" or quit.
After that process B calls this API and set itself to "active" role, now the hot-upgrade finishes.

> > > > such as hot upgrade.
> > > > There is a strong requirement to simplify the logic and shorten
> > > > the traffic downtime as much as possible.
> > > >
> > > > This update introduces new rte_eth process role definitions:
> > > > active or standby.
> > > >
> > > > The active role means rules are programmed to HW immediately, and
> > > > no
> > >
> > > Why it has to be specific only to rte_flow rule? If it spedieic to
> > > rte_flow, why it is in rte_eth_process_ name space?
> > For now, this design focuses on the flow rule offloading and traffic
> redirection.
> > When switching process version, it' important to make sure which
> application receives and handles the traffic.
> 
> Changing the DPDK version runtime is just beyond rte_flow driver.

It' not about changing DPDK version but upgrading DPDK from one PMD version to another one.
Does the preceding example answer your question?
> 
> > The changing should be effective across all probing eth devices, that' why it
> was put under rte_eth_process_ (for all rte_eth_dev) name space.
> > >
> > > Also, if we are moving the standby, What about the rule whose ABI is
> > > changed between versions?
> >
> > Like the comments mentioned: " Before role transition, all the rules set by
> the active process should be flushed first. "
> 
> What happens to rte_flow flow handles for existing ones  which is created with
> version X?
> Also What if new version Y has ABI change in rte_flow_pattern and
> rte_flow_action structure?
> 
> For me, If DPDK version change is needed, simply reload the application. This
> API will soon bloat, and it will be a mess if to start handling Different DPDK
> version which is not ABI compatible at all.
> 
Yes, you are right. Reloading the application is the easiest way but it may have a long time
Window that traffic is lost. No traffic arrives at process A or process B. 
We are trying to simplify the reloading logic and minimize the traffic down time as much as possible.
The approach may differentiate hugely between different NIC vendors, so I think it should be better if 
DPDK can provide an abstract API.

If process A and process B are ABI different, it doesn't matter. 
1. Call this API with process A means older ABI.
2. Call this API with process B means newer ABI.
It' have process concept and working scope. 

> 
> 
> 
> > > > behavior changed. This is the default state.
> > > > The standby role means rules are queued in the HW. If no active
> > > > roles alive or back to active, the rules are effective immediately.
> > > >
> > > > Signed-off-by: Rongwei Liu <rongweil at nvidia.com>


More information about the dev mailing list