[dpdk-dev] [EXT] Re: [PATCH v7 1/3] ethdev: add level support for RSS offload types

Andrew Rybchenko arybchenko at solarflare.com
Mon Sep 7 10:12:02 CEST 2020


On 9/3/20 4:14 PM, Ferruh Yigit wrote:
> On 9/3/2020 11:11 AM, Kiran Kumar Kokkilagadda wrote:
>> *From:* Ajit Khaparde <ajit.khaparde at broadcom.com>
>> *Sent:* Tuesday, September 1, 2020 10:42 PM
>> *To:* Kiran Kumar Kokkilagadda <kirankumark at marvell.com>
>> *Cc:* Ferruh Yigit <ferruh.yigit at intel.com>; Thomas Monjalon
>> <thomas at monjalon.net>; Andrew Rybchenko <arybchenko at solarflare.com>;
>> dev at dpdk.org; Jerin Jacob Kollanukkaran <jerinj at marvell.com>;
>> orika at mellanox.com; xuanziyang2 at huawei.com;
>> cloud.wangxiaoyun at huawei.com; zhouguoyang at huawei.com;
>> rosen.xu at intel.com; beilei.xing at intel.com; jia.guo at intel.com; Rasesh
>> Mody <rmody at marvell.com>; Shahed Shaikh <shshaikh at marvell.com>; Nithin
>> Kumar Dabilpuram <ndabilpuram at marvell.com>; qiming.yang at intel.com;
>> qi.z.zhang at intel.com; keith.wiles at intel.com; hemant.agrawal at nxp.com;
>> sachin.saxena at nxp.com; wei.zhao1 at intel.com; johndale at cisco.com;
>> hyonkim at cisco.com; chas3 at att.com; matan at mellanox.com;
>> shahafs at mellanox.com; viacheslavo at mellanox.com;
>> rahul.lakkireddy at chelsio.com; grive at u256.net; Liron Himi
>> <lironh at marvell.com>; jingjing.wu at intel.com; xavier.huwei at huawei.com;
>> humin29 at huawei.com; yisen.zhuang at huawei.com;
>> somnath.kotur at broadcom.com; jasvinder.singh at intel.com;
>> cristian.dumitrescu at intel.com
>> *Subject:* Re: [EXT] Re: [dpdk-dev][PATCH v7 1/3] ethdev: add level
>> support for RSS offload types
>>
>> On Tue, Sep 1, 2020 at 7:27 AM Kiran Kumar Kokkilagadda
>> <kirankumark at marvell.com> wrote:
>>
>>
>>
>>      > -----Original Message-----
>>      > From: Ferruh Yigit <ferruh.yigit at intel.com>
>>      > Sent: Tuesday, September 1, 2020 7:08 PM
>>      > To: Kiran Kumar Kokkilagadda <kirankumark at marvell.com>; Thomas Monjalon
>>      > <thomas at monjalon.net>; Andrew Rybchenko <arybchenko at solarflare.com>
>>      > Cc: dev at dpdk.org; Jerin Jacob Kollanukkaran <jerinj at marvell.com>;
>>      > orika at mellanox.com; xuanziyang2 at huawei.com;
>>      > cloud.wangxiaoyun at huawei.com; zhouguoyang at huawei.com;
>>      > rosen.xu at intel.com; beilei.xing at intel.com; jia.guo at intel.com;
>>      > Rasesh Mody <rmody at marvell.com>; Shahed Shaikh <shshaikh at marvell.com>;
>>      > Nithin Kumar Dabilpuram <ndabilpuram at marvell.com>;
>>      > qiming.yang at intel.com; qi.z.zhang at intel.com; keith.wiles at intel.com;
>>      > hemant.agrawal at nxp.com; sachin.saxena at nxp.com; wei.zhao1 at intel.com;
>>      > johndale at cisco.com; hyonkim at cisco.com; chas3 at att.com;
>>      > matan at mellanox.com; shahafs at mellanox.com; viacheslavo at mellanox.com;
>>      > rahul.lakkireddy at chelsio.com; grive at u256.net; Liron Himi
>>      > <lironh at marvell.com>; jingjing.wu at intel.com; xavier.huwei at huawei.com;
>>      > humin29 at huawei.com; yisen.zhuang at huawei.com;
>>      > ajit.khaparde at broadcom.com; somnath.kotur at broadcom.com;
>>      > jasvinder.singh at intel.com; cristian.dumitrescu at intel.com
>>      > Subject: [EXT] Re: [dpdk-dev][PATCH v7 1/3] ethdev: add level support for RSS
>>      > offload types
>>      >
>>      > On 9/1/2020 4:27 AM, kirankumark at marvell.com wrote:
>>      > > From: Kiran Kumar K <kirankumark at marvell.com>
>>      > >
>>      > > This patch reserves 2 bits as input selection to select Inner and
>>      > > outer encapsulation level for RSS computation. It is combined with
>>      > > existing ETH_RSS_* to choose Inner or outer layers.
>>      > > This functionality already exists in rte_flow through level parameter
>>      > > in RSS action configuration rte_flow_action_rss.
>>      > >
>>      > > Signed-off-by: Kiran Kumar K <kirankumark at marvell.com>
>>      > > ---
>>      > > V7 Changes:
>>      > > * Re-worked to keep it in sync with rte_flow_action_rss and support
>>      > > up to 3 levels.
>>      > > * Addressed testpmd review comments.
>>      > >
>>      > >   lib/librte_ethdev/rte_ethdev.h | 27 +++++++++++++++++++++++++++
>>      > >   1 file changed, 27 insertions(+)
>>      > >
>>      > > diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
>>      > > index 70295d7ab..13e49bbd7 100644
>>      > > --- a/lib/librte_ethdev/rte_ethdev.h
>>      > > +++ b/lib/librte_ethdev/rte_ethdev.h
>>      > > @@ -552,6 +552,33 @@ struct rte_eth_rss_conf {
>>      > >   #define RTE_ETH_RSS_L3_PRE64         (1ULL << 53)
>>      > >   #define RTE_ETH_RSS_L3_PRE96         (1ULL << 52)
>>      > >
>>      > > +/*
>>      > > + * We use the following macros to combine with the above layers to choose
>>      > > + * inner and outer layers or both for RSS computation.
>>      > > + * bit 50 and 51 are reserved for this.
>>      > > + */
>>      > > +
>>      > > +/** level 0, requests the default behavior. Depending on the packet
>>      > > + * type, it can mean outermost, innermost, anything in between or even no RSS.
>>      > > + * It basically stands for the innermost encapsulation level RSS
>>      > > + * can be performed on according to PMD and device capabilities.
>>      > > + */
>>      > > +#define ETH_RSS_LEVEL_0         (0ULL << 50)
>>      >
>>      > I can see from history how this is involved, but the 'ETH_RSS_LEVEL_0'
>>      > naming does not make clear what it is; the naming in v6 is clearer.
>>      >
>>      > What about the following:
>>      > 0 -> LEVEL_PMD_DEFAULT
>>      > 1 -> LEVEL_OUTER
>>      > 2 -> LEVEL_INNER
>>      > 3 -> LEVEL_INNER_OUTER
>>      >
>>      > This doesn't exactly match the rte_flow one, but it is closer than the v6
>>      > one. It ends with a max level of 2 and defines a way to say both inner
>>      > and outer.
>>
>>     This one looks good to me. If everyone is ok with the proposed changes, I
>>     will send V8.
>>
>> How about following one:
>> 0 -> LEVEL_PMD_DEFAULT
>> 1 -> LEVEL_OUTERMOST
>> 2 -> LEVEL_INNERMOST
>>
>> This way we can avoid any ambiguity, especially if stacked tunnel
>> headers become real.
>>
>>
>> 3 -> LEVEL_INNER_OUTER
>>
>> But I am not sure if INNER_OUTER has a use case.
>>
>> Alternatively,
>>
>> why not just add uint32_t level;
>>
>> just like in case of rte_flow_action_rss?
>>
>> It will break ABI but it's 20.11.
>>
>> Thanks
>>
>> -Ajit
>>
>> Can I send V8 with this proposal?
>>
>> 0 -> LEVEL_PMD_DEFAULT
>> 1 -> LEVEL_OUTERMOST
>> 2 -> LEVEL_INNERMOST
>>
>> If anyone wants INNER_OUTER, they can specify LEVEL_OUTERMOST |
>> LEVEL_INNERMOST
> 
> +1 to INNERMOST & OUTERMOST, and use "LEVEL_OUTERMOST| LEVEL_INNERMOST"
> for INNER_OUTER.

Frankly speaking, I'd drop OUTERMOST | INNERMOST from the requested RSS
hash config for now, and define OUTERMOST | INNERMOST in capabilities
as meaning that the device can hash by either INNERMOST or
OUTERMOST headers.
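
To make the distinction concrete, here is a rough sketch of how a request
could be checked against capabilities under that interpretation. The macro
names follow the naming proposed in this thread and the helper
rss_level_is_valid() is hypothetical; none of this is existing ethdev API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed names, taken from the proposal in this thread (bits 50 and 51). */
#define ETH_RSS_LEVEL_PMD_DEFAULT (0ULL << 50)
#define ETH_RSS_LEVEL_OUTERMOST   (1ULL << 50)
#define ETH_RSS_LEVEL_INNERMOST   (2ULL << 50)
#define ETH_RSS_LEVEL_MASK        (3ULL << 50)

/*
 * Hypothetical helper: check the level bits of a requested rss_hf
 * against the level bits advertised in capabilities.
 */
static bool
rss_level_is_valid(uint64_t req_hf, uint64_t capa_hf)
{
	uint64_t req = req_hf & ETH_RSS_LEVEL_MASK;
	uint64_t capa = capa_hf & ETH_RSS_LEVEL_MASK;

	/* PMD default is always acceptable. */
	if (req == ETH_RSS_LEVEL_PMD_DEFAULT)
		return true;

	/* A request must pick one level; OUTERMOST | INNERMOST is rejected. */
	if (req == ETH_RSS_LEVEL_MASK)
		return false;

	/* The single requested level must be advertised by the device. */
	return (req & capa) == req;
}
```

With this reading, OUTERMOST | INNERMOST in capabilities only says that both
choices are available, while the same combination in a request is refused.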

> 
> But the capability reporting is still problematic.
> If @Andrew has no objection, I think it is ok to have a v8 and we can
> continue discussion on it.

See above. The number of recognized tunnel levels could be reported
in dev_info, but that looks insufficient, since it also matters
which tunnels are supported (maybe even at which level).
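
Just to illustrate what richer reporting might look like, below is a sketch
of a per-tunnel-type table instead of a single level count. Everything in it
(the enum, the struct and the helper) is hypothetical and made up for this
example; it is not a proposal for the actual dev_info layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunnel types, for illustration only. */
enum example_tunnel_type {
	EXAMPLE_TUNNEL_VXLAN,
	EXAMPLE_TUNNEL_GRE,
	EXAMPLE_TUNNEL_GENEVE,
	EXAMPLE_TUNNEL_TYPE_MAX,
};

/*
 * Hypothetical capability record: for each tunnel type, the deepest
 * encapsulation level at which the device recognizes it.
 * Level 1 is the outermost header; 0 means the tunnel is not recognized.
 */
struct example_rss_tunnel_capa {
	uint8_t max_level[EXAMPLE_TUNNEL_TYPE_MAX];
};

/* Can the device parse tunnel type 't' at encapsulation level 'level'? */
static bool
example_rss_tunnel_level_supported(const struct example_rss_tunnel_capa *capa,
				   enum example_tunnel_type t, uint8_t level)
{
	if (t >= EXAMPLE_TUNNEL_TYPE_MAX || level == 0)
		return false;
	return level <= capa->max_level[t];
}
```

A single "number of levels" field would collapse this table to one value and
lose the per-tunnel information.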

>>
>>
>>      >
>>      > > +
>>      > > +/** level 1,  requests RSS to be performed on the outermost packet
>>      > > + * encapsulation level.
>>      > > + */
>>      > > +#define ETH_RSS_LEVEL_1         (1ULL << 50)
>>      > > +
>>      > > +/** level 2,  requests RSS to be performed on the
>>      > > + * specified inner packet encapsulation level, from outermost to
>>      > > + * innermost (lower to higher values).
>>      > > + */
>>      > > +#define ETH_RSS_LEVEL_2            (2ULL << 50)
>>      >
>>      > I can see you are trying to copy the rte_flow usage, but this doesn't
>>      > really make sense here. Where is the value of the level defined in this
>>      > case? If it is not defined, how does the PMD know which level to use?
>>      >
>>      > > +#define ETH_RSS_LEVEL_MASK (3ULL << 50)
>>      > > +
>>      > > +#define ETH_RSS_LEVEL(rss_hf) ((rss_hf & ETH_RSS_LEVEL_MASK) >> 50)
>>      > > +
>>      > >   /**
>>      > >    * For input set change of hash filter, if SRC_ONLY and DST_ONLY of
>>      > >    * the same level are used simultaneously, it is the same case as
>>      > > --
>>      > > 2.25.1
>>      > >
>>


